HTML5 parser in Dart

This is a pure Dart HTML5 parser. It's a port of html5lib from Python. Since it's 100% Dart, you can use it safely from a script or server-side app.

Eventually the parse tree API will be compatible with dart:html, so the same code will work on the client and the server.

(Formerly known as html5lib.)


Add this to your pubspec.yaml (or create it):

dependencies:
  html: any

Then run the Pub Package Manager (comes with the Dart SDK):

dart pub get


Parsing HTML is easy!

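import 'package:html/parser.dart' show parse;
import 'package:html/dom.dart';

void main() {
  // parse() tolerates missing and unclosed tags and returns a Document.
  Document document = parse(
      '<body>Hello world! <a href="">HTML5 rocks!');
  // Print the normalized tree serialized back to HTML.
  print(document.outerHtml);
}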

You can pass a String or a list of bytes to parse. There's also parseFragment for parsing a document fragment, and HtmlParser if you want more low-level control.
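
As a rough illustration of those three entry points, here is a minimal sketch. The sample markup is made up for the example, and the explicit encoding: 'utf-8' argument is an assumption added so the byte input does not depend on encoding detection.

```dart
import 'dart:convert' show utf8;

import 'package:html/parser.dart' show HtmlParser, parse, parseFragment;

void main() {
  // Hypothetical sample markup, used only for illustration.
  const markup = '<p>Hello <b>world</b></p>';

  // 1. parse() accepts a String or a list of bytes. The encoding is passed
  //    explicitly here (an assumption) rather than relying on detection.
  final fromString = parse(markup);
  final fromBytes = parse(utf8.encode(markup), encoding: 'utf-8');
  print(fromString.outerHtml == fromBytes.outerHtml);

  // 2. parseFragment() returns a DocumentFragment instead of a full Document.
  final fragment = parseFragment(markup);
  print(fragment.outerHtml);

  // 3. HtmlParser gives access to parser state after parsing, such as the
  //    list of recoverable parse errors.
  final parser = HtmlParser('<p>Unclosed <b>bold');
  final document = parser.parse();
  print(document.body?.text);
  for (final error in parser.errors) {
    print('parse error: ${error.message}');
  }
}
```

Going through HtmlParser directly is mostly useful when you want to inspect errors or reuse parser options; for one-off parsing, parse and parseFragment are usually enough.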

Running Tests



A simple tree API that results from parsing HTML. It is intended to be compatible with dart:html, but it is missing many types and APIs.
This library contains extra APIs that aren't in the DOM, but are useful when interacting with the parse tree.
This library has a parser for HTML5 documents that lets you parse HTML easily from a script or server-side application: [...]
This library adds dart:io support to the HTML5 parser. Call initDartIOSupport before calling the parse methods and they will accept a RandomAccessFile as input, in addition to the other input types.
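
To give a feel for the tree API described above, here is a small sketch that walks a parsed document and collects link targets. The extractLinks helper and the sample page are hypothetical names invented for this example; only parse, querySelectorAll, and attributes come from the package.

```dart
import 'package:html/parser.dart' show parse;

/// Hypothetical helper: returns the href value of every <a> element in
/// [html], skipping anchors that have no href attribute.
List<String> extractLinks(String html) {
  final document = parse(html);
  return document
      .querySelectorAll('a')
      .map((anchor) => anchor.attributes['href'])
      .whereType<String>()
      .toList();
}

void main() {
  // Made-up input with unclosed <li> tags, which the parser repairs.
  const page =
      '<ul><li><a href="/one">One</a><li><a href="/two">Two</a><li><a>None</a></ul>';
  print(extractLinks(page));
}
```

The same querySelector and querySelectorAll calls are also available on elements and fragments, so the helper could be narrowed to a subtree if needed.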