This is a pure Dart HTML5 parser. It's a port of html5lib from Python. Since it's 100% Dart, you can use it safely from a script or server-side app.
Eventually the parse tree API will be compatible with dart:html, so the same code will work on the client or the server.
Add this to your pubspec.yaml (or create it):
    dependencies:
      html5lib: any
Then run the Pub Package Manager (comes with the Dart SDK):
    pub install
Parsing HTML is easy!
    import 'package:html5lib/parser.dart' show parse;
    import 'package:html5lib/dom.dart';

    main() {
      var document = parse(
          '<body>Hello world! <a href="www.html5rocks.com">HTML5 rocks!');
      print(document.outerHtml);
    }
You can pass a String or a list of bytes to parse.
There's also parseFragment for parsing a document fragment, and HtmlParser
if you want more low-level control.
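As a rough illustration of the entry points mentioned above, here is a minimal sketch. It assumes that parseFragment and HtmlParser are exported from parser.dart alongside parse, that HtmlParser takes the input in its constructor and exposes a parse() method, and that nodes are reachable through a nodes list; check these against the version you have installed. The codeUnits call is only a stand-in for real bytes and works here because the input is pure ASCII.

```dart
import 'package:html5lib/parser.dart' show parse, parseFragment, HtmlParser;

main() {
  // parse accepts a String...
  var fromString = parse('<p>Hello</p>');
  print(fromString.outerHtml);

  // ...or a list of bytes. codeUnits stands in for bytes read from a file
  // or socket; for ASCII-only input the two are the same values.
  var fromBytes = parse('<p>Hello</p>'.codeUnits);
  print(fromBytes.outerHtml);

  // parseFragment parses a piece of markup rather than a whole document.
  var fragment = parseFragment('<li>one</li><li>two</li>');
  print(fragment.nodes.length);

  // Assumed lower-level usage: build the parser yourself, then call parse().
  var parser = new HtmlParser('<p>Hello</p>');
  var document = parser.parse();
  print(document.outerHtml);
}
```

Holding on to the HtmlParser instance is mainly useful when you want to inspect parser state after parsing instead of only receiving the resulting tree.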
You can upgrade the library with:
    pub update
Disclaimer: the APIs are not finished. Updating may break your code. If that happens, you can check the commit log to figure out what changed.
If you want to avoid breakage, you can also put a version constraint in your
pubspec.yaml in place of the word any.
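For example, a pinned dependency might look like the fragment below. The version number 0.4.3 is simply the newest entry in the version table; the range form in the comment is an alternative that assumes your version of pub supports range constraints.

```yaml
dependencies:
  # Pin to one exact release so pub update cannot pull in API changes.
  html5lib: 0.4.3
  # Alternative (assumed range syntax): allow later 0.4.x releases only.
  # html5lib: '>=0.4.3 <0.5.0'
```

An exact pin is the most conservative choice; a range trades a little stability for picking up bug fixes.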
Right now the tokenizer, html5parser, and simpletree are working.
These files from the html5lib directory still need to be ported:
- ihatexml.py
- sanitizer.py
- filters/*
- serializer/*
- treebuilders/*
- treewalkers/*
- tests corresponding to the above files

All tests should be passing.
    # Make sure dependencies are installed
    pub install

    # Run command line tests
    #export DART_SDK=path/to/dart/sdk
    test/run.sh
Add this to your package's pubspec.yaml file:
    dependencies:
      html5lib: 0.4.3

If your package is an application package, you should use any as the version constraint.
If you're using the Dart Editor, choose:
Menu > Tools > Pub Install
Or if you want to install from the command line, run:
    $ pub install
Now in your Dart code, you can use:
    import 'package:html5lib/dom.dart';
    import 'package:html5lib/dom_parsing.dart';
    import 'package:html5lib/parser.dart';
    import 'package:html5lib/parser_console.dart';
| Version | Uploaded | Archive |
|---|---|---|
| 0.4.3 | Apr 16, 2013 | Download html5lib 0.4.3 archive |
| 0.4.2 | Apr 13, 2013 | Download html5lib 0.4.2 archive |
| 0.4.1 | Mar 16, 2013 | Download html5lib 0.4.1 archive |
| 0.4.0 | Mar 09, 2013 | Download html5lib 0.4.0 archive |
| 0.3.3+2 | Feb 27, 2013 | Download html5lib 0.3.3+2 archive |
| 0.3.3+1 | Feb 27, 2013 | Download html5lib 0.3.3+1 archive |
| 0.3.3 | Feb 19, 2013 | Download html5lib 0.3.3 archive |
| 0.3.2+1 | Feb 11, 2013 | Download html5lib 0.3.2+1 archive |
| 0.3.2 | Feb 06, 2013 | Download html5lib 0.3.2 archive |
| 0.3.1+2 | Jan 31, 2013 | Download html5lib 0.3.1+2 archive |