linkcheck 2.0.0

linkcheck

Very fast link-checking.

linkcheck versus the popular blc tool

Philosophy:

A good utility is custom-made for a job. There are many link checkers out there, but none of them seems to strive for the following set of goals.

Crawls fast

  • You want to run the link-checker at least before every deploy (on CI or manually). When it takes ages, you're less likely to do so.

  • linkcheck is currently several times faster than blc and all other link checkers that go to at least comparable depth. It is 40 times faster than the only tool that goes to the same depth (linkchecker).

Finds all relevant problems

  • No link-checker can guarantee correct results: the web is too flaky for that. But at least the tool should correctly parse the HTML (not just try to guess what's a URL and what isn't) and the CSS (for url(...) links).

    • PENDING: srcset support
  • linkcheck finds more than linklint and blc. It finds as many problems as the best alternative, linkchecker, or more.

Leaves out irrelevant problems

  • linkcheck doesn't attempt to render JavaScript. It would make it at least an order of magnitude slower and way more complex. (For example, what links and buttons should the tool attempt to click, and how many times? Should we only click visible links? How exactly do we detect broken links?) Validating SPAs is a very different problem than checking static links, and should be approached by dedicated tools.

  • linkcheck only supports http: and https:. It won't try to check FTP or telnet or nntp links.

    • Note: linkcheck currently ignores unsupported schemes like ftp:, mailto: or data: completely. This may change in the future to at least show an info-level warning.
  • linkcheck doesn't validate file system directories. Servers often behave very differently than file systems, so validating links on the file system often leads to both false positives and false negatives. Links should be checked in their natural habitat, and as close to the production environment as possible. You can (and should) run linkcheck on your localhost server, of course.

Good UX (user experience)

  • Yes, a command line utility can have good or bad UX. It has mostly to do with giving sane defaults, not forcing users to learn new constructs, not making them type more than needed, and showing concise output.

  • The most frequent use cases should need only a few arguments.

  • linkcheck doesn't throttle itself on localhost.

  • linkcheck follows POSIX CLI standards (no @input and similar constructs like in linklint).

Brief and meaningful output

  • When everything works, you don't want to see a huge list of links.

    • In this scenario, linkcheck just outputs 'Perfect' and some stats on a single line.
  • When things are broken, you want to see exactly where the problem is, and you want the output sorted in a sane way.

    • linkcheck lists broken links by their source URL first so that you can fix many links at once. It also sorts the URLs alphabetically, and shows both the exact location of the link (line:column) and the anchor text (or the tag if it wasn't an anchor).
  • For CI (continuous integration) builds, you want a non-zero exit code whenever there is a problem.

    • linkcheck returns status code 1 if there are warnings, and status code 2 if there are errors.
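A minimal sketch of how a CI script might act on these exit codes. The function name ci_gate and its strict switch are made up for illustration and are not part of linkcheck. The only facts taken from above are that status 1 means warnings and status 2 means errors.

```shell
# Hypothetical helper, not part of linkcheck.
# ci_gate STATUS [STRICT]: decide whether a build should fail.
ci_gate() {
  lc_status=$1
  lc_strict=${2:-no}   # "yes" makes warnings fail the build too
  case "$lc_status" in
    0) echo "linkcheck: no problems"
       return 0 ;;
    1) echo "linkcheck: warnings"
       if [ "$lc_strict" = "yes" ]; then return 1; fi
       return 0 ;;
    2) echo "linkcheck: errors"
       return 1 ;;
    *) echo "linkcheck: unexpected exit status $lc_status"
       return 1 ;;
  esac
}

# Example use in a CI step (assumes linkcheck is installed and a local
# server is already listening on port 4000):
#   lc_status=0
#   linkcheck :4000 || lc_status=$?
#   ci_gate "$lc_status" yes
```

Capturing the status with `|| lc_status=$?` keeps a script that runs under `set -e` from aborting before the status can be inspected.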

It goes without saying that linkcheck honors robots.txt and throttles itself when accessing websites.

Installation

Step 1. Install Dart

Full installation guides per platform are available on the Dart website.

For example, on a Mac, assuming you have Homebrew, you just run:

$ brew tap dart-lang/dart
$ brew install dart

Step 2. Install linkcheck

Once Dart is installed, run:

$ pub global activate linkcheck

Pub installs executables into ~/.pub-cache/bin, which may not be on your path. You can fix that by adding the following to your shell's config file (.bashrc, .bash_profile, etc.):

export PATH="$PATH:$HOME/.pub-cache/bin"

Then either restart the terminal or run source ~/.bash_profile (assuming ~/.bash_profile is where you put the PATH export above).
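If your shell config gets sourced more than once, PATH can pick up the same entry repeatedly. Here is a small sketch of an idempotent variant. The function name add_to_path is hypothetical. The sketch uses $HOME rather than ~ because a tilde inside double quotes is not expanded by the shell.

```shell
# Hypothetical helper: append DIR to PATH unless it is already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$PATH:$1" ;;     # not present, append it
  esac
}

# Make executables installed by pub reachable from the command line.
add_to_path "$HOME/.pub-cache/bin"
export PATH
```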

Usage

If in doubt, run linkcheck -h. Here are some examples to get you started.

Localhost

Running linkcheck without arguments will try to crawl http://localhost:8080/ (which is the most common local server URL).

  • linkcheck to crawl the site and ignore external links
  • linkcheck -e to try external links

If you run your local server on http://localhost:4000/, for example, you can do:

  • linkcheck :4000 to crawl the site and ignore external links
  • linkcheck :4000 -e to try external links

linkcheck will not throttle itself when accessing localhost. It will go as fast as possible.

Deployed sites

  • linkcheck www.example.com to crawl www.example.com and ignore external links
  • linkcheck https://www.example.com to start directly on https
  • linkcheck www.example.com www.other.com to crawl both sites and check links between the two (but ignore external links outside those two sites)

Many entry points

Assuming you have a text file mysites.txt like this:

http://egamebook.com/
http://filiph.net/
https://alojz.cz/

You can run linkcheck -i mysites.txt and it will crawl all of them and also check links between them. This is useful for:

  1. Link-checking projects spanning many domains (or subdomains).
  2. Checking all your public websites / blogs / etc.

There's another use for this, and that is when you have a list of inbound links, like this:

http://www.dartlang.org/
http://www.dartlang.org/tools/
http://www.dartlang.org/downloads/

You probably want to make sure you never break your inbound links. For example, if a page changes URL, the previous URL should still work (redirecting to the new page when appropriate).

Where do you get a list of inbound links? Try your site's sitemap.xml as a starting point and, additionally, try something like the Google Webmaster Tools’ crawl error page.
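As a rough sketch, the <loc> entries of a sitemap can be turned into an input file for linkcheck -i. The function name sitemap_urls and the file names below are assumptions. The sketch is plain text processing rather than a real XML parser, and it assumes each <loc> element sits on a single line.

```shell
# Hypothetical helper: print each <loc> URL from a sitemap file,
# one URL per line.
sitemap_urls() {
  # grep -o prints every <loc>...</loc> match on its own line;
  # sed then strips the tags and surrounding whitespace and decodes
  # the XML entity &amp; back to a plain ampersand.
  grep -o '<loc>[^<]*</loc>' "$1" |
    sed -e 's/<loc>[[:space:]]*//' \
        -e 's/[[:space:]]*<\/loc>//' \
        -e 's/&amp;/\&/g'
}

# Example use (assumes sitemap.xml was downloaded beforehand):
#   sitemap_urls sitemap.xml > inbound.txt
#   linkcheck -i inbound.txt
```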

Skipping URLs

Sometimes, it is legitimate to ignore some failing URLs. This is done via the --skip-file option.

Let's say you're working on a site and a significant portion of it is currently under construction. You can create a file called my_skip_file.txt, for example, and fill it with regular expressions like so:

# Lines starting with a hash are comments.

admin/
\.s?css$
\#info

The file above includes a comment on line 1, which will be ignored. Line 2 is blank and will be ignored as well. Line 3 contains a broad regular expression that will make linkcheck ignore any link to a URL containing admin/ anywhere in it. Line 4 shows that there is full support for regular expressions: it will ignore URLs ending with .css and .scss. Line 5 shows the only special escape sequence. If you need to start your regular expression with a # (which linkcheck would normally parse as a comment), you can precede the # with a backslash (\). This will force linkcheck not to ignore the line. In this case, the regular expression on line 5 will match #info anywhere in the URL.

To use this file, you run linkcheck like this:

linkcheck example.com --skip-file my_skip_file.txt

Regular expressions are hard. If unsure, use the -d option to see exactly which URLs your skip file is ignoring.
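To experiment with a skip file offline, the rules described above can be approximated in a few lines of shell. This is only a sketch of the documented behavior, not linkcheck's own implementation. The function names skip_patterns and is_skipped are hypothetical, and grep -E uses POSIX extended regular expressions, which differ from Dart's regular expressions in some advanced features.

```shell
# Hypothetical helper: print the effective patterns of a skip file.
# Drops comment lines and blank lines, and turns a leading \# into #.
skip_patterns() {
  sed -e '/^#/d' -e '/^[[:space:]]*$/d' -e 's/^\\#/#/' "$1"
}

# Hypothetical helper: succeed if any pattern in FILE matches URL.
# Usage: is_skipped URL FILE
is_skipped() {
  skip_url=$1
  skip_patterns "$2" | (
    # Try each pattern in turn; a match anywhere in the URL is enough.
    while IFS= read -r re; do
      if printf '%s\n' "$skip_url" | grep -Eq -e "$re"; then
        exit 0   # leaves only this subshell
      fi
    done
    exit 1
  )
}

# Example use:
#   is_skipped 'http://example.com/admin/users' my_skip_file.txt && echo skipped
```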

Changelog

2.0.0

  • First Dart-2-only version.

1.0.6

  • Last version compatible with Dart 1 and Dart 2

Use this package as an executable

1. Install it

You can install the package from the command line:


$ pub global activate linkcheck

2. Use it

The package has the following executables:


$ linkcheck

Use this package as a library

1. Depend on it

Add this to your package's pubspec.yaml file:


dependencies:
  linkcheck: "^2.0.0"

2. Install it

You can install packages from the command line:

with pub:


$ pub get

with Flutter:


$ flutter packages get

Alternatively, your editor might support pub get or flutter packages get. Check the docs for your editor to learn more.

3. Import it

Now in your Dart code, you can use:


import 'package:linkcheck/linkcheck.dart';

Version   Uploaded
2.0.0     Jun 15, 2018
1.0.6     Apr 24, 2018
1.0.5     Nov 23, 2017
1.0.4     Dec 20, 2016
1.0.3     Dec 17, 2016
1.0.2     Dec 17, 2016
1.0.1     Dec 17, 2016
1.0.0     Dec 16, 2016
0.2.14    Dec 7, 2016
0.2.13    Dec 2, 2016

All 22 versions...

Analysis

We analyzed this package on Jun 19, 2018, and provided a score, details, and suggestions below. Analysis was completed with status completed using:

  • Dart: 2.0.0-dev.63.0
  • pana: 0.11.3

Scores

  • Popularity: 0 / 100. Describes how popular the package is relative to other packages.
  • Health: 88 / 100. Code health derived from static analysis.
  • Maintenance: 87 / 100. Reflects how tidy and up-to-date the package is.
  • Overall score: 44. Weighted score of the above.

Platforms

Detected platforms: Flutter, other

Primary library: package:linkcheck/linkcheck.dart with components: io, isolate.

Suggestions

  • Running dartdoc failed.

    Make sure dartdoc runs without any issues.

  • Fix analysis and formatting issues.

    Analysis or formatting checks reported 5 errors and 2 hints.

    Strong-mode analysis of lib/src/worker/worker.dart failed with the following error:

    line: 8 col: 8
    Target of URI doesn't exist: 'package:stream_channel/stream_channel.dart'.

    Strong-mode analysis of lib/linkcheck.dart gave the following hint:

    line: 161 col: 9
    'allowMultiple' is deprecated and shouldn't be used.

  • The description is too short.

    Add more detail about the package, what it does and what is its target use case. Try to write at least 60 characters.

  • Maintain an example.

    Create a short demo in the example/ directory to show how to use this package. Common file name patterns include: main.dart, example.dart or you could also use linkcheck.dart.

Dependencies

Package            Constraint            Resolved    Available

Direct dependencies
Dart SDK           >=2.0.0-dev <2.0.0    -           -
args               ^1.4.3                1.4.3       -
console            ^3.0.0                3.0.0       -
csslib             ^0.14.1               0.14.4      -
glob               ^1.1.5                1.1.5       -
html               ^0.13.3               0.13.3+1    -
logging            ^0.11.3+1             0.11.3+1    -
path               ^1.6.0                1.6.1       -
source_span        ^1.4.0                1.4.0       -

Transitive dependencies
async              -                     2.0.7       -
charcode           -                     1.1.1       -
collection         -                     1.14.10     -
string_scanner     -                     1.0.2       -
utf                -                     0.9.0+4     -
vector_math        -                     2.0.7       -

Dev dependencies
dhttpd             ^2.0.0                -           -
test               ^0.12.34              -           -