Tips for Using Automated Link Checking Software

As you might expect, gentle reader, even when it comes to checking the links on Web sites, I prefer manual testing, particularly at the outset of a Web site development project. That is, I want to personally, with my own index finger, click every single link on every single page, including the repeated navigational menu bar whose links would never, ever change across the pages (the developers and designers say) and don’t tend to change, except when they catastrophically fail, for no discernible reason, on a single page.

That’s not to say that automated link checking doesn’t have its place, because it does. The remainder of this piece talks about its place.


Your automated link checker, regardless of whether it came free from a foreign language site or as part of an elaborate complete testing package costing tens of thousands of dollars, will lack the basic context and intelligence to determine whether a link returns the right thing. If the Web server returns anything at all, your automated link checker will probably say, “Okay!” and will go on with its automated link checking.

Hence, your first pass through the Web site should always make sure that the link text or image leads to an appropriate link target. That is, the link text “product information” should lead to something like, oh, product information and not the contact page. Your automated software, as I said, will be happy as long as the Web server does not return a 404 error to the Web browser. Obviously, that won’t do.

Clicking through the Web site methodically will also make you familiar with all of the pages on the site and its layout. But that’s a positive thing unrelated to the theme of this piece.

Once you’ve done that (or have passed over the opportunity to do the right thing in favor of the quick thing or the sexy technical thing), you can run your automated link checker.

Automated link checkers are excellent for finding missing assets. They’ll identify when a page calls for a style sheet, image, or other file that isn’t present at the location where the page expects it. That’s very handy.
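For the curious, here is a minimal sketch in Python of what that missing-asset check amounts to. It is not how any particular link checker works; the names `collect_references`, `http_status`, and `missing_references` are made up for illustration, the tag list is deliberately short, and the `example.test` addresses in the usage note are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit
import urllib.error
import urllib.request

# Tag -> attribute that carries a URL. Not exhaustive; extend as needed.
URL_ATTRS = {"a": "href", "link": "href", "img": "src", "script": "src"}


class ReferenceCollector(HTMLParser):
    """Collect the absolute http(s) URLs that a page refers to."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.references = []

    def handle_starttag(self, tag, attrs):
        wanted = URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                absolute = urljoin(self.page_url, value)
                # Skip mailto:, javascript:, and other non-Web schemes.
                if urlsplit(absolute).scheme in ("http", "https"):
                    self.references.append(absolute)


def collect_references(html, page_url):
    collector = ReferenceCollector(page_url)
    collector.feed(html)
    return collector.references


def http_status(url):
    """Return the HTTP status for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except OSError:  # covers URLError, refused connections, timeouts
        return None


def missing_references(html, page_url, status_of=http_status):
    """Return (url, status) pairs for references that did not load."""
    missing = []
    for url in collect_references(html, page_url):
        status = status_of(url)
        if status is None or status >= 400:
            missing.append((url, status))
    return missing
```

Calling `missing_references(page_html, "http://example.test/products/index.html")` would fetch every reference on that page and list the ones that failed. The `status_of` parameter exists only so the check can be tried out without touching a real server.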

Before you run your automated link check, though, you need to know how your Web server handles links to pages that don’t exist. Some Web sites return a handy 404 error, which your standard link checker recognizes as a broken link. Other sites handle missing pages differently. Some create a separate error page that displays but that doesn’t fire off the 404 flare; an automated link checker that looks specifically for that error won’t know the links are broken. Other sites just display the home page when you request a page that doesn’t exist. If your Web site handles missing Web pages like that, you have to account for it when running an automated link check. Some checkers let you report on the URLs of the returned pages; this will let you look for links that end up at PageNotFound.aspx or whatnot, and that will help you find broken links that your automated link checker doesn’t realize are broken.
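If your checker can export the requested URL, the status code, and the URL of the page that actually came back, a few lines of Python can do the sorting for you. This is a sketch under assumptions: the function name `classify_response` is invented, the `example.test` addresses are placeholders, and it only catches sites that redirect to an error page or to the home page. A site that serves the home page's content at the missing URL without redirecting would need a comparison of page content instead.

```python
from urllib.parse import urlsplit


def classify_response(requested_url, status, final_url, home_url,
                      error_pages=("PageNotFound.aspx",)):
    """Label one checked link as 'broken', 'soft-404', or 'ok'.

    error_pages holds the file names of the site's custom error pages;
    adjust it to whatever your own site uses.
    """
    if status >= 400:
        return "broken"

    # The server said "fine" but delivered its custom error page.
    final_path = urlsplit(final_url).path
    if any(final_path.endswith(name) for name in error_pages):
        return "soft-404"

    # The server said "fine" but sent us to the home page instead.
    home = home_url.rstrip("/")
    if final_url.rstrip("/") == home and requested_url.rstrip("/") != home:
        return "soft-404"

    return "ok"


home = "http://example.test/"
print(classify_response("http://example.test/old", 200,
                        "http://example.test/PageNotFound.aspx?aspxerrorpath=/old",
                        home))
```

Anything labeled `soft-404` is a link that your checker probably reported as good and that a human should look at.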

Next, you have to consider your environment and your Web site promotion process when running your automated link checker. If your organization follows a good development process, someone builds the site in a development environment and deploys it to a test environment for you to test, and then, after you’ve logged all the issues and they’ve ignored them, your tech people deploy the site to the production Web servers. Now, some URLs are relative and some are absolute, and you can absolutely count on some supposedly “relative” URLs pointing to your development or test environments. That is, instead of the production address, the link will target your test environment, which is accessible only through hostfilejitsu or to people inside your corporate firewall. If you run your automated link checker from inside your firewall or with properly tuned host files, those links will show as good; however, when your user outside the firewall clicks them, the links will fail. Again, if you have good reporting/sorting on your URL targets in your automated link checker, you can check to make sure that your site doesn’t link to the internal environments, but your automated link checker probably won’t find those problems on its own.
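If your checker can dump its list of link targets, filtering that list for internal host names is simple enough to script. The sketch below assumes you know the host names of your development and test environments; `dev.example.test` and `test.example.test` are placeholders, and `internal_links` is a made-up name.

```python
from urllib.parse import urlsplit

# Hypothetical internal host names; replace with your own environments.
INTERNAL_HOSTS = {"dev.example.test", "test.example.test"}


def internal_links(urls, internal_hosts=INTERNAL_HOSTS):
    """Return the URLs whose host is one of the internal environments."""
    flagged = []
    for url in urls:
        # hostname is None for relative URLs and is lowercased otherwise.
        host = urlsplit(url).hostname
        if host and host in internal_hosts:
            flagged.append(url)
    return flagged


targets = [
    "/products/index.html",
    "http://www.example.test/contact.html",
    "http://TEST.example.test/products/widget.html",
]
for url in internal_links(targets):
    print("points at an internal environment:", url)
```

Truly relative URLs pass through untouched, which is the point: only the absolute ones that name an internal host get flagged.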

Those are a couple of things to consider when selecting and using an automated link checker, and they illustrate a couple of places where the brute force of the software needs the intelligence of a quality assurance professional to help it out. Otherwise, you could just run some software facilely and have a false sense of security that it’s found all of your problems. On the other hand, maybe that’s how you do Quality Assurance in your organization. You wouldn’t be alone.
