Xenu's Link Sleuth (TM)

Software for finding broken links on web sites

Description

[Icon: Xenu with a fedora hat checking a link with his galactic looking glass] Xenu's Link Sleuth (TM) is spidering software that checks Web sites for broken links. Link verification is done on "normal" links, images, frames, plug-ins, backgrounds, local image maps, style sheets, scripts and Java applets. It displays a continuously updated list of URLs which you can sort by different criteria. A report can be produced at any time.

Additional features:


Download

By downloading you are acknowledging that:

System requirement: Microsoft Windows 95/98/ME/NT/2000/XP. WININET.DLL is required (it is usually included). No, it won't work on Windows 3.11, not even with Win32s. No, I won't make a Java, MacOS, Linux, BeOS, Palm or C64 version. Don't even ask!

Attention CompuServe users: The old version of RPAWINET.DLL (e.g. from 18.9.1996) that came with the WinCIM 3.0 CD-ROM is deadly - go get the bugfix from CompuServe.

Ok, I have read all that, I want to download! (current version: 1.2a from September 29th, 2001)

 
Getting started:
Unzip it and install it wherever you want. To check a site, click the toolbar icon on the left and enter a WWW address. If the address ends with a directory name, don't forget to put a / at the end, or the whole parent directory may get spidered.

Incorrect:
http://www.host.com/~user

Correct:
http://www.host.com/~user/
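To see why the trailing slash matters, here is a small Python sketch (an illustration only, not code from Link Sleuth). It assumes the spider's scope is the directory that relative links resolve against; crawl_scope is a hypothetical helper name.

```python
from urllib.parse import urljoin

def crawl_scope(start_url):
    """Return the directory that relative links on start_url resolve against.

    Hypothetical helper for illustration; Link Sleuth's real scoping
    rules are not documented on this page.
    """
    # Resolving "." against a URL yields its containing directory.
    return urljoin(start_url, ".")

# Without the trailing slash, "~user" is treated as a file in the root,
# so the scope becomes the whole host.
print(crawl_scope("http://www.host.com/~user"))

# With the trailing slash, the scope is the user's own directory.
print(crawl_scope("http://www.host.com/~user/"))
```

The same rule explains why a relative link such as page.html resolves to a different address depending on whether the slash is present.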

You can also click the "browse" button to check a local HTML file. If you do not already use IE for browsing and are sitting behind a firewall, don't forget to configure your proxy before you start. To find out what the software can do, simply try out the menu choices, the toolbar and the right mouse button. Or read this excellent user's manual by Indiana University.

Good luck! If you find the software useful, please click here.

Join the Update Announcements mailing list at Yahoo Groups! To subscribe, send an empty e-mail to linksleuthupdates-subscribe@yahoogroups.com.
If you would like to use a button on your WWW page, link to this page with this button: [Linkcheck by Xenu!]

The address of this web page is http://home.snafu.de/tilman/xenulink.html


Frequently Asked Questions (FAQ)

1. Who is Xenu?

See here.

Do you want to be a Knight of Xenu? Then join their team in the worldwide RC5-64 decryption effort. Join team #3504 after your decryption client has been working successfully for one day. (Attention: do not forget to configure the client with your own e-mail address, and remove any "(" or "<" from it). E-mail me if you have trouble setting up the client or configuring it.

2. Is Xenu's Link Sleuth (TM) better than WebAnalyzer?

Yes and No. Xenu's Link Sleuth (TM) does not have the graphic capabilities of WebAnalyzer 2.0 ("Wavefront view"). But here are some of the advantages of Xenu's Link Sleuth (TM):

Xenu sez: check your website both with this product and with another product (WebAnalyzer, Linkbot, InfoLink, LinkScan, LinkAlarm and Theseus offer trial versions), and decide what you need and what you are willing to pay.

3. Is Xenu's Link Sleuth (TM) better than Net Mechanic?

Yes and No. IMO, Net Mechanic (a free WWW based service) is best to check very small web sites, but useless for the rest:

An advantage of Net Mechanic is that you don't waste bandwidth - you submit your site and get an e-mail later that points to a WWW page with the results.

4. Can I support the author financially?

No need to. If you feel the software is useful, you may donate money to causes I support. Germans can make a tax deductible donation to the Dialog Zentrum Berlin e.V., Konto-Nr. 1551390051, Bank für Kirche und Diakonie BLZ 35060190.

Or visit the Xenu bookstore.

5. Why does Xenu's Link Sleuth (TM) report http://www.site.com/../page/index.html as broken?

The key is the "../" part. It means you have e.g. a top level page that links to a page in a directory above, which doesn't exist. It is true that Mozilla will not have any problems with such a page; but I am less tolerant.
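A rough Python sketch of such a check follows. It is an assumption about how one might detect the problem, not the code Link Sleuth uses; climbs_above_root is a hypothetical name. It counts how many directories the linking page sits below the root and reports a link that uses more "../" steps than that.

```python
from urllib.parse import urlsplit

def climbs_above_root(base_url, href):
    """Return True if href uses more '..' steps than base_url has directories."""
    # Directories of the linking page; the last path element is the file name.
    base_dirs = [s for s in urlsplit(base_url).path.split("/")[:-1] if s]
    depth = len(base_dirs)
    rel_path = urlsplit(href).path
    if rel_path.startswith("/"):
        depth = 0  # a path starting with "/" begins at the root
    for segment in rel_path.split("/"):
        if segment == "..":
            depth -= 1
            if depth < 0:
                return True  # climbed above the top level
        elif segment and segment != ".":
            depth += 1
    return False

print(climbs_above_root("http://www.site.com/index.html", "../page/index.html"))
print(climbs_above_root("http://www.site.com/docs/index.html", "../page/index.html"))
```

Browsers (and Python's own urljoin) quietly drop the extra "../", which is why the page still displays even though the link is wrong.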

6. How can I configure a proxy?

You can configure a proxy in the Windows Control Panel. Double-click on the "Internet" symbol, then click on the tab of the dialog box that is named "Connection". You will need a proxy if you are sitting "behind a firewall". This is usually the case in big corporate networks.

7. Why does Xenu's Link Sleuth (TM) report a URL with a space in it?

Either because you do have a space in the URL, or because you have a carriage return / newline in it. Although Mozilla tolerates this, I do not.
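If you want to locate the offending character yourself, a small Python sketch like the one below can help. It is only an illustration; whitespace_problems is a hypothetical helper and has nothing to do with Link Sleuth's internals.

```python
def whitespace_problems(url):
    """Return (position, name) for every whitespace character found in url."""
    names = {" ": "space", "\r": "carriage return", "\n": "newline", "\t": "tab"}
    return [(i, names.get(ch, "whitespace"))
            for i, ch in enumerate(url) if ch.isspace()]

# A plain space in a file name.
print(whitespace_problems("http://www.host.com/my page.html"))

# A line break that slipped into the middle of an HREF attribute.
print(whitespace_problems("http://www.host.com/dir/\r\npage.html"))
```

An empty list means the URL contains no whitespace at all.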

8. I use Mozilla 3.0 Gold and can't get rid of file: URLs for images. What can I do?

Re-edit the page, double-click on the picture, remove file: from the picture location and take care to uncheck "copy image to document's location" in the "properties" dialog box (at the bottom left) before you save and exit the dialog box.

9. What is the maximum number of websites that can be checked?

There is no maximum. It is limited by the memory on your computer.

10. Can the software check my site locally?

Since September 1998 (1.0n), you can do so without a local web server (with a local web server, your address would be http://127.0.0.1). Use the "Browse" button in the "New" dialog box.

The results will not always be the same as a "remote" check:

A user of IE 4.0 reported that when not online, the software checks every "remote" URL like a local file. This is a problem of the newer version of the WININET.DLL; the version with IE 3.0 reports "no connection" or "no such host" instead, which is more logical.

11. Does it work on Windows NT 3.51?

One user said it worked fine after he copied a version of WININET.DLL from a Windows 95 system standing nearby, and put it into the directory where Xenu's Link Sleuth (TM) was installed.

12. How is it so damn fast?

Because it uses a (possibly patented, see patents here and here) technique known as preemptive multithreading. It means that the link checking software retrieves several web pages at the same time; the competition uses the same technique. The maximum count of threads is initially set to 30, but you can configure it to any number between 1 and 100. A number that is too high might result in failed connections or in timeouts, which means you will have to recheck the broken links. When I had a dial-up connection, I got good results with 70. Now I have a DSL connection, and I have to set the number to 1-5. I suspect that my DSL provider has installed a brake somewhere to prevent "commercial" customers from using the inexpensive "private" service.
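The idea of checking several links at once can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MFC code of Link Sleuth: check_url and check_all are hypothetical names, the default of 30 threads and the 1 to 100 range are taken from the text above, and a thread pool from the Python standard library stands in for the real thread handling.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request

def check_url(url, timeout=10):
    """Return the HTTP status code for url, or an error string on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a broken link
    except (urllib.error.URLError, OSError, ValueError) as err:
        return f"error: {err}"  # no such host, timeout, malformed URL, ...

def check_all(urls, max_threads=30, checker=check_url):
    """Check all urls concurrently and return a dict of url -> result."""
    urls = list(urls)
    # Clamp to the 1..100 range mentioned in the text.
    max_threads = max(1, min(100, max_threads))
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map keeps results in the same order as the input URLs.
        return dict(zip(urls, pool.map(checker, urls)))
```

Calling check_all(list_of_urls, max_threads=5) would mimic the low setting described for the DSL line; a failed connection or timeout shows up as an "error: ..." string that can be rechecked later.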

Initial tests suggest that my software is faster than WebAnalyzer 2.0. This may also have to do with the fact that WebAnalyzer is wasting time by displaying more graphics.

13. Can I have the source code?

Hahahahahaha!

14. Can I buy the source code?

Sure, make me "an offer I can't refuse".

15. Just for fun, I checked Tilman's web site, and found many broken links. Why?

I check my own web site every week on Friday. Nevertheless there are always broken links:

16. How do I correct broken links?

Repairing broken links (i.e. getting the correct ones) is a difficult task that takes time, but with experience, you'll get it done faster and faster.

17. What about ftp and gopher sites?

Starting with version 1.0k I have implemented a new ftp checking method that is 100% reliable. Sadly, this method does not work with proxies. The previous method I used (and still use for gopher) was unreliable, as it did not detect certain errors.

The method for checking gopher sites is still unreliable. When an ftp or gopher site is accessed through a proxy, this proxy builds up a web page. Sadly, that page doesn't always say whether the URL exists or not. When you access a gopher site without a proxy, it returns an error message, but not an error code. This seems to be a bug of the OpenURL() function of WININET.DLL.

The output lists ftp and gopher sites as links, which allows you to make a manual check of these sites.

18. Why can't I launch URLs?

Starting with version 1.0g (Christmas 1997), URLs are launched with DDE ("dynamic data exchange", a Windows method of communication between applications), to open many browser windows but to prevent the opening of several Netscape applications. This is done with the help of the Registry, by searching for HKEY_CLASSES_ROOT\http\shell\open. This has the path for the browser, the DDE application name (e.g. "Netscape"), the DDE topic (usually "WWW_OpenURL"), and a template for the DDE item (usually "%1"). If you cannot launch a URL, do not panic - export and e-mail me the segment of your registry (start REGEDIT.EXE, and search for "http").

The cause is usually that you have not installed Netscape properly (maybe you just transferred the files from another computer). Solution: reinstall Netscape over your current installation.

Starting with version 1.1b, I have stopped displaying an error message when the registry is incomplete, because there were too many complaints. Instead, the browser will simply be launched with the page. This has the disadvantage that the page won't be displayed in an extra window of the current active browser application.

19. Why is LinkSleuth messing around with cookies?

If you ask this, then you have configured your internet configuration to ask you before submitting a cookie, and you constantly get requests. But sadly I am not responsible for this - it is a part of Microsoft's WININET.DLL. According to Cookie Central, there is not much you can do.

20. Why can't I check links into the Internet Movie Database?

Since August 1998, the Internet Movie Database prevents the software from checking on their site. Apparently, someone misused my software, which put a tremendous load on their server. It would be easy for me to fool their protection mechanism, but this would also mean that no websites could protect themselves; I want to be a good netizen, and I don't want to make it too easy for people to misuse my software.

21. Why can't I connect to "secure" (https) sites?

If you have set your proxy correctly, try to connect with IE. If this doesn't work, read this usenet post for help. If this still doesn't work and you use Windows NT 4.0, install the latest NT service packs (up to SP5).

22. Why don't you include searching for orphan files?

"Orphan files" are files that are not linked at all. I cannot do it, because one isn't always able to access the directory, usually for security reasons, so that people don't know what files are actually available. Even if I implemented such a search on the local disk, it would be useless for the remote server.

23. Any known problems with Windows 95?

Some people have reported crashes. These problems were usually solved by installing IE 3.0 (or higher) or the following service packs:

One guy had problems with the WININET.DLL (v. 4.70.1300) installed with OEM Windows 95 (v. 95 4.00.950 C). Changing to version 4.70.1335 solved the problem; he said he found it at ftpsearch.lycos.com

A simpler solution is to go to http://windowsupdate.microsoft.com and install whatever they tell you (you need to have IE 4.0 or higher on your system)

24. Any known problems with Windows 2000?

Although I received many reports that it runs fine, one user reported a problem and a solution:

Windows 2000 automatically sets a configuration option to use HTTP 1.1 for connecting to web sites. Many, many web sites do not use that version but continue to use HTTP 1.0, so the automatic setting may prevent connections. This is the reason why Xenu would not run for me. When I disabled that setting, Xenu performed properly.

To disable that setting: Control Panel -> Internet Options -> Advanced (tab) -> HTTP 1.1 settings (list heading) -> Use HTTP 1.1 (checkbox: uncheck it)

25. Why can't I configure the timeout?

Because I can't... Microsoft Windows has a bug which prevents me from letting users configure it.

26. What about JavaScript?

The software does not check links generated by JavaScript, because JavaScript is a programming language, not a formatting language. This makes web pages dynamic; they e.g. depend on a mouse movement from minutes ago. While it would probably be easy to check JS links like javascript:newWindow('../popup/glossary.html#xenu') the problem is that not all JavaScript links are done this way. Many authors supply their own newWindow() function. If you have an idea for an easy solution, e-mail me.
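For the easy case mentioned above, a Python sketch could look like this. It is an assumption about one possible approach, not a feature of Link Sleuth; target_of_js_link is a hypothetical name, and the pattern only recognizes a function literally called newWindow with a quoted first argument.

```python
import re

# Matches javascript:newWindow('...') or javascript:newWindow("...").
JS_NEW_WINDOW = re.compile(
    r"""^javascript:\s*newWindow\(\s*(['"])(.*?)\1""", re.IGNORECASE)

def target_of_js_link(href):
    """Return the quoted URL inside a javascript:newWindow(...) link, else None."""
    match = JS_NEW_WINDOW.match(href.strip())
    return match.group(2) if match else None

print(target_of_js_link("javascript:newWindow('../popup/glossary.html#xenu')"))

# A site-specific function name is not recognized, which is exactly
# the limitation described above.
print(target_of_js_link("javascript:openPopup('../popup/glossary.html')"))
```

Any link built at run time, or through a differently named function, would still be missed by such a pattern.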

27. What about passwords entered in a FORM?

The software is not able to enter passwords in a FORM. I just don't see a way to accomplish this easily. I assume it is possible if one combines a set of variable names, values, and a web page that would accept them with a POST command. I have not even taken the time to investigate how others do it; if you have an idea for an easy solution, e-mail me.
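The combination described above (variable names, values, and a page that accepts them via POST) can be sketched in Python as follows. This only illustrates the general idea and is not something Link Sleuth does; build_login_request, the URL and the field names are made up for the example, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_login_request(action_url, fields):
    """Build (but do not send) a POST request carrying the form fields."""
    # Encode name=value pairs the way a browser submits a simple FORM.
    body = urlencode(fields).encode("ascii")
    return Request(
        action_url,
        data=body,  # supplying data makes urllib use the POST method
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

# Hypothetical form: the field names depend entirely on the site's FORM.
request = build_login_request(
    "http://www.host.com/login.cgi",
    {"user": "alice", "password": "secret"},
)
print(request.get_method(), request.data)
```

Passing the request to urllib.request.urlopen would submit it; a real site may also need cookies or hidden fields, which this sketch ignores.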

28. How about a WAP version?

This has been available since February 2001; it is currently available as a beta version in the file area of the LinkSleuthUpdates mailing list at Yahoo Groups.


Bug List

The software works pretty well, but here is the list of things that shouldn't be. If you find another bug, e-mail me a description, include the URL you are checking, and if possible save your work in a .XEN file and attach it. Also check http://windowsupdate.microsoft.com to make sure that your system has all the updates. If you want to e-mail a suggestion, click here.


Future feature List

Things I will do in the future (maybe when hell freezes over!):

The Story of Xenu's Link Sleuth (TM)

(for fellow software developers)

In April and May 1997 my employer assigned me to an out-of-town job, because another department needed a guy with MFC experience. So from Monday to Friday I was away, and in the evenings I was bored to death. Every weekend I was back home, and I usually checked my web site for broken links with WebAnalyzer. Sadly the software had a lot of bugs, and their support was ignoring my e-mails, and I was mad as hell, as I had spent quite a lot of money on a product that wasn't worth it. My job was also my first contact with VC++ 4.2 (previously I had only worked with VC++ 1.5, because our customers have a lot of 16-bit systems), which had some easy-to-use Internet access classes. I already had experience with WINSOCK programming, but these classes would spare me a lot of time evaluating HTTP result headers and other annoying stuff. One evening, after an excellent Italian meal with a good Chianti, I took some hotel letter paper and wrote down a concept for checking links. A month later I took some time to install the development software on my computer and started working, with the help of that hotel-room concept. The work was done on some evenings, but mostly on weekends, when I had more time.

My philosophy on software development has always been "smaller, simpler, cheaper", long before NASA realized this. Because of that, I need no fancy (but totally useless) graphics like in WebAnalyzer. Just results. And they'd better be 100% correct or I'd have to kill myself :-)

[Visual C++ icon] The application is written in Visual C++, and uses the MFC classes as much as possible: CDocument, CView, CListView, CObArray, CMapStringToOb, CArchive, CInternetSession, CHttpFile, etc. That saved me a lot of time!


Credits

Icons in EXE file: Martin Hunt and Paul Campbell; Icon on web page: Erik Plummer; Idea to use banners in report: Marc Cross; Xenu logo: Fred C.; Volcano animated cursor: Juan C. Pradas-Bergnes; Idea & help with SMTP integration: Mark Findlay; SMTP class: P.J. Naughter; Xenu artwork: William C. Chenoweth

Trademarks

Xenu, Xenu's Link Sleuth and Link Sleuth are trademarks used by Tilman Hausherr for software products and services. These products are not associated in any way with services licensed by RTC, CoST, BPI, CSI, etc.

[Mozilla Open Directory Cool Site Award][ZDNet 4 stars Editor's pick][Nonags 6 best][Listsoft cool][Completely free software, five doves award]


tilman@berlin.snafu.de