HTML to TXT Conversion
Software that processes textual data often cannot handle HTML files directly, even though they are the most common files on the web. The programs below convert HTML into plain text; many of them support batch conversion. To get an idea of what these programs can do, click the "Conversion Result" links to see the plain-text version of the current HTML page as produced by each program. NB: These conversions are usually much better than those done with Microsoft™ Word™ (Conversion Result of Word™).
- Web2Text
Web2Text is a bare-bones program for batch converting HTML files into TXT. It lets you configure the most important options (such as line length) and yields decent results.
(Conversion Result)
- Detagger
Detagger offers a few more customization options than Web2Text, so check whether you need them (for example, restricting the output file to ASCII characters only). You can test a fully functional version of this shareware, which currently costs $20 (US).
(Conversion Result)
- HTMLtoTXT
If you do not need HTML paragraph marks to be reproduced in the ASCII file, try the freeware HTMLtoTXT.
(Conversion Result)
- Markup Remover
This Windows 3.11-style tag remover was shareware and had some useful customization facilities; most importantly, it could convert to ASCII, ISO 8859-1, and ANSI (for UNIX). Unfortunately, the program is no longer available.
(Conversion Result)
- Microblast HTML to TEXT
Microblast's HTML to TEXT (shareware, US-$ 10) has the most intuitive interface but yields mediocre results at best. It is not customizable: not even the line breaks can be configured. Its "Open" and "Save" dialogs do not follow the Windows™ standard (there are no standard file type filters), and batch conversion is not implemented.
(Conversion Result)
- NoteTab
NoteTab is not a stand-alone detagger but a full-fledged ASCII/HTML editor. The shareware fee of US-$ 19.95 pays for itself quickly: NoteTab transforms HTML into plain text very effectively, its results are very clean, and it can batch convert many files.
(Conversion Result)
- HTML Markdown
HTML Markdown was written for the PowerMac.
- more HTML converters
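The programs above do not expose their internals, so the following is only a minimal Python sketch of the general technique they share: strip the tags, keep paragraph breaks, wrap to a configurable line length, and optionally restrict the output to ASCII. The names `TextExtractor`, `html_to_text`, and `convert_directory`, the choice of block-level tags, and the assumption that input files are UTF-8 are all illustrative and not taken from any of the listed tools.

```python
import textwrap
from html.parser import HTMLParser
from pathlib import Path

# Tags treated as paragraph boundaries (an illustrative selection).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose content is never visible to a reader.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collects visible text, starting a new paragraph at block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes &amp;, &eacute;, ...
        self.paragraphs = [""]
        self.skip_depth = 0

    def _break(self):
        # Open a new paragraph only if the current one holds visible text.
        if self.paragraphs[-1].strip():
            self.paragraphs.append("")

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._break()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._break()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.paragraphs[-1] += data


def html_to_text(html, width=72, ascii_only=False):
    """Return plain text with paragraphs wrapped to `width` columns."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    wrapped = []
    for paragraph in parser.paragraphs:
        collapsed = " ".join(paragraph.split())  # normalize whitespace
        if collapsed:
            wrapped.append(textwrap.fill(collapsed, width=width))
    text = "\n\n".join(wrapped)
    if ascii_only:
        # Non-ASCII characters become "?" rather than being dropped silently.
        text = text.encode("ascii", errors="replace").decode("ascii")
    return text + "\n"


def convert_directory(src_dir, dst_dir, width=72, ascii_only=False):
    """Batch convert every .htm/.html file in src_dir; assumes UTF-8 input."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).glob("*.htm*")):
        html = src.read_text(encoding="utf-8", errors="replace")
        target = dst / (src.stem + ".txt")
        target.write_text(html_to_text(html, width, ascii_only), encoding="utf-8")
        written.append(target)
    return written
```

Calling `convert_directory("pages", "plain", width=60, ascii_only=True)` would write one `.txt` file per HTML file found in a hypothetical `pages` folder. Real pages with tables, nested lists, or unusual encodings need more care than this sketch gives them, which is where the dedicated programs earn their keep.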
PDF to TXT
- verypdf PDF2TXT
Shareware conversion tool that batch converts PDF documents into ASCII.
- Advanced PDF Manager
Shareware for managing PDF files; it includes a batch conversion facility that turns PDF files into plain text.
- pdftotext
Freeware with a command-line interface, part of the Xpdf package.
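Because pdftotext is a command-line tool, batch conversion is a matter of calling it once per file. The sketch below assumes that the `pdftotext` executable is installed and on the PATH; the helper names `pdftotext_command` and `convert_pdfs` are hypothetical. It relies only on the basic invocation `pdftotext input.pdf output.txt` and the `-layout` option, which asks the tool to preserve the original physical layout as far as possible.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, keep_layout=False):
    """Build the argument list for one pdftotext call (nothing is run here)."""
    pdf_path = Path(pdf_path)
    command = ["pdftotext"]
    if keep_layout:
        command.append("-layout")
    # The text file is written next to the PDF, with a .txt suffix.
    command += [str(pdf_path), str(pdf_path.with_suffix(".txt"))]
    return command


def convert_pdfs(folder, keep_layout=False):
    """Run pdftotext on every .pdf file in `folder`; return the .txt paths."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext was not found on PATH")
    converted = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        command = pdftotext_command(pdf_path, keep_layout)
        subprocess.run(command, check=True)  # raises if pdftotext fails
        converted.append(Path(command[-1]))
    return converted
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.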
Multipurpose Converters
- ABC Amber Textconverter
This shareware conversion tool converts between many major file formats, namely:
- ANSI (.txt)
- Unicode (.txt)
- Rich Text Format (.rtf)
- Microsoft™ Word™ (.doc)
- Corel™ WordPerfect™ (.doc)
- Lotus™ AmiPro™ (.ami)
- Microsoft™ Excel™ (.xls)
- Lotus™ 1-2-3™
- Adobe™ Portable Document Format™ (.pdf).
- Microsoft Word 2002 and later versions contain a batch conversion wizard.