Formatted text conversion

Roland Smith rsmith at
Wed May 27 19:14:54 UTC 2009

On Wed, May 27, 2009 at 08:41:56AM -0700, Kelly Jones wrote:
> I have e-books in several formats (DOC, LIT, PDF, RTF, HTML, TXT,
> etc). Is there a Unix command-line tool that converts between these
> formats?

Not a single tool, although some conversions are possible using
different tools. The applications below are ports under /usr/ports
unless stated otherwise. Ports marked with * are those that I've used
with reasonable results myself.

RTF  -> HTML: textproc/rtf2html or textproc/unrtf
TXT  -> HTML: I've used a simple Perl script to do this in the past, but
              I guess textproc/txt2html does something similar.
TXT  -> PDF:  print/nenscript or print/enscript-letter to make PostScript
              files from text, then ps2pdf from print/ghostscript8 to
              create PDF from the PostScript files. *
PDF  -> HTML: pdftohtml from graphics/poppler-utils *
HTML -> PDF:  Firefox supports printing to a PDF file.
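
The two-step TXT -> PDF route above can be sketched as a small Python
wrapper. This is only an illustration under stated assumptions: the
function names are made up for this example, the text-to-PostScript
binary is assumed to be installed as "enscript" (print/nenscript
installs "nenscript" instead), and "-p" is used to name the PostScript
output file.

```python
import subprocess
from pathlib import Path


def txt_to_pdf_commands(txt_path):
    """Build the two commands for the TXT -> PostScript -> PDF route.

    Hypothetical helper: the name and return shape are illustrative.
    """
    src = Path(txt_path)
    ps = src.with_suffix(".ps")    # intermediate PostScript file
    pdf = src.with_suffix(".pdf")  # final PDF file
    return [
        ["enscript", "-p", str(ps), str(src)],  # text -> PostScript
        ["ps2pdf", str(ps), str(pdf)],          # PostScript -> PDF
    ]


def txt_to_pdf(txt_path):
    """Run both steps; stop if the first one left no PostScript file."""
    to_ps, to_pdf = txt_to_pdf_commands(txt_path)
    # The exit status of the first step is not checked here (assumption:
    # a missing output file is a good enough failure signal for a sketch).
    subprocess.run(to_ps)
    if not Path(to_ps[2]).exists():
        raise RuntimeError("no PostScript output from " + to_ps[0])
    subprocess.run(to_pdf, check=True)
```

Calling txt_to_pdf("book.txt") would leave book.ps and book.pdf next to
the input file. Building the argument lists separately from running them
makes it easy to print the commands first and check them by hand.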

It seems LIT files are based on Microsoft's CHM format. Maybe
textproc/chm2pdf will convert them to PDF?

There is an open-source tool for e-books (LIT format, among others): It is not available via ports though.

> If not, is there at least a tool that converts these formats to TXT?

DOC  -> TXT: textproc/antiword *
HTML -> TXT: textproc/html2text
PDF  -> TXT: pdftotext from graphics/poppler-utils *
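
The three to-text conversions above could be wrapped in one small
dispatcher keyed on the file extension. Again a sketch, not a finished
tool: the function names are invented for this example, and it assumes
that antiword and html2text write the converted text to standard output
while pdftotext takes the output file name as its second argument.

```python
import subprocess
from pathlib import Path


def to_text_command(path):
    """Pick the converter for a file based on its extension.

    Returns (argv, capture_to). capture_to is the file that should
    receive the tool's standard output, or None when the tool writes
    the output file itself. Hypothetical helper for illustration.
    """
    src = Path(path)
    out = src.with_suffix(".txt")
    ext = src.suffix.lower()
    if ext == ".doc":
        return ["antiword", str(src)], out
    if ext in (".html", ".htm"):
        return ["html2text", str(src)], out
    if ext == ".pdf":
        return ["pdftotext", str(src), str(out)], None
    raise ValueError("no to-text converter listed for " + ext)


def convert_to_text(path):
    """Run the chosen converter and leave a .txt file next to the input."""
    argv, capture_to = to_text_command(path)
    if capture_to is None:
        subprocess.run(argv, check=True)
    else:
        with open(capture_to, "wb") as fh:
            subprocess.run(argv, check=True, stdout=fh)
```

Formats that are not in the list (LIT, RTF) raise a ValueError rather
than being guessed at.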


P.S. A lot of public domain e-books are available in different formats
via Project Gutenberg.
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)

More information about the freebsd-questions mailing list