extracting text from docx files
ait at p2ee.org
Tue Aug 9 20:40:15 UTC 2011
On Tue, Aug 9, 2011 at 3:57 PM, Antonio Olivares
<olivares14031 at gmail.com> wrote:
>> But if you really, really need to read docx, you can try the web
>> application from Microsoft. A few months ago I also got a lot of docx
>> files and opened them with the Microsoft web app; this worked for me to
>> extract the information...
Just a thought, but since docx is XML under the hood (a ZIP archive whose
main body lives in word/document.xml), why not find or build some XSLT
that extracts what you need into another format?
You probably already have libxml2 and libxslt on your system, along with
the command-line utility xsltproc.
There are probably existing XSLT stylesheets that transform it to RTF or
plain text.
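
To show how little is needed once you treat the file as XML, here is a rough sketch in Python. Note that it does not use XSLT: it swaps in the standard library's zipfile and ElementTree modules to walk the same XML an XSLT stylesheet would match on. It assumes the body text is in word/document.xml under the usual WordprocessingML namespace; the function name docx_to_text is made up for illustration, and headers, footnotes, tabs and line breaks are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (assumed; used by word/document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx, one line per paragraph.

    `source` is a file path or a binary file-like object.
    Hypothetical helper for illustration, not part of any library.
    """
    # A .docx is a ZIP archive; the main body is a single XML part.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Every <w:p> is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Calling docx_to_text("report.docx") (the filename is only an example) returns the text as one string. An XSLT stylesheet run through xsltproc would do the same job by matching w:p and w:t after unzipping word/document.xml.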
More information about the freebsd-questions