extracting text from docx files

Rod Person rodperson at rodperson.com
Tue Aug 9 13:40:28 UTC 2011


On Tue, 9 Aug 2011 14:36:32 +0100
Anton Shterenlikht <mexas at bristol.ac.uk> wrote:

> Usually I unzip a docx and then search
> through all *.xml files to find the
> useful data. However, I can't find any
> xml styles to use, so I have to convert
> the relevant xml file(s) to plain text
> by hand. I wonder if anybody can suggest
> a better way. Perhaps there's something
> in ports that can help.

You could try docx2txt for plain-text conversion:
http://docx2txt.sourceforge.net/
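
If you would rather script the unzip-and-parse approach you describe,
something along these lines might work. This is only a rough sketch in
Python, not how docx2txt does it. It assumes the body text lives in
word/document.xml inside the archive and that the file uses the usual
WordprocessingML namespace. The function name docx_to_text is made up for
illustration. Headers, footers, footnotes and text boxes are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: the standard WordprocessingML namespace used by .docx files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
PARA, TEXT, TAB = W_NS + "p", W_NS + "t", W_NS + "tab"
BREAKS = (W_NS + "br", W_NS + "cr")


def docx_to_text(path):
    """Return the body text of a .docx file, one line per paragraph.

    Hypothetical helper: reads only word/document.xml.
    """
    # A .docx file is a zip archive; the main body is one XML member.
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(PARA):
        parts = []
        for node in para.iter():
            if node.tag == TEXT:
                parts.append(node.text or "")
            elif node.tag == TAB:
                parts.append("\t")
            elif node.tag in BREAKS:
                parts.append("\n")
        paragraphs.append("".join(parts))
    return "\n".join(paragraphs)
```

Calling print(docx_to_text("report.docx")) would then dump the text to
stdout, where report.docx is a placeholder file name.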

-- 
Rod

