extracting text from docx files

Kurt Buff kurt.buff at gmail.com
Tue Aug 9 17:25:32 UTC 2011


On Tue, Aug 9, 2011 at 06:36, Anton Shterenlikht <mexas at bristol.ac.uk> wrote:
> I often receive information in *.docx format
> from my MS-using colleagues. Sometimes I can
> ask for a pdf (or similar) instead, but not always.
>
> Usually I unzip a docx and then search
> through all *.xml files to find the
> useful data. However, I can't find any
> xml styles to use, so I have to convert
> the relevant xml file(s) to plain text
> by hand. I wonder if anybody can suggest
> a better way. Perhaps there's something
> in ports that can help.

My installation of OpenOffice 3.3 on my Win7 machine will open a
Winword 2010 .docx file.

I'm guessing it will do the same on FreeBSD, but I don't have an
install with a GUI running at the moment.
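
Where no office suite is available, the unzip-and-search approach from
the original question can probably be scripted instead. The following is
only a minimal sketch, not a tested converter: it assumes the body text
lives in word/document.xml inside the archive and that paragraphs and
text runs use the usual WordprocessingML w:p and w:t elements. The
function name docx_to_text is made up for illustration.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main WordprocessingML elements (w:p, w:t, ...).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(path):
    """Return the paragraph text of a .docx file, one paragraph per line.

    Hypothetical helper: assumes the body is stored in word/document.xml.
    """
    # A .docx file is a zip archive; read the main document part from it.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter("{%s}p" % W_NS):
        # Each paragraph is split into runs; join the text of every w:t.
        runs = [node.text or "" for node in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(docx_to_text(sys.argv[1]))
```

Saved as, say, docx2txt.py, it would be run as
"python docx2txt.py report.docx > report.txt". Headers, footers,
footnotes and comments are stored in other parts of the archive and are
ignored here, and text inside text boxes may show up more than once.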

Kurt

