extracting text from docx files

Anton Shterenlikht mexas at bristol.ac.uk
Thu Aug 11 11:23:15 UTC 2011


On Thu, Aug 11, 2011 at 12:14:51PM +0200, Polytropon wrote:
> On Tue, 9 Aug 2011 21:16:11 +0200, Christian Barthel wrote:
> > On Tue, Aug 09, 2011 at 02:36:32PM +0100, Anton Shterenlikht wrote:
> > > I often receive information in *.docx format
> > > from my MS using colleagues. Sometimes I can
> > > ask for a pdf (or similar) instead, but not always.
> > 
> > You have a lot of nice options: 
> > - Force them to use BSD/Linux ;)
> > - explain them, why docx is shit!
> > - don't read it
> 
> I also suggest to combine this with reading the following
> article:
> 
> http://en.nothingisreal.com/wiki/Please_don't_send_me_Microsoft_Word_documents
> 
> It's very polite and precise about why using "DOC" files
> is generally a bad idea. It can be easily concluded that
> it also applies to "DOCX" files.
> 
> The document also discusses alternatives.

That's not my war. I'm not going to achieve
much by telling all our admin and academic
staff that what they were taught throughout
their careers might not be the ideal, or even
the only, tool in the universe.
Sometimes I can request pdf, sometimes I fail.

I also sometimes try to get pdf from various
UK govt departments. Sometimes they only
make documents available in MS formats.
Again, sometimes they respond well, but
mostly, they ignore my requests.

By the way, I tried AbiWord, and it couldn't
open my docx.

-- 
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423


More information about the freebsd-questions mailing list