A paper recently published by a couple of my colleagues was converted from a Word .docx file to a LaTeX file. We tried a whole bunch of the more common converters, and in the end could not get a good result without considerable manual labour. I have almost come to the conclusion that it is easier to just take the text as unformatted ASCII and add the formatting by hand.


The document, published at, looks at how questions in tests, in this case a very widely used diagnostic test from Physics, are not independent.  I have talked briefly about it before.

The text does not contain any figures, but it does contain several tables.  Tables are very rarely converted well by any of the usual tools.  I tried importing the file into LibreOffice and then saving it as LaTeX, and did the same with AbiWord.  word2latex gave a similar result.  All of them got some things right, but all added enough stuff that I had to undo manually that the benefit was minor, especially because I had to fit the document into the journal’s document class (REVTeX 4.1, actually), which means that the preamble had to be put together by hand.  So once I had to sort out the preamble manually, the tables manually, and much of the other text manually, I almost decided I might as well just get unformatted text out of Word and mark it up myself.  But in the end I used chunks of the LibreOffice converter output, cutting and pasting blocks of ‘good’ conversion in when it was sensible to do so and converting by hand when it wasn’t.
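For tables that end up being marked up by hand, a small script can take some of the tedium out of it. The following is only a rough sketch, not what I actually used: it assumes the table has been copied out of Word as plain text with tab-separated cells, one row per line, and that the first row is the header. The function names `escape_latex` and `tsv_to_tabular` are made up for this illustration, and the output is a bare `tabular` that would still need to be adjusted for the journal’s class.

```python
# Sketch: turn tab-separated plain text (e.g. a table pasted out of Word)
# into a basic LaTeX tabular. Assumes the first row is a header row.

# Characters that LaTeX treats specially, with their escaped forms.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape special characters one at a time to avoid double escaping."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def tsv_to_tabular(tsv_text):
    """Return a LaTeX tabular built from tab-separated rows."""
    rows = [line.split("\t") for line in tsv_text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no table rows found")

    ncols = max(len(row) for row in rows)
    lines = [r"\begin{tabular}{" + "l" * ncols + "}", r"\hline"]
    for i, row in enumerate(rows):
        cells = [escape_latex(cell.strip()) for cell in row]
        cells += [""] * (ncols - len(cells))  # pad short rows
        lines.append(" & ".join(cells) + r" \\")
        if i == 0:
            lines.append(r"\hline")  # rule under the header row
    lines.append(r"\hline")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "Item\tScore\nQ1 & Q2\t50%\n"
    print(tsv_to_tabular(sample))
```

All columns come out left-aligned, and anything like merged cells or footnotes would still have to be fixed by hand.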

A bit laborious.

I explored my options pretty thoroughly, converting the document in quite a few different ways, so I conclude that at this time none of the free and open conversion tools is really up to the task.  I can’t speak for the proprietary ones.



Oh well.


