
A simple pandoc/markdown cheat sheet

pandoc is a tool suite for converting between text formats. I find it useful as a writer (rather than a techie) to convert something written in plain text into an RTF file that I can read readily, with some formatting, in a word processor, up to and including the dreaded Word. What I outline here is a tiny fraction of what it can do.

It is as simple as:

$ pandoc -s file.txt -o file.rtf

(Replace ‘rtf’ with html etc to get something else.)

It can convert a file of nothing but typing, but if you use a little simple formatting, pretty much a version of markdown syntax, you can get a nice RTF with some useful extra features (italics, for example). For a heading, underline with equals or hyphens on the very next line;

Heading here
============

Heading here
------------

when processed, that gives a heading. (Though for proper H1 H2 etc, see below.)

Paragraphing is determined by empty lines, so this text gives one paragraph in the RTF, and…

This text
also gives
one paragraph
in the RTF.

If I want to force a return, I end a line with a backslash.

This will not work inside back ticks (see below), which are the verbatim environment. So, for example, if I want a bunch of stuff verbatim, without interparagraph spacing, I end each line with a backslash outside the back ticks:

`line 1`\
`line 2`\
`line 3`

will give

line 1
line 2
line 3

Formatting fonts

The word is the character used to delimit the modified text. The code, verbatim, is also given.

underscore (italic) _underscore_

dollar (math, which does not translate into RTF) $dollar$

caret (superscript) ^caret^

tilde (subscript) ~tilde~

2 tildes (strikeout) ~~tilde~~

2 underscores (bold) __2 underscores__

2 asterisks (bold) **2 asterisks**

3 asterisks (bold italic) ***3 asterisks***

back tick (verbatim) `back tick`

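Note that when using back ticks (verbatim) the nesting of the instructions matters:

`***back tick outside 3 asterisks***` (the asterisks appear literally)

***`3 asterisks outside back tick`*** (bold italic verbatim text)

**`2 asterisks outside back tick`** (bold verbatim text)

*`1 asterisk outside back tick`* (italic verbatim text)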

Other stuff

(Note, line above (‘Other stuff’) was underlined with hyphens on the immediate next line.)

nonbreaking space – precede a space with a backslash: ‘\⌴’ (I’m using ⌴ for a space)

block quote: > (greater than introduces a block quote indented by 1 tab)

block quote, again: >> same, bigger indent

So, for example

>This text would appear as a block quote indented by 1 tab.

H1 (1 hash, #)

So type:

# Here is my heading 1

H2 (2 hashes, ##)

## Here is my heading 2

H3 (3 hashes, ###)

### Here is my heading 3

  • bullet point (+ at start of line, then space)
  • bullet point (- (hyphen) at start of line, then space)
  • bullet point (* at start of line, then space)
  1. Numbered list
  2. Numbered list still
  3. Numbered list again

(just start a line with a number and a stop)

– 2 hyphens for en rule

— 3 hyphens for em rule


Some formats (eg HTML) like metadata to be specified. Simplest is to put

% title
% author(s)
% date

at the very top of the file (replace ‘title’, ‘author’ and ‘date’ with your own text).
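As a rough sketch of the title block in use, the following writes a small sample file and converts it to a standalone HTML page. The file names (sample.txt, sample.html) and the title, author and date are made up for illustration, and the conversion step only runs if pandoc is installed.

```shell
# Write a small sample file; the three % lines must be the very first lines.
cat > sample.txt <<'EOF'
% My title
% A. N. Author
% 1 January 2015

# Here is my heading 1

Some _italic_ text and some **bold** text.
EOF

# Convert to a standalone HTML page (skipped if pandoc is not on the PATH).
if command -v pandoc >/dev/null 2>&1; then
  pandoc -s sample.txt -o sample.html
fi
```

Swap the extension in the output name (rtf, docx and so on) to get a different format.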

Chalk it up.

gnu diction — find wordy and commonly misused phrases in texts

The GNU program diction searches text files for known poor constructions, using a database of phrases. (See also GNU style.)

I installed it from GetGnuWin32.

For example:

$ diction -s makeemf.txt 
makeemf.txt:5: I have all the poppler stuff installed, [so -> (do not use as intensifier)]:

makeemf.txt:13: I use pdfcrop from TeXLive pdfcrop --margins "-52 -250 -20 -25" page4.pdf page4-crop.pdf [So -> (do not use as intensifier)] now I have my PDF.

makeemf.txt:17: [Can -> (do not confuse with "may")] I import it into Word?

3 phrases in 14 sentences found.

The things to be checked are enclosed in square brackets. The -s causes diction to provide suggestions.

Tools to be used along with it might include unrtf, unhtml, libreoffice command line, and other tools that convert marked-up and formatted files to plain text.

Look here for interesting examples of how to make a custom style to find things in your documents:

The diction databases are, in my case, stored in:


I imagine on a Linux system they’d be in /usr/share, but I have not checked. Anyway, typically a file looks like this:

$ head /cygdrive/c/Users/darren/installs/getgnuwin32/GetGnuWin32/gnuwin32/share/diction/en
 a considerable amount of	much
 a large number of	many
 a lot of	Often obsolete, should sometimes be replaced by "many"
 a majority of	most
 a man who
 a matter of concern	(cliche, avoid)
 a need for	need
 a number of	many, several
 a particular preference for
 a small number of	few
(and so on)

Note that the left is the phrase to check for — including any leading spaces — and then after a tab (must be a tab to distinguish it from spaces within the phrase) comes the suggestion, if there is one. Very simple!

You could easily make up a diction file to look for your pet hates or common errors.
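A minimal sketch of such a personal database follows. The file names (mydiction, draft.txt) and the two phrases are invented for illustration. It assumes the -f option (read a user database) and the -n option (skip the default database) described in the GNU diction manual, so check your own version before relying on them.

```shell
# A tiny personal database: phrase, then a tab, then the suggestion.
# printf turns \t into a real tab and \n into a newline.
printf ' in order to\tto\n'             > mydiction
printf ' at this point in time\tnow\n' >> mydiction

# A made-up draft to check.
printf 'We met in order to talk. At this point in time we are done.\n' > draft.txt

# -f adds the user database; -n leaves out the default one (assumed options).
if command -v diction >/dev/null 2>&1; then
  diction -s -n -f mydiction draft.txt
fi
```

Writing the entries with printf makes the tab explicit, which matters because many editors quietly turn tabs into spaces.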


Wherever you go, there you are.

LibreOffice command line — fabulously useful

The open source office suite, LibreOffice, is not only a very fine piece of software, it has some capabilities that leave many commercial products in the shade. One of them is a very powerful command line. It can do a lot of stuff without ever loading up the GUI. That makes it useful for batch processing of files and also for file format conversions.

For example, imagine you want to convert a Word (docx) file to PDF. You can open it in Word or LibreOffice and save to PDF. Or, at least on Linux (but Windows can do the same if you put the LibreOffice executables into your PATH), you can type:

$ libreoffice --convert-to pdf test.docx

(The $ is the prompt.) This will produce test.pdf, which you can then view or use as you see fit.

I have recently been playing around on Linux without the GUI (because I have an old netbook that gets a bit slow when lots of GUI stuff is loaded up).

Say I want to view a Word (or LibreOffice, or … whatever) document. My first port of call is a command like that above. Then I can type:

$ fbgs -xxl test.pdf

And I can view the PDF in the framebuffer, no X windows, no GUI, no lag. This is what it looks like:

[Screenshot: a page of a Word document about pandoc, rendered as PDF by LibreOffice and viewed using fbgs]

But of course this is really a corner case. Such a conversion is most likely to be useful when you want to convert a lot of files from one format to another. Opening them in the GUI and saving them one by one would be a real pain. In Linux you could just type:

$ for f in *.docx ; do libreoffice --convert-to pdf "$f" ; done

and it would convert all the docx files in the directory.
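LibreOffice will also accept several files in a single call, which avoids starting it once per file. The sketch below assumes the --outdir option and an output folder called pdf (an arbitrary name chosen for this example). It does nothing if there are no docx files or if libreoffice is not on the PATH.

```shell
# Put the PDFs in their own folder (the name "pdf" is arbitrary).
mkdir -p pdf

# Load the matching file names into "$@" so names with spaces stay intact.
set -- *.docx

# Only run if at least one .docx exists and libreoffice is on the PATH.
if [ -e "$1" ] && command -v libreoffice >/dev/null 2>&1; then
  libreoffice --convert-to pdf --outdir pdf "$@"
fi
```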

Now, of course, the quality of the conversion is limited by LibreOffice’s ability to interpret the file format of interest. But I have found it useful.


Some random BibTeX bibliography style examples

Here we have some basic BibTeX bibliography styles, with a few of the key distinguishing features pointed out.

First, good old unsrt.bst

[Image: an unsorted bibliography style]

Then, chicago (which works with natbib)

[Image: Chicago Manual of Style]

The Harvard-ish agsm style

[Image: the agsm style, derived from Harvard (author-year)]

And, last, an APA-based style (apalike, which also works with natbib)

[Image: another author/year style]


Well howdy

Science writing versus scientific writing

Scientific writing has been shifting away from the third-person passive towards the active first person, so that:

The boiling point of the compound was determined.

becomes:

We determined the boiling point of the compound.

Now, for science writing (ie popular science), this is fine. For scientific writing, I am not so sure.

What do I mean by scientific writing?

I mean writing that has the same qualities as science itself. This is different from writing about science itself, or even writing about the results of science in a non-scientific way. We might call these latter two things ‘science writing’.

Science, to me, ought to be:

  • precise: even when uncertain, it should be precise about what is highly likely (given nothing can be ‘proven’), likely, possible or unlikely. In other words, science aims for precise and accurate results, but regardless of the precision and accuracy of the results it is always precise about its degree of certainty
  • dispassionate: the whole point of science is that what humans think is not relevant to the correctness of a theory (even though what humans think will have led to the theory). The validity of an idea is tested against the universe, not against what people think; an experiment is a means of testing whether an idea is consistent with the way the universe operates

In my mind, scientific writing has these same attributes.

With scientific knowledge, it does not matter who did the experiment, only that it was done well so that the experiment really does (within whatever limits) test the idea’s validity. Of course, some people have a wonderful ability to think of an experiment no-one else has ever done. But once it has been done and written up, if repeated it should yield the same results.

So I don’t like the ‘We’ in the sentence above. It puts the experimenter at the focus — after all, they are now the subject of the sentence — when previously the result (the boiling point of the compound) was the subject. And the boiling point is what matters, and what (if done well) ought to be a valuable result for others to draw on.

Personalising scientific results allows a theory or experiment to be discredited by discrediting the theorist or experimenter. It puts a scientific result closer to the plane of an opinion or ideology than it ought to be, so making it easier to argue away. Science is the opposite of ideology. Ideology is the use of a framework of ideas to make decisions for you en masse and so avoid having to think. I’m not saying scientists never do that (they are humans), but when they do they are not doing science.

If the sense that science is objective (as much as any human activity can be) was more prevalent in the wider world, it would be harder for (for example) climate change denialists to get traction. And I can’t help thinking that maybe that objectivity ought to be embedded in the language of science, and that if we take it out we’re implicitly signalling that science is something less important and useful and relevant and non-ignorable than it is.

So while I can understand the shift to the more active and immediate in writing, and I agree with it in most cases, I find myself not so in favour of it when talking about science and its outputs. (Having said that, I’ve written plenty of papers that use ‘we’, sometimes at the behest of a coauthor, sometimes because working around it was just so clunky and wordy; but always with a nagging dis-ease.)

I guess I’m just as inconsistent as anybody.

Rant over.

Removing horizontal lines in Word

Soooo….. I typed a bunch of hyphens then hit Enter, and Word drew me a line. Fine, I wanted that. It separated what was done from what was ‘in process’. Now I want to get rid of the line.

Highlight and delete — no.

Highlight and use the border menu to choose none — no! I tried this suggestion, but without success.

I could move the line up and down, but I could not delete it. But then I could!

[Screenshot: some text, then an empty line, then a row of about ten hyphens. Type ten or so hyphens]

[Screenshot: after hitting Enter, the hyphens are converted into a horizontal line across the page, and we can see the little menu that allows us to undo the autoformatting]

At this point it is easy to remove the line: click the menu icon that comes up when Word automatically creates the line, then select Undo Border Line. (The same menu also offers Stop Automatically Creating Border Lines and Control AutoFormat Options.)

[Screenshot: the autoformat menu we get on producing the line]

But let’s say we want to keep the line for now and delete it later. We go on, typing some more text below the line. The autoformat menu icon disappears and does not come back.

[Screenshot: more text below the line]

Now, how can we get rid of the line? First, highlight it by keyboard or cursor.

[Screenshot: the line (well, just the first bit of it) selected]

Now, type Ctrl+Shift+N and the line goes away! This is the Word key binding for ‘Apply the Normal style’, which means you can get the same result by using the Styles pane or clicking Normal on the styles ribbon on the Home tab.

[Screenshot: text above and below, but no line! Fixed!]

If the Normal style does not work, you can also try the ‘Clear All’ option at the top of the Styles pane.

[Screenshot: the same document with the Styles pane opened]

No longer vexed

Pro-Cite 2.2 + WordPerfect 5.1 cheat sheet

From an old WordPerfect 5.1 document

This is a summary of part of the Basic guide to Pro-Cite, section ‘Creating a bibliography from a manuscript’.

1. Insert your citations in the WP document. Two ways:

  • author/date, like (Smith, 1994) or (Smith & Jones 1987)
  • record number, like (#324) where 324 is the record number in Pro-Cite.

2. Save the document, exit WP5.1.

3. Search for the citations in the document:

  • open Pro-Cite and the Pro-Cite database containing the records
  • in Main Menu, press M for search Manuscript
  • press O for Options and set appropriately
  • F10
  • select all database records
  • F to give the document File name (changing directory if need be), then Enter
  • in Main Menu, press M for search Manuscript, then Enter.

4. Generate the reference list to insert into your document:

  • go to Print menu and set your output format to WP5.1
  • Print the reference list
  • go to search Manuscript and ‘Clear manuscript order’
  • close Pro-Cite.

5. Insert the list into your document:

  • open your document in WP5.1, and move the cursor to where the list is to go
  • Shift+F3 to switch to the second document
  • open reference list, and copy it all
  • switch back and paste it in.


I prefer the 20th century.

Change name of author of existing comments in Word track changes

Let’s say you’ve got a document with tracked changes in it, and the reviewer’s changes and comments are supposed to be by a corporate name, say XYZ Editing (apologies if this business exists), but they have been put in as by Fred Smith, an employee of XYZ.

How do you change them?

Changing the author of comments is pretty straightforward.

Try the method at:

This uses a VBA script.

Press Alt+F11 to open the VBA editor. Paste the code and click ‘Run’.

Here is the suggested VBA code:

Sub ChangeAllAuthorNamesInComments()
  Dim objComment As Comment

  ' Change all author names in comments
  For Each objComment In ActiveDocument.Comments
    objComment.Author = "XYZ"
    objComment.Initial = "X"
  Next objComment
End Sub

So, this changes all comments to be by XYZ. It’s no good if you want to change only some.

Now, for the tracked changes, there seems to be no simple solution. None of the advice I saw was really simple.

But, let’s say I want to change every occurrence of Fred Smith to XYZ. The procedure below seems to work. (NOTE!!!! Your mileage may vary, I am not responsible for any butchering of your documents! Make backups! Caveat emptor. No promises, no liability etc.)

  1. Make a backup. Say, copy document.docx to document-mod.docx
  2. Open document-mod.docx in Word
  3. Save it as Word XML Document (*.xml), giving document-mod.xml
  4. Open document-mod.xml in a text editor, or if in Word be sure to edit it as plain text
  5. Do a global search and replace for Fred Smith replaced by XYZ. (Ditto for any other names that need changing.)
  6. Save
  7. Open document-mod.xml in Word and save as document-mod.docx
  8. Check that nothing’s gone funny, maybe using document compare tools and some spot checks

Note that this will also change any appearances of Fred Smith in comments, and possibly in the body text (which sits as plain text between the XML tags), so do run a few checks.

I used Vim as my editor, so my Vim command was just:

:%s/Fred Smith/XYZ/g

and all were changed.
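A narrower alternative is to change the name only where it appears as an author attribute, which leaves any Fred Smith in the body text alone. The sketch below uses sed on a made-up one-line sample; it assumes the author is stored as w:author="..." in the XML, so look at your own file first to confirm. The file names sample.xml and sample-mod.xml are invented for the example.

```shell
# A made-up one-line stand-in for Word XML; a real file is far larger.
cat > sample.xml <<'EOF'
<w:ins w:id="1" w:author="Fred Smith" w:date="2015-01-01T00:00:00Z"><w:r><w:t>Fred Smith wrote this.</w:t></w:r></w:ins>
EOF

# Replace the name only where it is the value of a w:author attribute,
# writing to a new file so the original is left alone.
sed 's/w:author="Fred Smith"/w:author="XYZ"/g' sample.xml > sample-mod.xml

# Show the result: the attribute changes, the body text does not.
cat sample-mod.xml
```

The equivalent in Vim would be the same pattern and replacement inside :%s/.../.../g.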

Note also that this will not change reviewer initials, so it is better to do the comments change first using the VBA script, then do this to change the tracks.

Here’s a screenshot to add variety to this posting:

[Screenshot: the Save As options in Word. Choose Word XML]

This is what a file might look like in Vim:

[Screenshot: the XML output in Vim, with lots of tags in angle brackets. Plain text, but otherwise pretty obscure]

Best of luck!

New e-learning modules about writing and editing – please help! Fabulous prizes to be won!

At Biotext, where I work, we are developing e-learning modules about writing, editing and presenting information, and we need your help.

To make sure we design modules that are as useful and accessible as possible, we’d love to hear about your preferences. Please complete our brief survey to tell us how you might use the modules and what topics you would like to see.

The modules will be part of the Australian manual of scientific style (AMOSS), which is being expanded to include general content and will be renamed the Australian manual of style (AMOS). The modules will include text, video, audio, images, interactive elements and quizzes.

By completing the survey you will have a chance to receive a free 1-year subscription to AMOSS/AMOS – 5 to be won!




Case-sensitive google search (a dodgy kludge of minimal utility)

I wanted to search for MET (to find out what the hell it is in the context of something I am editing). Clearly, with such a common set of 3 letters, a case-sensitive search would be desirable.

Now, one way would be to scrape the results into a file and then grep it.

A couple of things: one wants more than 10 results, and wants to stick the results in a text file.

The latter can be done using w3m, viz.:

$ w3m -dump "" > dumpfile

(this dumps the rendered page as text — can also dump the source, using -dump_source)

Now, how to get more than 10 results?


$ w3m -dump " > dumpfile

Now I can search through dumpfile for the exact match.
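The search itself can be as simple as grep, which is case-sensitive unless told otherwise. The sketch below runs on a made-up three-line stand-in for dumpfile (sample-dump.txt is an invented name) and adds -w so that only whole-word matches count.

```shell
# A made-up stand-in for dumpfile.
cat > sample-dump.txt <<'EOF'
Motivational enhancement therapy (MET) is a type of brief therapy.
We met outside the METRO station.
The Met Office issued a forecast.
EOF

# grep is case-sensitive unless given -i; -w keeps only whole-word matches,
# so "met", "Met" and "METRO" are all skipped.
grep -w 'MET' sample-dump.txt
```

Only the first line is printed. Without -w, the METRO line would be matched as well.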

This can be elaborated (see here for an outline of the google search URL; for example, adding &cr=countryAU to the end finds sites from Australia). I can force pages to have met (MET, Met etc) by quoting it (and I’ve put singles around the outside, though no quote marks would probably work fine):

$ w3m -dump '"MET"+gambling+training&num=100' | grep MET
["MET" gambling train]
Refresh (0 sec) /search?q=%22MET%22+gambling+training&num=100&ie=UTF-8&gbv=1&
... Interventions Motivational enhancement therapy (MET) is another
limits on ..... motivational enhancement therapies (MET), Gamblers Anonymous.
cognitive- behavioral therapy for reducing gambling (MET+CBT). .... They
MET. Motivational Enhancement Therapy. MI. Motivational Interviewing ....
efficacy of three brief interventions (10 minute brief advice, 1-session MET,
Motivational enhancement therapy (MET) is a type of brief therapy that
therapy (MET) followed by additional cognitive behavioral intervention sessions

So MET might mean ‘Motivational enhancement therapy’ (still not sure).

Unfortunately, the resulting output does not always provide the URL to go to in a usable form, so the output of this search sometimes is only useful as input to a second, conventional search that finds the page using some of the other terms (eg you could search for “therapy (MET) followed by additional cognitive behavioral intervention sessions” to bring up the last page found by grep).

Not a magic bullet, but a useful tool.