pandoc is a tool suite for converting between text formats. I find it useful as a writer (rather than a techie) to convert something written in plain text into an RTF file that I can read readily, with some formatting, in a word processor, up to and including the dreaded Word. What I outline here is a tiny fraction of what it can do.
It is as simple as:
$ pandoc -s file.txt -o file.rtf
(Replace ‘rtf’ with html etc to get something else.)
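To make the swap of output formats concrete, here is a minimal sketch of a shell function that derives the output name from the input name. The function name to_format is made up for illustration and is not part of pandoc; it uses only the -s and -o options shown above, and assumes the input file name has an extension.

```shell
# to_format FILE EXT: convert FILE with pandoc, writing FILE's name with .EXT
# (to_format is a hypothetical helper name, not a pandoc feature)
to_format() {
    src=$1
    ext=$2
    # ${src%.*} strips the last extension, so notes.txt becomes notes
    pandoc -s "$src" -o "${src%.*}.$ext"
}
```

With that defined, typing to_format file.txt html should write file.html, and to_format file.txt rtf should write file.rtf.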
It can convert a file of nothing but typing, but if you use a little simple formatting, pretty much a version of Markdown syntax, you can get a nice RTF with some useful extra features (italics, for example). For a heading, underline with equals signs or hyphens on the very next line:
Heading here
------------

Heading here
============
When processed, each of these gives a heading. (Though for proper H1, H2 etc, see below.)
Paragraphing is determined by empty lines, so this text gives one paragraph in the RTF, and…

…this text gives a second paragraph in the RTF.
If I want to force a return, I end a line with a backslash.
This will not work inside backticks (see below), which mark the verbatim environment, so if, for example, I want a bunch of stuff verbatim, without interparagraph spacing, I end each line with a backslash outside the backticks:
The word is the character used to delimit the modified text. The code, verbatim, is also given.
dollar (math, which does not translate into RTF)
Note that when using backticks (verbatim), the nesting of the instructions matters:
***back tick outside 3 asterisks***
3 asterisks outside back tick
2 asterisks outside back tick
1 asterisk outside back tick
(Note, line above (‘Other stuff’) was underlined with hyphens on the immediate next line.)
nonbreaking space – precede a space with a backslash: ‘\⌴’ (I’m using ⌴ for a space)
block quote: > (greater than introduces a block quote indented by 1 tab)
block quote, again: >> same, bigger indent
So, for example
>This text would appear as a block quote indented by 1 tab.
H1 (1 hash, #)
# Here is my heading 1
H2 (2 hashes, ##)
## Here is my heading 2
H3 (3 hashes, ###)
### Here is my heading 3
- bullet point (+ at start of line, then space)
- bullet point (- (hyphen) at start of line, then space)
- bullet point (* at start of line, then space)
1. Numbered list
2. Numbered list still
3. Numbered list again
(just start a line with a number and a stop)
– 2 hyphens for en rule
— 3 hyphens for em rule
Some formats (eg HTML) like metadata to be specified. Simplest is to put
at the very top of the file (replace ‘title’, ‘author’ and ‘date’ with your own text).
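A minimal sketch of such a header, assuming pandoc's percent-sign title block is what is meant (three lines at the top of the file, each starting with % and a space, for title, author and date). The function name add_title_block is made up for illustration.

```shell
# add_title_block FILE TITLE AUTHOR DATE
# Prints a pandoc-style title block, a blank line, then the text of FILE.
# (add_title_block is a hypothetical helper; %% prints a literal percent sign)
add_title_block() {
    printf '%% %s\n%% %s\n%% %s\n\n' "$2" "$3" "$4"
    cat "$1"
}
```

For example, add_title_block file.txt 'My title' 'A. Writer' '1 May 2020' > titled.txt should give a file that pandoc -s titled.txt -o file.html can pick the metadata up from.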
The GNU program diction searches text files for known poor constructions, based on a database. (See also GNU style.)
I installed it from GetGnuWin32.
$ diction -s makeemf.txt
makeemf.txt:5: I have all the poppler stuff installed, [so -> (do not use as intensifier)]:
makeemf.txt:13: I use pdfcrop from TeXLive pdfcrop --margins "-52 -250 -20 -25" page4.pdf page4-crop.pdf [So -> (do not use as intensifier)] now I have my PDF.
makeemf.txt:17: [Can -> (do not confuse with "may")] I import it into Word?

3 phrases in 14 sentences found.
The things to be checked are enclosed in square brackets. The -s causes diction to provide suggestions.
Tools to be used along with it might include unrtf, unhtml, libreoffice command line, and other tools that convert marked-up and formatted files to plain text.
Look here for interesting examples of how to make a custom style to find things in your documents: https://mrsatterly.com/diction.html
The diction databases are, in my case, stored in:
I imagine on a Linux system they’d be in /usr/share, but I have not checked. Anyway, typically a file looks like this:
$ head /cygdrive/c/Users/darren/installs/getgnuwin32/GetGnuWin32/gnuwin32/share/diction/en
a considerable amount of      much
a large number of             many
a lot of                      Often obsolete, should sometimes be replaced by "many"
a majority of                 most
a man who
a matter of concern           (cliche, avoid)
a need for                    need
a number of                   many, several
a particular preference for
a small number of             few
(and so on)
Note that the left is the phrase to check for — including any leading spaces — and then after a tab (must be a tab to distinguish it from spaces within the phrase) comes the suggestion, if there is one. Very simple!
You could easily make up a diction file to look for your pet hates or common errors.
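As a sketch of what that might look like, the function below appends one entry to a personal database, using printf so that the separator really is a tab. The function name add_pet_hate and the file name mydiction are made up for illustration. GNU diction's man page lists -f for reading an extra database from a file, but check your own version.

```shell
# add_pet_hate DBFILE PHRASE [SUGGESTION]
# Appends one entry to DBFILE: the phrase, then (if given) a tab and the suggestion.
# Any leading space you want matched must be part of PHRASE itself.
# (add_pet_hate is a hypothetical helper name.)
add_pet_hate() {
    if [ -n "${3:-}" ]; then
        printf '%s\t%s\n' "$2" "$3" >> "$1"
    else
        printf '%s\n' "$2" >> "$1"
    fi
}
```

For example, add_pet_hate mydiction ' utilise' 'use' adds one line, and then something like diction -s -f mydiction draft.txt should check draft.txt against it as well as the default database.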
The open source office suite, LibreOffice, is not only a very fine piece of software, it has some capabilities that leave many commercial products in the shade. One of them is a very powerful command line. It can do a lot of stuff without ever loading up the GUI. That makes it useful for batch processing of files and also for file format conversions.
For example, imagine you want to convert a Word (docx) file to PDF. You can open it in Word or LibreOffice and save to PDF. Or, at least on Linux (but Windows can do the same if you put the LibreOffice executables into your PATH), you can type:
$ libreoffice --convert-to pdf test.docx
(The $ is the prompt.) This will produce test.pdf, which you can then view or use as you see fit.
I have recently been playing around on Linux without the GUI (because I have an old netbook that gets a bit slow when lots of GUI stuff is loaded up).
Say I want to view a Word (or LibreOffice, or … whatever) document. My first port of call is a command like that above. Then I can type:
$ fbgs -xxl test.pdf
And I can view the PDF in the framebuffer: no X windows, no GUI, no lag.
But of course this is really a corner case. Such a conversion is most likely to be useful when you want to convert a lot of files from one format to another. Opening them in the GUI and one by one saving would be a real pain. In Linux you could just type:
$ for f in *.docx ; do libreoffice --convert-to pdf "$f" ; done
and it would convert all the docx files in the directory.
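A slightly more defensive version of that loop, as a sketch: it quotes file names (so names with spaces survive), does nothing if there are no .docx files, and passes --headless and --outdir, both of which are LibreOffice command-line options. The function name docx_to_pdf and the default output directory pdf are made up for illustration.

```shell
# docx_to_pdf [OUTDIR]: convert every .docx in the current directory to PDF.
# (docx_to_pdf and the default directory name 'pdf' are hypothetical.)
docx_to_pdf() {
    outdir=${1:-pdf}
    mkdir -p "$outdir"
    for f in ./*.docx; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        libreoffice --headless --convert-to pdf --outdir "$outdir" "$f"
    done
}
```

Typing docx_to_pdf converted should then leave the PDFs in a directory called converted, with the originals untouched.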
Now, of course, the quality of the conversion is limited by LibreOffice’s ability to interpret the file format of interest. But I have found it useful.
With the shift away from third-person passive to active first-person writing, a sentence like:
The boiling point of the compound was determined.
becomes:
We determined the boiling point of the compound.
Now, for science writing (ie popular science), this is fine. For scientific writing, I am not so sure.
What do I mean by scientific writing?
I mean writing that has the same qualities as science itself. This is different from writing about science itself, or even writing about the results of science in a non-scientific way. We might call these latter two things ‘science writing’.
Science, to me, ought to be:
- precise: even when uncertain, it should be precise about what is highly likely (given nothing can be ‘proven’), likely, possible or unlikely. In other words, science aims for precise and accurate results, but regardless of the precision and accuracy of the results is always precise about its degree of certainty
- dispassionate: the whole point of science is that what humans think is not relevant to the correctness of a theory (even though what humans think will have led to the theory). The validity of an idea is tested against the universe, not against what people think; an experiment is a means of testing whether an idea is consistent with the way the universe operates
In my mind, scientific writing has these same attributes.
With scientific knowledge, it does not matter who did the experiment, only that it was done well so that the experiment really does (within whatever limits) test the idea’s validity. Of course, some people have a wonderful ability to think of an experiment no-one else has ever done. But once it has been done and written up, if repeated it should yield the same results.
So I don’t like the ‘We’ in the sentence above. It puts the experimenter at the focus — after all, they are now the subject of the sentence — when previously the result (the boiling point of the compound) was the subject. And the boiling point is what matters, and what (if done well) ought to be a valuable result for others to draw on.
Personalising scientific results allows a theory or experiment to be discredited by discrediting the theorist or experimenter. It puts a scientific result closer to the plane of an opinion or ideology than it ought to be, so making it easier to argue away. Science is the opposite of ideology. Ideology is the use of a framework of ideas to make decisions for you en masse and so avoid having to think. I’m not saying scientists never do that (they are humans), but when they do they are not doing science.
If the sense that science is objective (as much as any human activity can be) was more prevalent in the wider world, it would be harder for (for example) climate change denialists to get traction. And I can’t help thinking that maybe that objectivity ought to be embedded in the language of science, and that if we take it out we’re implicitly signalling that science is something less important and useful and relevant and non-ignorable than it is.
So while I can understand the shift to the more active and immediate in writing, and I agree with it in most cases, I find myself not so in favour of it when talking about science and its outputs. (Having said that, I’ve written plenty of papers that use ‘we’, sometimes at the behest of a coauthor, sometimes because working around it was just so clunky and wordy; but always with a nagging dis-ease.)
I guess I’m just as inconsistent as anybody.
Soooo….. I typed a bunch of hyphens then hit Enter, and Word drew me a line. Fine, I wanted that. It separated what was done from what was ‘in process’. Now I want to get rid of the line.
Highlight and delete — no.
Highlight and use the border menu to choose none — no! I tried this suggestion, but without success.
I could move the line up and down, but I could not delete it. But then I could!
At this point, we can undo the line if we want to by clicking on the menu icon that comes up when Word automatically creates the line, and we could select Undo Border Line, Stop Automatically Creating Border Lines or Control AutoFormat Options. So at this point it is easy to remove the line.
But let’s say we want to keep the line for now and delete it later. We go on, typing some more text below the line. The autoformat menu icon disappears and does not come back.
Now, how can we get rid of the line? First, highlight it by keyboard or cursor.
Now, type Ctrl+Shift+N and the line goes away! This is the Word key binding for ‘Apply the Normal style’, which means you can get the same result by using the Styles pane or clicking Normal on the styles ribbon on the Home tab.
If the Normal style does not work, you can also try the ‘Clear All’ option at the top of the Styles pane.
From an old WordPerfect 5.1 document
This is a summary of part of the Basic guide to Pro-Cite, section ‘Creating a bibliography from a manuscript’.
1. Insert your citations in the WP document. Two ways:
- author/date, like (Smith, 1994) or (Smith & Jones 1987)
- record number, like (#324) where 324 is the record number in Pro-Cite.
2. Save the document, exit WP5.1.
3. Search for the citations in the document:
- open Pro-Cite and the Pro-Cite database containing the records
- in Main Menu, press M for search Manuscript
- press O for Options and set appropriately
- select all database records
- F to give the document File name (changing directory if need be), then Enter
- in Main Menu, press M for search Manuscript, then Enter.
4. Generate the reference list to insert into your document:
- go to Print menu and set your output format to WP5.1
- Print the reference list
- go to search Manuscript and ‘Clear manuscript order’
- close Pro-Cite.
5. Insert the list into your document:
- open your document in WP5.1, and move the cursor to where the list is to go
- Shift+F3 to switch to the second document
- open reference list, and copy it all
- switch back and paste it in.
At Biotext, where I work, we are developing e-learning modules about writing, editing and presenting information, and we need your help.
To make sure we design modules that are as useful and accessible as possible, we’d love to hear about your preferences. Please complete our brief survey to tell us how you might use the modules and what topics you would like to see.
The modules will be part of the Australian manual of scientific style (AMOSS), which is being expanded to include general content and will be renamed the Australian manual of style (AMOS). The modules will include text, video, audio, images, interactive elements and quizzes.
By completing the survey you will have a chance to receive a free 1-year subscription to AMOSS/AMOS – 5 to be won!
I wanted to search Google for MET (to find out what the hell it is in the context of something I am editing). Clearly, with such a common set of 3 letters, a case-sensitive search would be desirable.
Now, one way would be to scrape the results into a file and then grep it.
A couple of things: one wants more than 10 results, and wants to stick the results in a text file.
The latter can be done using w3m, viz.:
$ w3m -dump "https://www.google.com/search?q=Jane+Austen" > dumpfile
(this dumps the rendered page as text — can also dump the source, using -dump_source)
Now, how to get more than 10 results?
$ w3m -dump "https://www.google.com/search?q=Jane+Austen&num=100" > dumpfile
Now I can search through dumpfile for the exact match.
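For instance, since grep is case-sensitive by default, a whole-word search of the dump file might look like the sketch below. The -n (line numbers), -w (whole word) and -F (fixed string) options are supported by GNU and BSD grep; the function name find_exact is made up for illustration.

```shell
# find_exact TERM FILE: print numbered lines containing TERM as a whole word,
# matching case exactly (so MET matches, but met and METHOD do not).
# (find_exact is a hypothetical helper name.)
find_exact() {
    grep -n -w -F -- "$1" "$2"
}
```

So find_exact MET dumpfile should list only the lines where MET stands alone in capitals, including cases like (MET) where it is wrapped in punctuation.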
This can be elaborated (see here for an outline of the google search URL; for example, adding &cr=countryAU to the end finds sites from Australia). I can force pages to have met (MET, Met etc) by quoting it (and I’ve put singles around the outside, though no quote marks would probably work fine):
$ w3m -dump 'https://www.google.com/search?q="MET"+gambling+training&num=100' | grep MET
["MET" gambling train]
Refresh (0 sec) /search?q=%22MET%22+gambling+training&num=100&ie=UTF-8&gbv=1&
... Interventions Motivational enhancement therapy (MET) is another
limits on ..... motivational enhancement therapies (MET), Gamblers Anonymous.
cognitive- behavioral therapy for reducing gambling (MET+CBT). .... They
MET. Motivational Enhancement Therapy. MI. Motivational Interviewing ....
efficacy of three brief interventions (10 minute brief advice, 1-session MET,
Motivational enhancement therapy (MET) is a type of brief therapy that
therapy (MET) followed by additional cognitive behavioral intervention sessions
So MET might mean ‘Motivational enhancement therapy’ (still not sure).
Unfortunately, the resulting output does not always provide the URL to go to in a usable form, so the output of this search sometimes is only useful as input to a second, conventional search that finds the page using some of the other terms (eg you could search for “therapy (MET) followed by additional cognitive behavioral intervention sessions” to bring up the last page found by grep).
Not a magic bullet, but a useful tool.