Archive by Author | Darren

Stop Word opening on startup

I have a laptop. Windows 10. I have a paid-for Office 2010 on it, but Windows insists on opening Office 365 (or whatever it is) on startup and asking me to log in to my Microsoft account and register the product.

Task manager → Startup does not show it.

Startup folder in the start menu is empty.

Ran msconfig and, under Services, disabled Microsoft Office Click-to-Run Service and Microsoft Account Sign-in Assistant.


Yep, that seemed to work.

Successful science writing and editing at Biotext

On 13 Nov 2019, we (that is to say Biotext, where I work) will be running our popular course about writing reports, theses and other documents with technical and scientific content. It will take place in our Bruce offices (that is, in Canberra) from 0930 to 1430.

To register or just have a look, you can go to EventBrite.

Click for event brite page

To get a look at the course outline, click on this thumbnail.

Successful science writing course outline

(Main content repeated in text at end of posting.)

Contact me if it’s of any interest. This is a commercial product, and I would not normally post on such a topic, but we’re a small business and we get our message out how we can.




Course topics

Common problems – looks at examples of typical science writing from a range of sources.

Where to start – outlines a set of questions to ask when first working on a document.

Writing clearly and succinctly – looks at how to avoid pitfalls of scientific writing such as overuse of jargon, passive voice and weak verbs and nouns.

Improving documents through editing – explains where substantive editing fits in the process of producing a document, and covers the different aspects of this stage, such as overall structure, content and logical flow.

‘Bare bones approach’ – outlines how to use a simple checklist to determine what level of editing a document requires. The checklist is useful in assessing your own work or in giving feedback to others.

My ORCID QR code

Never used one myself, but here it is! ORCID is what it says here.

QR code for my ORCID account


Whatever, thumbnail size.



Me there

Build Alpine email client on and for Cygwin

The main website is at:

Releases are at:

And as of today (August 2019), this gets the current version of the source (you can just go there in a browser if you prefer):

$ wget

And build instructions are at:

But, for cutting edge, we can always explore the git repository:

$ git clone
$ cd alpine
$ ./configure --help
$ ./configure --with-passfile=.pine-passfile

But it throws an error:

checking Openssl library version >= 1.0.0c... configure: error: Install openssl version >= 1.0.0c

OK, install some extra development packages, ca-certificates, etc. What’s missing will depend on what you’ve got installed (this is just my experience), and the Cygwin package search page can be helpful here. Use either apt-cyg or Cygwin’s setup.exe program (I found apt-cyg worked better):

$ apt-cyg install libssl1.0-devel openssl-devel
$ ./configure --with-passfile=.pine-passfile
$ make

Note that openssl-devel is obsolete, so you may need to uncheck the ‘hide obsolete’ box in setup.exe and try version 1.0.2 or whatever it is.

But there’s another error, this time when make-ing:

sdep.c:43:10: fatal error: crypt.h: No such file or ...

So install some more development libraries — I am not clever, so I just did all the ones with ‘crypt-devel’ in the name:

$ apt-cyg install libcrypt-devel libgcrypt-devel libmcrypt-devel
$ make
$ make install
$ alpine
screen shot

Alpine running on Cygwin in MATE terminal

Then we configurate it.


Install Julia on Cygwin — use minGW

The Julia language is interesting: interactive yet fast, apparently. Here is how I installed it from source on Cygwin.

$ git clone

(around 170 MB)

Used setup-XXX.exe or apt-cyg to install dependencies; I had to add patch. Then:

$ cd julia
$ make

... some output to screen ...

Makefile:242: *** can't find lib.dll. Stop.

No luck. I noticed that the install looks for minGW compilers, so I tried installing the mingw toolchain, including the minGW Fortran 7.4 packages. Here are the minGW packages I installed:

$ apt-cyg list mingw

I don’t know if all these are needed — I doubt it (probably don’t need the i686 toolchain) — but I’m not looking for a minimal solution, just one that works, so I hit it with everything. Then:

$ make clean
$ make
$ make install

Note that the make-ing includes some downloading of extra material, so you ought to be online while doing it.

$ grep http Makefile
	@echo 10. Upload to AWS, update and links
	../contrib/windows/ \
	../contrib/windows/ \

Anyway, that seemed to work! The install did not put Julia in the path. Probably I needed to give it a prefix when running make. (Or I can just use it from here, or add the path to the julia.bat file to the path, or make a link, or … Anyway, that is left as an exercise for the reader.)
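One of the options mentioned above, adding the directory to the path, might look something like this. It is only a sketch: the function name path_add and the example directory are my own placeholders, not anything the Julia build provides, and the real location depends on where the build put the julia executable.

```shell
# path_add DIR: append DIR to PATH unless it is already listed.
# (Hypothetical helper, not part of Julia or Cygwin.)
path_add() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on the PATH, nothing to do
        *) PATH="$PATH:$1" ;;        # otherwise append it
    esac
}

# Example (hypothetical location; substitute wherever the build put julia):
# path_add "$HOME/installs/julia/usr/bin"
```

Putting the function and the call in ~/.bashrc would make the change stick across sessions; checking for the directory first avoids growing PATH every time the file is sourced.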

screenshot showing a simple interactive julia session

Julia in MATE-terminal on Cygwin

Or, since Cygwin runs on Windows, you could just install the Windows native package. Much easier!

Word count a PDF: some tests

OK, I took a typical image + text PDF and used Acrobat Reader (not Pro — v. 2019.012.20035) to save as text, then quickly removed elements that might fool a word counter (just by opening it in Vim and doing some searching and replacing) and then got a word count (using command line program wc). It was around 17600 — a large document, so good stats.

Now, for something more automatic.

Word count of the Acrobat output without editing (from wc) was 17700 — pretty close.

Same document imported into Word and counted was 17700 (in fact, both wc and Word gave 17702).

If I want to use Linux or Cygwin via the command line, I can run pdftotext and count the words in the output; I get 18100 words. User ‘math’ suggests something a little more interesting.

First, get a list of unique ‘real’ words in the document:

$ pdftotext foo.pdf - | tr " " "\n" | sort | uniq | grep "^[A-Za-z]" > words

Then, check the document for each of these words, and count up how many real words are in the document:

$ pdftotext foo.pdf - | tr " " "\n" | grep -Ff words | wc -l

Result: 17400.
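The two steps can be wrapped into one function. This is a sketch under a few assumptions: the input is a plain-text file already produced by pdftotext (so the function can be tried without a PDF to hand), the name count_listed_words is made up for illustration, and it splits on any whitespace rather than only on spaces, so the figures may differ slightly from those above.

```shell
# count_listed_words FILE
# FILE is assumed to be plain text already extracted from the PDF
# (for example, saved output from pdftotext). The function name and the
# temporary word-list file are illustrative only.
count_listed_words() {
    wordlist=$(mktemp) || return 1

    # Step 1: one token per line, keep unique tokens that start with a letter.
    tr -s '[:space:]' '\n' < "$1" | sort -u | grep '^[A-Za-z]' > "$wordlist"

    # Step 2: count the tokens that contain any entry from the word list.
    # grep -F matches fixed substrings, so a token need only contain a
    # listed word, not equal it.
    tr -s '[:space:]' '\n' < "$1" | grep -cFf "$wordlist"

    rm -f "$wordlist"
}
```

Usage would be along the lines of pdftotext foo.pdf foo.txt followed by count_listed_words foo.txt. Because the matching is by substring, a token such as ‘2nd’ is counted if ‘nd’ happens to appear in the list, which is one reason the result is an estimate rather than an exact count.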

There’s a question here of fitness for purpose. From the point of view of an editor quoting on an edit, I don’t care if the element begins with a number or not — I have to check everything, whether it’s a ‘real’ word or a postcode. As a result the 17700 results seem closer to the mark. If I let a ‘word’ start with a numeral:

$ pdftotext.exe mydoc.pdf - | tr '[:punct:]' "\n" | tr " " "\n" | sort | uniq | grep "^[0-9A-Za-z]" > words
$ pdftotext.exe mydoc.pdf - | tr " " "\n" | grep -Ff words | wc

I get 17800. Not too bad.

I can try pdfminer:

$ cd installs/
$ mkdir pdfminer
$ cd pdfminer
$ mv ~/Downloads/pdfminer-20140328.tar.gz .
$ tar xvzf pdfminer-20140328.tar.gz
$ cd pdfminer-20140328
$ python install

$ mydoc.pdf | wc
4220 18124 127383

So, bigger again. But, if I apply the same (or similar — I did some experiments) corrections as to pdftotext:

$ mydoc.pdf | tr " " "\n" | sort | uniq | grep "^[0-9A-Za-z]" > words
$ mydoc.pdf | tr " " "\n" | grep -Ff words | wc
17785 17785 120154

I get 17800 again. If I manually remove from the ‘words’ list anything that looks like a page number (ie a 1 or 2 digit number with no punctuation, 69 or smaller since it is a 69 page document), and rerun the word count, I get 17700, which is the figure I am taking as correct. This suggests that leaving in the numerical terms is important if there are lots of numbers in a document, while leaving them out is better if the only numbers are page numbers. pdfminer gives 17400 if I only search for [A-Za-z] and don’t edit out page numbers, so much the same as pdftotext.
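The manual removal of page numbers could probably be automated. The sketch below assumes the word list has one entry per line, as in the ‘words’ file above; the function name drop_page_numbers is invented for this example. It drops any entry that is nothing but digits and lies between 1 and the page count, which for a 69 page document is the same rule as ‘1 or 2 digit number, 69 or smaller’.

```shell
# drop_page_numbers MAXPAGE
# Reads a word list on stdin (one entry per line) and writes it to stdout
# without bare integers between 1 and MAXPAGE. Entries with any letters or
# punctuation are left alone. (Hypothetical helper, not from any package.)
drop_page_numbers() {
    awk -v max="$1" '!($0 ~ /^[0-9]+$/ && $0 + 0 >= 1 && $0 + 0 <= max)'
}
```

Something like drop_page_numbers 69 < words > words.nopages would then give a list to pass to grep -Ff in place of the original. It cannot tell a page number from a genuine small number in the text, so it suits documents where page numbers are the only bare numbers.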

In conclusion, extracting using Acrobat Reader gave a word count about 100 larger (~0.6%) than what I got when I manually tidied up the document. The steps suggested by user ‘math’ came out about 200 low, but can be completely automated. If I allow ‘words’ to begin with digits (which is reasonable from the point of view of finding how many fields must be edited), the ‘math’ method gives me a number about 200 high, that is, about 1.1%, which seems fine. If I went to the trouble of removing ‘words’ from the word list that could match page numbers, I might expect to get something closer.
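The percentages are simple to recompute. The sketch below takes the manually tidied count (17600) as the baseline; pct_diff is a name made up for this example, and awk is used only because shell arithmetic is integer-only.

```shell
# pct_diff COUNT BASELINE: print how far COUNT is from BASELINE, as a
# percentage of BASELINE, to one decimal place.
pct_diff() {
    awk -v count="$1" -v base="$2" 'BEGIN { printf "%.1f\n", (count - base) / base * 100 }'
}

pct_diff 17700 17600   # Acrobat Reader text, unedited
pct_diff 17400 17600   # word list restricted to letters
pct_diff 17800 17600   # word list allowing leading digits
```

With the rounded counts from this post, the three calls print 0.6, -1.1 and 1.1.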

Caveat: I have not tested this on many documents, so I don’t know how robust it is.



Installing MATE-terminal on Cygwin

First, I should say I have been using the excellent apt-cyg, which I think is no longer maintained (or is it?) but is stable and pretty reliable.

$ apt-cyg --version
apt-cyg version 1
The MIT License (MIT)
Copyright (c) 2005-9 Stephen Jungels

(In fact, the status of apt-cyg seems a little unclear — is it this apt-cyg? See here as well.)

$ apt-cyg install mate-terminal

And the install works fine. But:

$ mate-terminal
** (mate-terminal:1413): WARNING **: Error retrieving accessibility bus address: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
(mate-terminal:1413): GLib-GIO-ERROR **: Settings schema '' is not installed

Now, the first one is not a problem, but the second one is fatal. But I’ve seen that message somewhere before … so do this:

$ glib-compile-schemas /usr/share/glib-2.0/schemas/
Warning: Schema “org.gnome.crypto.cache” has path “/desktop/gnome/crypto/cache/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.crypto.pgp” has path “/desktop/gnome/crypto/pgp/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.system.locale” has path “/system/locale/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.system.proxy” has path “/system/proxy/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.system.proxy.http” has path “/system/proxy/http/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.system.proxy.https” has path “/system/proxy/https/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.system.proxy.ftp” has path “/system/proxy/ftp/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.
Warning: Schema “org.gnome.system.proxy.socks” has path “/system/proxy/socks/”. Paths starting with “/apps/”, “/desktop/” or “/system/” are deprecated.

And now it works …

screen shot of a MATE terminal window




Talking to the EP-44

Talking to the EP44 from a computer is dead easy. First, I wired up a null-modem cable (I ordered one, but it turned out to be a basic extension cable with the wires not crossed over, so I cut it in the middle and crossed them over myself).

Then I attached the cable to the EP-44 and to the serial port on the back of the computer (note, I used a real, hardware serial port for this, not a USB-to-serial converter).

Turned on the EP-44 and set it to terminal and to 1200 baud, 8 bit data.

On the computer, used stty to set /dev/ttyS0 to 1200 baud.

$ stty -F /dev/ttyS0 1200

$ stty -F /dev/ttyS0 -a

speed 1200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;
-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff
-iuclc -ixany -imaxbel -iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
echoctl echoke -flusho -extproc

$ sudo usermod -a -G dialout username

Then typed:

$ cat /dev/ttyS0 &

$ cat > /dev/ttyS0

The first cat command takes anything from the serial port and puts it on the screen; in UNIX, everything is a file, including the serial device. The second command (not put in the background) sends whatever I type on the computer to the thermal printer. So one could use this a bit like the talk and ytalk programs on UNIX.

That’s it, it works.

screenshot of a terminal session

Talking to the EP-44: Text on the screen of the laptop

scan of the same text on the EP44 printout

The same text on the EP44 printout

Now, the same connection can be used to send the contents of the printer’s memory (about a page or two of text) to the computer screen. You just press the Text button on the EP44 and cat the port to a file ($ cat /dev/ttyS0 > filename.txt).

Note that this is different from actually using the printer as a terminal, in that I am not sending commands to the computer and getting back the output, though clearly that can be done, seeing as the characters are moving between the two devices.

The text below was typed on the EP and uploaded to the computer using cat.


The Brother EP-44 can be used as an editor and word processor. Its memory can hold about 3700 characters, which can be edited, modified and then either printed on the EP-44 or transferred via a null modem cable to a computer for printing, uploading or editing. The text you are reading now was written on the EP-44 without any of it being printed. Having said that, the 16-character screen of the EP-44 is not much good for editing. Indeed, if you get distracted and forget the start of the sentence, it can be all too easy to start one sentence and finish another.
I can check how many characters I have left by pressing CODE+r (‘REMAIN’). The manual, available online at gives details of the editing functions available. They are adequate, if not easy by modern standards. If you carried around enough equipment, it would be possible to send the text to an office over the phone lines via the RS232 port on the side, but it is hard to believe that anyone would bother. Even in the early 1980s, one of the little LCD notebook computers of the time, like the TRS-80 would be highly preferable (though three times the price). The main benefit of the EP lies in the attached printer, which means it is better used for input than output — input via the keyboard or via the serial line when using it as a printer. Speaking of input via the keyboard, I must say that the keyboard is surprisingly easy to use. It looks like a big calculator but types much better than that. Indeed, you can build up pretty high speed if you try. It works very well on the lap, and the keys are very reliable; you know when you’ve hit one, so you very rarely double hit or miss a character. The gaps between them help avoid hitting the wrong key, and mean that the overall dimensions are those of a full-sized device. Were I to make any change, I would put in a horizontal rather than vertical return key — why? — because I tend to hit it when I am looking for backspace.
One more comment. Although only showing 16 characters, the screen is surprisingly useful. It is big enough to show you the last word you typed, so you can quickly backspace over errors and fix them. If you are used to fixing errors on the spot rather than leaving them and going over the document later, then it is possible to make pretty clean copy without too much trouble.
As a final note, with this much text typed, I currently have 1280 characters left in the machine’s memory. Hence, we can see that the EP-44 (or EP44, depending on which documentation you read) can certainly provide enough space to write up a blog post of more than adequate length, especially when discussing a topic as boring and redundant as this one!
Now, at the end, let’s add in the non-ascii characters and see what can be downloaded over the serial line.
1234567890-=qwertyuiop asdfghjkl;’zxcvbnm,./
+ []{}<> $ \ ;’ #| &!@ *
OK, that’s done. Now we have 719 remaining.


  • The text above between the horizontal lines makes up about 550 words, so we can estimate something like 650 words as the limit. Compare the list of characters that made it over the serial line with the type specimen:

scan of the typeface -- it's quite nice

  • The text above was sent by hitting CODE+s then text. Note that nothing seems to happen, but tail -f on the file works well and shows it to be transferred. To empty the memory and prevent the content getting printed at an inconvenient time, the simplest thing to do is turn the machine off, then turn it on with the C key (red cancel key) held down; this returns it to factory settings, a bit of a nuclear option. It is not supposed to print what is sent to the computer, but it does, so maybe there’s a bug in mine, I don’t know, but this will do.
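The capacity estimate can be checked with a little shell arithmetic. This is a rough sketch using only the figures quoted in the post (about 3700 characters of memory, 719 remaining at the end, about 550 words typed); the variable names are mine, and since the inputs are approximate the answer is too.

```shell
# Figures quoted in the post.
total_chars=3700      # approximate memory capacity in characters
remaining_chars=719   # reported by CODE+r after the text was typed
words_typed=550       # approximate word count of the uploaded text

used_chars=$(( total_chars - remaining_chars ))

# Shell arithmetic is integer-only, so scale by 10 to keep one decimal place.
chars_per_word_x10=$(( used_chars * 10 / words_typed ))
capacity_words=$(( total_chars * words_typed / used_chars ))

echo "characters used:     $used_chars"
echo "characters per word: $(( chars_per_word_x10 / 10 )).$(( chars_per_word_x10 % 10 ))"
echo "estimated capacity:  $capacity_words words"
```

This comes out at 2981 characters used, about 5.4 characters per word (spaces and punctuation included) and roughly 680 words for a full memory, which is in the same region as the 650 or so estimated above.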

Over and out


I had a go at installing LOLCODE on Cygwin, though something similar will work anywhere Unix-y (eg use apt-get or yum or whatever instead of apt-cyg; use su or sudo if installing globally or if you need to install dependencies, which is noted below but not needed on Cygwin).

$ cd installs/
$ mkdir lol
$ cd lol
$ [sudo] apt-cyg[get] install cmake
$ git clone
$ cd lci/
$ chmod +x
$ [sudo] ./ -d -t
Running cmake with command:
"cmake ."
and writing results to configure.out.

Running make and writing results to make.out.

Building documentation and writing results to docs.out.

Installing and writing results to install.out.

Testing and writing results to test.out.

$ lci --help
Usage: lci [FILE] ...
Interpret FILE(s) as LOLCODE. Let FILE be '-' for stdin.
-h, --help output this help
-v, --version program version

username@SAJ-C-15 ~/installs/lol/lci
$ lci --version
lci v0.10.5
$ cd ../
$ pwd
$ vim
$ cat
#! /usr/local/bin/lci
HAI 1.3


$ chmod +x
$ ./

$ ls -lh $(which lci)
-rwxr-xr-x 1 username Domain Users 1.2M Jun 17 14:10 /usr/local/bin/lci

I am wondering if there’s a way to call (say) shell commands from within it. That would be handy.

It’s an interpreter.

Climate strike!

On September 20th, 2019, Biotext, where I work, is joining the climate strike.

I congratulate the Biotext management for their wisdom and foresight.