crgrep in 2022 — search for text in any file! (Windows) including Word, Excel and PDF

crgrep is a very useful utility that can be used to search through directories full of Word files, Excel files, PDFs and so on.

It is a little bit manual to install, but actually very easy, and well worthwhile once it is working. Regrettably, it is no longer maintained.

Download from https://sourceforge.net/projects/crgrep/

Extract the folder to (in my case) c:\users\username\installs, so now I have a folder called

C:\Users\username\installs\crgrep-1.0.5

Read the bloody instructions in the install instructions file:

C:\Users\username\installs\crgrep-1.0.5\INSTALL.txt

Add the bin folder to the path. To do this, type ‘env’ in the Windows search box, choose ‘Edit environment variables for your account’, select Path, click Edit and then New, put “C:\Users\username\installs\crgrep-1.0.5\bin” on the new line and click OK.
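If you want to confirm that the folder really ended up on the path, something like the following Python sketch can do it. The function name path_contains is my own invention for illustration, and the folder is the one from my setup above; it simply compares the semicolon-separated entries of Path, ignoring case and trailing backslashes.

```python
import ntpath
import os


def path_contains(path_value, folder):
    """Return True if folder is one of the ;-separated entries in path_value."""
    def norm(p):
        # Windows paths are case-insensitive; drop quotes and trailing slashes.
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    wanted = norm(folder)
    return any(norm(entry) == wanted
               for entry in path_value.split(";") if entry.strip())


if __name__ == "__main__":
    # Folder from this post; change it to wherever you extracted crgrep.
    crgrep_bin = r"C:\Users\username\installs\crgrep-1.0.5\bin"
    print(path_contains(os.environ.get("PATH", ""), crgrep_bin))
```

Note that a Command Prompt window that was already open will not see the change; open a new one before checking.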

Now I need to find Java. crgrep specifies Java 8 (jdk1.8.0_xx). I already have ImageJ installed, from the download labelled ‘ImageJ bundled with 64-bit Java 1.8.0_172’, and I put it in my local installs directory, so I have the java.exe file in:

C:\Users\username\installs\ImageJ\jre\bin

I can run it to see what the version is:

C:\Users\username\installs\ImageJ\jre\bin>java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

Looks perfect!
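For anyone who would rather check the version in a script than by eye, here is a minimal Python sketch. The function name java_major_version is hypothetical, and the sample text is just the output shown above. It relies on the fact that Java 8 and earlier report themselves as 1.x, so "1.8.0_172" means Java 8.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier use the 1.x scheme, e.g. "1.8.0_172" is Java 8.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


if __name__ == "__main__":
    sample = 'java version "1.8.0_172"\nJava(TM) SE Runtime Environment (build 1.8.0_172-b11)'
    print(java_major_version(sample))
```

If you capture the text from a script rather than pasting it in, bear in mind that java -version writes to standard error, not standard output.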

Add this to the environment. It is a bit like editing the path (above), but this time create a new environment variable:

JAVA_HOME=C:\Users\username\installs\ImageJ\jre
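Note that JAVA_HOME is the folder above bin, not the bin folder itself. The Python sketch below shows the relationship; java_exe_path is a name I made up, and it assumes crgrep's launcher follows the usual convention of looking for bin\java.exe under JAVA_HOME (I have not checked how the launcher actually does it).

```python
import ntpath
import os


def java_exe_path(java_home):
    """Build the path where java.exe is expected under a JAVA_HOME value."""
    # Tolerate surrounding quotes and a trailing slash in the variable.
    cleaned = java_home.strip().strip('"').rstrip("\\/")
    return ntpath.join(cleaned, "bin", "java.exe")


if __name__ == "__main__":
    # Assumption: the launcher uses JAVA_HOME\bin\java.exe.
    exe = java_exe_path(os.environ.get("JAVA_HOME", ""))
    print(exe, "exists" if os.path.isfile(exe) else "not found")
```

With the value above, this gives java.exe in C:\Users\username\installs\ImageJ\jre\bin, which is where I found it earlier.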

Do some tests

c:\> crgrep -help

OK, now for example, I want to search for ‘Fred Smith’ in all DOCX files in a tree of files starting at the current directory. I only want to look in files that have ‘final’ in the file name, and I want to find occurrences that have exactly one character between Fred and Smith, which need not be a space.

c:\etc\etc\etc> crgrep -r --colour=always "Fred?Smith" "**\*final*.docx"

What’s going on?

crgrep: the command

-r: recursively search subdirectories

--colour=always: turn occurrences of the desired pattern red in the output

“Fred?Smith”: the search pattern; ? means any one character, so this will find ‘Fred Smith’, ‘Fred-Smith’, ‘FredQSmith’ and so on. Note it is enclosed in quotes

“**\*final*.docx”: the pattern for the files to look in; ** means ‘dig through all subdirectories’, and *final*.docx means find all files that have file names of the form <some text>final<some text>.docx. Note it is enclosed in quotes
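To make the wildcard behaviour described above concrete, here is a small Python sketch that imitates it. This is not how crgrep is implemented; the function names are my own, the search pattern is translated into a regular expression by hand, and the file name check uses Python's fnmatch module, which happens to use the same ? and * conventions. It is case-sensitive and does not attempt the ** directory part.

```python
import fnmatch
import re


def wildcard_to_regex(pattern):
    """Translate ? (any one character) and * (any run of characters) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def find_matches(text, pattern):
    """Return every substring of text that matches the wildcard pattern."""
    return wildcard_to_regex(pattern).findall(text)


def file_name_matches(name, pattern="*final*.docx"):
    """Check a bare file name against a wildcard pattern (case-sensitive)."""
    return fnmatch.fnmatchcase(name, pattern)


if __name__ == "__main__":
    print(find_matches("Fred Smith, Fred-Smith and FredSmith", "Fred?Smith"))
    print(file_name_matches("report_final_v2.docx"))
```

In this sketch ‘FredSmith’ is not found, because ? must stand for exactly one character, while a file called just final.docx does match, because * can stand for no characters at all.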

Works a treat.

Author: Darren

I'm a scientist by training, currently working as a writer, trainer and editor.
