Xpdf provides pdffonts.
pdfinfo will tell you how many pages the file has; say it is 20. Then, at the Linux/Cygwin prompt, to check for Times (which, in this case, should not be there!):
$ for f in {1..20} ; do echo Page $f ; pdffonts -f $f -l $f mypdffile.pdf 2> /dev/null | grep Times ; done
The 2> /dev/null gets rid of warnings, such as a missing font weight. -f is the first page and -l is the last page; they are the same because we are stepping through one page at a time.
My file should not have any Times New Roman in it, but when there is a wrong font, this is often the one (because it is so commonly the default).
I can run this, and immediately see that I have TNR on pages 19, 8 and 7.
Now, the text may not be visible (PDFs are replete with invisible text, especially if they have images in them), but it helps me find out where to look.
It would be easy enough to write a script to work out the number of pages and feed it into this line; you could have something that takes the PDF name and the font you are looking for.
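As a rough sketch of that idea, the function below works out the page count itself and prints only the page numbers where the pattern turns up, rather than echoing every page. The name fontpages and its variable names are my own invention, and it assumes pdfinfo prints a line beginning "Pages:" followed by the count.

```bash
# fontpages FILE PATTERN
# Print the number of each page whose font list matches PATTERN.
# (fontpages is a hypothetical name; pdfinfo and pdffonts come from Xpdf.)
# The body is in ( ) so it runs in a subshell and its variables stay local.
fontpages() (
  file=$1
  pattern=$2
  # Assumes pdfinfo prints a line such as "Pages:          20".
  # tr removes a stray carriage return from Windows-built binaries.
  pages=$(pdfinfo "$file" 2> /dev/null | awk '/^Pages:/ {print $2}' | tr -d '\r')
  if [ -z "$pages" ]; then
    echo "fontpages: no page count found for $file" >&2
    return 1
  fi
  page=1
  while [ "$page" -le "$pages" ]; do
    # grep -q prints nothing; only its exit status is used.
    if pdffonts -f "$page" -l "$page" "$file" 2> /dev/null | grep -q -- "$pattern"; then
      echo "$page"
    fi
    page=$((page + 1))
  done
)
```

Something like fontpages mypdffile.pdf Times would then list just the offending pages, one per line.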
To list the page count of each file:
$ for f in *.pdf ; do echo -n "$f"...." " ; pdfinfo "$f" | grep Pages ; done
For a quick and very dirty solution, if your longest file is 20 pages, then:
$ for g in *.pdf ; do echo "$g" ; for f in {1..20} ; do echo Page $f ; pdffonts -f $f -l $f "$g" 2> /dev/null | grep Times ; done ; done
If you have the needed programs installed, you might use a script:
#! /usr/bin/bash
## Check fonts in a pdf file.
## v 1 22 Oct 22
## filefont -h gives help, but so does just looking at the script.
while getopts ":h" option ; do
  case $option in
    h)
      echo
      echo Usage is simple and limited:
      echo
      echo filefont filename.pdf fontpattern
      echo
      echo where fontpattern might be Times, say.
      echo
      echo Search is case sensitive unless you put -i in front of grep within this script.
      echo "(That is, you edit the script.)"
      echo "(Or you could just search for imes or oman or talic and skip the first letter...)"
      echo
      exit;;
  esac
done
FILE=$1
echo "$FILE"
## I use dos2unix because I am running Cygwin and I grab binaries from all over the place.
PAGES=$(pdfinfo "$FILE" | grep Pages | cut -d ':' -f 2 | dos2unix)
PAGE=1
while [[ $PAGE -le $PAGES ]]
do
  echo Page $PAGE
  pdffonts -f $PAGE -l $PAGE "$FILE" 2> /dev/null | grep "$2"
  ((PAGE=PAGE+1))
done
This just shows a few things you can do in bash, as well.
The getopts bit just sees if the user has passed a -h option to the script, and prints out some help. More for my own amusement than anything else.
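If editing the script to get a case-insensitive search feels clumsy, getopts can take a second option letter. This is only a sketch of how that might look: parse_args, ignore_case, file and pattern are names I made up, and the -i flag is my addition, not part of the script above.

```bash
# parse_args [-h] [-i] FILE PATTERN
# Sets the (hypothetical) variables ignore_case, file and pattern.
parse_args() {
  ignore_case=""
  OPTIND=1                       # reset so getopts starts from the first argument
  while getopts ":hi" option; do
    case $option in
      h) echo "Usage: filefont [-i] filename.pdf fontpattern"
         exit 0 ;;
      i) ignore_case="-i" ;;     # later handed straight to grep
      *) echo "Unknown option: -$OPTARG" >&2
         return 1 ;;
    esac
  done
  shift $((OPTIND - 1))          # drop the options getopts has consumed
  file=$1
  pattern=$2
}
```

After parse_args "$@", the grep at the end of the loop could become grep $ignore_case -- "$pattern". The variable is left unquoted on purpose, so that it vanishes when it is empty.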
PAGES=$(...) is command substitution: it captures the output of the pipeline inside the parentheses and puts it into the variable PAGES.
We run pdfinfo, use grep and cut to isolate the page count, then run that through dos2unix in case the string has the wrong line ending.
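One weakness of a plain grep Pages is that it also matches any other line containing that word, such as a document title. A slightly more careful version, sketched here with awk, anchors the match to the start of the line and strips the carriage return itself, so dos2unix is not needed. The name page_count is mine, and it again assumes the count sits on a line starting with "Pages:".

```bash
# page_count: read pdfinfo output on stdin and print the page count.
# Exits non-zero if no line starting with "Pages:" is found.
# (page_count is a hypothetical helper name.)
page_count() {
  awk -F: '
    /^Pages:/ {
      gsub(/[ \t\r]/, "", $2)    # strip padding and any trailing CR
      print $2
      found = 1
      exit                       # first match is enough; END still runs
    }
    END { exit (found ? 0 : 1) }
  '
}
```

In the script this might be used as PAGES=$(pdfinfo "$FILE" | page_count).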
We then loop over the file page by page, checking for the bit of text in the second command line argument.
Crude, I know. But handy for checking hundreds of pages or many, many documents. So far it has been more than useful. Even if such a thing takes time to write and debug, it finds instances that I had missed by eye.