Using sed to crop bits of a PostScript file (Lord knows why)

I wanted to crop the exact same region out of a lot (over 1000) of ps files. I’m sure there are better ways to do this (called ‘pscrop’, probably), but for some reason I did it as outlined below. First I converted all my ps to eps:

$ cat auto/pstoeps.sh
for f in *.ps
do
    # show progress, then convert (quoted so odd filenames survive)
    ls "$f"
    ps2epsi "$f"
done

then I created a little ‘new header’ with the bounding box I wanted (got bounding box coords from opening the file in gv):

$ cat auto/header.txt
%!PS-Adobe-2.0 EPSF-1.2
%%BoundingBox: 370 263 420 319
%%HiResBoundingBox: 370.0 263.0 420.0 319.0
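
If more than one region is wanted, the header could be generated rather than typed by hand. A minimal sketch, assuming the coordinates are integers read off gv; `make_header` is a made-up helper name, not part of any tool:

```shell
# make_header LLX LLY URX URY
# Hypothetical helper: prints a three-line header like auto/header.txt.
# Assumes the four coordinates are integers, in PostScript points.
make_header() {
    printf '%%!PS-Adobe-2.0 EPSF-1.2\n'
    printf '%%%%BoundingBox: %d %d %d %d\n' "$1" "$2" "$3" "$4"
    # LC_ALL=C so the decimal separator is a dot regardless of locale
    LC_ALL=C printf '%%%%HiResBoundingBox: %.1f %.1f %.1f %.1f\n' "$1" "$2" "$3" "$4"
}
```

Something like `make_header 370 263 420 319 > header.txt` should then reproduce the header shown above.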

Then I used sed to remove the old header line and bounding box lines from all the many eps files, and put the new header at the top of each one:

$ cat auto/extract.sh
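for f in *.epsi
do
    ls "$f"
    # delete the old bounding box and header lines (in place!)
    sed -i '/Bounding/d' "$f"
    sed -i '/EPSF/d' "$f"
    # put the new header on top and save under a new name
    cat header.txt "$f" > "extract_$f"
done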

(Note that this screws up the original eps(i) files, since ‘-i’ overwrites. Also, it deletes all lines with ‘Bounding’ or ‘EPSF’ in them; I checked that the ones I wanted to remove were the only ones in the file that had that text, but another file might coincidentally have that text in an important line, so care is required!)
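
That check can itself be scripted before running anything destructive. A rough sketch, assuming the only lines that ought to match are the `%!PS-Adobe ... EPSF` line and the two bounding box comments; `unexpected_matches` is a made-up name:

```shell
# unexpected_matches FILE
# Prints (with line numbers) every line that the loose patterns 'Bounding'
# or 'EPSF' would delete but that is NOT one of the three expected header lines.
unexpected_matches() {
    grep -n -e 'Bounding' -e 'EPSF' "$1" |
        grep -v -E '^[0-9]+:(%!PS-Adobe.*EPSF|%%(HiRes)?BoundingBox:)' || true
}

for f in *.epsi
do
    [ -e "$f" ] || continue      # no .epsi files here: nothing to check
    bad=$(unexpected_matches "$f")
    if [ -n "$bad" ]; then
        echo "$f has extra matching lines:"
        echo "$bad"
    fi
done
```

If this prints nothing, the two loose sed patterns should only be touching the lines they were meant to.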

so the top of the epsi file went from this:

%!PS-Adobe-2.0 EPSF-1.2
%%Title: 0.02_p02000.ps
%%Creator: Ghostscript ps2epsi from 0.02_p02000.ps
%%CreationDate: Oct 26 11:42
%%For:dgoossensdgoossens dgoossens
%%Pages: 1
%%DocumentFonts: Courier
%%BoundingBox: 5 6 566 565
%%HiResBoundingBox: 5.460047 6.860039 565.739983 564.339991
%%EndComments

%%BeginProlog

to this:

%!PS-Adobe-2.0 EPSF-1.2
%%BoundingBox: 370 263 420 319
%%HiResBoundingBox: 370.0 263.0 420.0 319.0
%%Title: 0.02_p02000.ps
%%Creator: Ghostscript ps2epsi from 0.02_p02000.ps
%%CreationDate: Oct 26 11:42
%%For:dgoossensdgoossens dgoossens
%%Pages: 1
%%DocumentFonts: Courier
%%EndComments

%%BeginProlog

and rather than get the whole file, when I gv it I get a little window…

From this.

To this.

So yes, I’m sure there are better ways to do this, but this works pretty well. I can then use various tools to convert the eps file to some other format. All good. The handy thing about this method is that you can fiddle with other bits of the file, like the metadata, duplexing instructions, font commands, and so on. With a small change to the sed commands (dropping ‘-i’ and redirecting the output), it is possible to create a new eps file rather than overwriting the original, so that multiple bits could be cut from one file. And of course the whole point is that this is scriptable, so I can do oodles of files. FWIW.
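
One way the non-overwriting variant might look, as a sketch only: sed runs without ‘-i’ and writes to standard output, and the patterns are anchored to the start of the line so they are less likely to hit something important. The `recrop` function and the `header_*.txt` naming (one header file per region) are assumptions for illustration, not part of the scripts above.

```shell
# recrop HEADER INPUT OUTPUT
# Writes HEADER followed by INPUT minus its old header lines to OUTPUT.
# INPUT is left untouched because sed is not given -i.
recrop() {
    {
        cat "$1"
        sed -e '/^%!PS-Adobe/d' \
            -e '/^%%BoundingBox:/d' \
            -e '/^%%HiResBoundingBox:/d' "$2"
    } > "$3"
}

# One output per (header, file) pair; header_*.txt is a hypothetical naming scheme.
for h in header_*.txt
do
    [ -e "$h" ] || continue
    crop=${h#header_}; crop=${crop%.txt}
    for f in *.epsi
    do
        [ -e "$f" ] || continue
        case $f in extract_*) continue ;; esac   # skip outputs of earlier passes
        recrop "$h" "$f" "extract_${crop}_$f"
    done
done
```

With, say, `header_left.txt` and `header_right.txt` in the directory, each epsi file would yield `extract_left_...` and `extract_right_...` copies while the original stays as it was.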

Avec.


About Darren

I'm a scientist by training, based in Australia.
