This is really simple but what the hell, blog needs posts.
What are checksums? Basically, an algorithm computes a short, fixed-length code (a string of characters) from the bytes in a file, and ideally that code comes out different if even a tiny change is made to the file.
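To see the ‘tiny change’ part in action, here’s a quick sketch. It assumes GNU coreutils (md5sum and friends) are installed, and the two strings are made-up inputs rather than real files.

```shell
# Two made-up inputs that differ only in the last character.
a=$(printf 'hello world' | md5sum | cut -d' ' -f1)
b=$(printf 'hello worle' | md5sum | cut -d' ' -f1)

echo "$a"
echo "$b"

# Same length, but the two sums should not match.
if [ "$a" != "$b" ]; then echo "different"; fi
```

Swap md5sum for sha512sum and the same thing happens, just with longer strings.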
So, for example, a website that hosts files might provide the file (say an iso disk image) plus a checksum. We download the file, which might be gigabytes of data, then run the checksum program locally, and if the result matches the checksum the website posted, we know the file downloaded intact. It also has a security aspect: if someone messes with the file but does not change the posted checksum, there will be a mismatch. Of course, if they compromise the file and then hack the website and change the posted checksum too, all bets are off.
But how about a concrete example?
Let’s say I go to http://ftp.netbsd.org/pub/NetBSD/install-images/7.1.2/ and download an install disk for the AlphaServer that I bought off ebay for 99¢ (well, I paid $1).
So I download NetBSD-7.1.2-alpha.iso.
In the same directory on the server there are two files — MD5 and SHA512. I can look at those and I see:
MD5 (NetBSD-7.1.2-acorn26.iso) = 502dca7c8628b7583500ccf2436e6eba
MD5 (NetBSD-7.1.2-acorn32.iso) = cd03d084925ee592d85603dd2334b390
MD5 (NetBSD-7.1.2-alpha.iso) = 090c25b2fcb7e770a6ae5b615a96963c
MD5 (NetBSD-7.1.2-amd64-install.img.gz) = cbd76159c1ed5eb32a19a3b19eae8cfd
MD5 (NetBSD-7.1.2-amd64.iso) = 2765516ec1b2ed56923c623e890388f0
MD5 (NetBSD-7.1.2-amiga.iso) = 35a345c82c9b399d1fc4ee6553849de8
MD5 (NetBSD-7.1.2-arc.iso) = cf2abd42a8e1244430dd0536c686d0b9
MD5 (NetBSD-7.1.2-atari.iso) = 2010fe541156c9650d2a2566e0ceb7a0
(the file goes down a long way — NetBSD runs on almost anything).
And
SHA512 (NetBSD-7.1.2-acorn26.iso) = 533ae5b61f7e9a870dacdb3a4e57df8cc768f93d5cdda7407dcecbe42dbb9856421bbb73eacbad66d08e61498385b5b3571fa3d66c66afb8eef07a27600d603d
SHA512 (NetBSD-7.1.2-acorn32.iso) = 0367108ec724ca47609fa324c9d1012e43c8249978afd057d9410e5724307309e3e8899167f6b1f58816c232a7ee7dc64aa93acaf7940b7384ae177bedf0af1a
SHA512 (NetBSD-7.1.2-alpha.iso) = dce2431b41f656bd07baffb8b97a270e32261600231f631761ccbeeefb8c8fd437ff02ed5ed2253699f14aebc45bc79d7633c2bb8874b8d24596cd43b7b537fc
SHA512 (NetBSD-7.1.2-amd64-install.img.gz) = 3e625cd6335c9bba631e5aee7c40a4606b915b3d73aeeaba1b693c4d9a7ad627d1a3ac08b23144fa5f1e2b84c9fc4cd8faef2d4a681a79862f3b8c29c103b85c
SHA512 (NetBSD-7.1.2-amd64.iso) = aaccacbfa3ee5a497170025aed9426de1ef91f8f7ebdaf862bd178c4922a5db1b82171832c916acd7a4e2038dd9ec39ef1061c41d98c5b5af56e3165cc945539
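where we can see that the SHA512 sums are longer (128 hex digits against 32 for MD5) and generally considered ‘better’: MD5 has known collision attacks, so it is still fine for spotting a corrupted download but not much use against someone tampering on purpose.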
So I’ve got the file on my Linux box. I just do the sums:
$ md5sum NetBSD-7.1.2-alpha.iso
090c25b2fcb7e770a6ae5b615a96963c  NetBSD-7.1.2-alpha.iso
$ sha512sum NetBSD-7.1.2-alpha.iso
dce2431b41f656bd07baffb8b97a270e32261600231f631761ccbeeefb8c8fd437ff02ed5ed2253699f14aebc45bc79d7633c2bb8874b8d24596cd43b7b537fc  NetBSD-7.1.2-alpha.iso
And I can eyeball these and compare and see that all looks fine.
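Eyeballing 128 hex digits gets old, so here’s a rough sketch of letting the shell do the comparison. Everything in it is a stand-in: demo.iso is a tiny fake ‘image’ and the SHA512 file is built locally in the same format as the server’s, so the snippet runs anywhere with GNU coreutils, grep and awk.

```shell
# Work in a scratch directory with a fake image (hypothetical stand-ins).
workdir=$(mktemp -d)
cd "$workdir"
printf 'pretend this is an iso\n' > demo.iso

# Fake a server-style list: "SHA512 (name) = hash".
printf 'SHA512 (demo.iso) = %s\n' "$(sha512sum demo.iso | cut -d' ' -f1)" > SHA512

# Pull the posted value (last field of the matching line) and compute our own.
expected=$(grep -F '(demo.iso)' SHA512 | awk '{print $NF}')
actual=$(sha512sum demo.iso | awk '{print $1}')

if [ "$expected" = "$actual" ]; then
    result=match
else
    result=MISMATCH
fi
echo "$result"
```

For the real thing, the file name would be NetBSD-7.1.2-alpha.iso and the SHA512 file would be the one downloaded from the server.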
There’s another way to do it that doesn’t rely on my eyes.
I can create little files with the same format as the output from the checksum programs, but using the values from the server’s files. Say I make md5.txt and it has a single line in it that looks like this:
090c25b2fcb7e770a6ae5b615a96963c  NetBSD-7.1.2-alpha.iso
I can then do this:
$ md5sum -c md5.txt
NetBSD-7.1.2-alpha.iso: OK
and the checksum program will compare the strings for me. sha512sum -c works the same way.
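As far as I know, the GNU versions of these tools can also read the server’s ‘SHA512 (file) = hash’ lines directly, which saves making the little file by hand. The sketch below assumes a coreutils new enough to have --ignore-missing (8.25 or later, I believe), so that entries for files I never downloaded get skipped. The file names are made up for the demo.

```shell
# Scratch directory with a fake image (hypothetical stand-ins).
workdir=$(mktemp -d)
cd "$workdir"
printf 'pretend this is an iso\n' > demo.iso

# A server-style list with one entry for a file that is not present locally.
{
    printf 'SHA512 (demo.iso) = %s\n' "$(sha512sum demo.iso | cut -d' ' -f1)"
    printf 'SHA512 (other.iso) = %s\n' "$(printf 'x' | sha512sum | cut -d' ' -f1)"
} > SHA512

# Check only the files that exist here.
good=$(sha512sum -c --ignore-missing SHA512)
echo "$good"

# Simulate a bad burn or download by appending stray bytes, then check again.
printf 'one stray byte' >> demo.iso
if sha512sum -c --ignore-missing SHA512 >/dev/null 2>&1; then
    status=passed
else
    status=failed
fi
echo "after corruption: $status"
```

The exit status is the useful bit if this ends up in a script: zero when everything checked out, non-zero otherwise.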
In my case, I am trying to get an OS to work on a badly mistreated DEC AlphaServer 1200, largely for the hell of it. The CD drives are old and may not be reliable, so it is really valuable to be able to eliminate one source of failure from the checks I have to do.
Time wasters.