The developers sitting opposite me speak a mix of English and tech:

‘I’m setting up a truffle scaffold’
‘I’ll dockerize it all, dockerize everything’
‘We need a contract-tracking contract’

Some of it is very cryptic indeed. One term that comes up a lot is ‘hash’. Hash is a special term for me, mainly because it is one of the few that I know something about.

There are many techniques and ideas that play a part in blockchain technology and cryptography in general. A hash is one of those key concepts, but it is a bit difficult to explain, mainly because it is like nothing that has gone before. People try to compare it to familiar things: ‘A hash is like a fingerprint’, ‘It’s like a shadow’. But it’s not. Basically, a hash is just a hash.

The word comes from the French for chopping something up; it’s also where we get the word hatchet from. So when we say ‘you’ve made a hash of it’, we mean that you have really gone to town and chopped something up and there are bits everywhere, and in a way that is what doing a computational hash is all about.

It’s a way of taking a chunk of information, processing it (chopping it up and squashing it down) and making a short code. That initial chunk of information can be anything that is represented by numbers – and in the world of computing that can be a document, an image, an ID number, whatever you want.

A hash of Shakespeare

There are different sorts of hash functions, but a frequently used one is SHA256 (a Secure Hash Algorithm that produces a 256-bit result). It takes any information and generates a 256-bit code. That code is usually written out in hexadecimal, and at 4 bits to a hexadecimal character that makes 64 characters. So if I take my name ‘Lon Barfield’ and apply SHA256 I get:

09DAF4ED3969E854908AC6DEF267B72FE8EC3EF767490A4A9B03124B0C8073FA
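
If you want to try this yourself, here is a minimal Python sketch using the standard hashlib module. The function name sha256_hex is just my own label, and I am assuming the text is turned into bytes as UTF-8 before hashing; a different encoding, or even a stray space, would give a completely different code.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as 64 upper-case hex characters."""
    # Hash functions work on bytes, so the text is encoded first.
    # UTF-8 is an assumption here; another encoding changes the result.
    data = text.encode("utf-8")
    return hashlib.sha256(data).hexdigest().upper()


if __name__ == "__main__":
    code = sha256_hex("Lon Barfield")
    print(code)
    # 64 hexadecimal characters at 4 bits each is 256 bits.
    print(len(code), "characters,", len(code) * 4, "bits")
```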

I can take any length document and use it. If I try Shakespeare’s Sonnet 18: the one that begins ‘Shall I compare thee to a summer’s day?’ I get this:

192DEC1108A6F63B0BA22E925D1421328B51D984ADA69378E0060CD34D251614

I could put the whole text of the Bible in, or a video of a baby goat, anything, and it would still come out as a 64 character code. A key requirement of hashes is that you can’t go the other way. So I can’t take that 64 character code and work backwards to regenerate Shakespeare’s Sonnet from it. Another requirement is that there is no discernible pattern to the hash. So if I alter one letter of the Sonnet and do the hash again I get a hash that is totally different, not just ever-so-slightly different.
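
The ‘totally different’ part is easy to check for yourself. The sketch below hashes two versions of the Sonnet’s first line that differ by a single letter and counts how many of the 256 bits differ. The helper names are my own, and I am assuming UTF-8 text; for a good hash function you would expect roughly half the bits to flip.

```python
import hashlib


def sha256_number(text: str) -> int:
    """Return the SHA-256 hash of text as one big 256-bit integer."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the two hashes disagree."""
    # XOR leaves a 1 wherever the two hashes differ.
    return bin(sha256_number(a) ^ sha256_number(b)).count("1")


if __name__ == "__main__":
    original = "Shall I compare thee to a summer's day?"
    altered = "Shall I compare thee to a summer's dab?"  # one letter changed
    print(hashlib.sha256(original.encode("utf-8")).hexdigest())
    print(hashlib.sha256(altered.encode("utf-8")).hexdigest())
    print(differing_bits(original, altered), "of 256 bits differ")
```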

Using hash codes

All well and good, but what’s the use of them? Why would I want to pack an entire document down to one little code that can’t be unpacked again? Well, one use for a hash is to validate documents. I make a huge video file and put it on the server. Then I take the hash of it and send this hash to all the people in the office. Anybody can hash the video on the server and compare the code they get to the hash I sent round. If it’s the same, then they know that the video is as I left it. If the hashes don’t match, then they know that someone has substituted the video with another one or tampered with it in some way.
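
As a rough sketch of that check in Python: the file is read in chunks so that a huge video never has to fit in memory all at once. The function names file_sha256 and is_unchanged, and the 1 MB chunk size, are my own choices for illustration; the demo at the bottom uses a small temporary file standing in for the video.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file of any size, reading it one chunk at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() keeps calling f.read() until it returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: str, expected_hex: str) -> bool:
    """Compare the file's hash with the one that was sent round."""
    return file_sha256(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # A stand-in for the video on the server.
    fd, path = tempfile.mkstemp()
    os.write(fd, b"pretend this is a huge video")
    os.close(fd)

    sent_round = file_sha256(path)          # the hash I email to the office
    print(is_unchanged(path, sent_round))   # file is as I left it

    with open(path, "ab") as f:             # someone tampers with the file
        f.write(b"!")
    print(is_unchanged(path, sent_round))   # hashes no longer match
    os.remove(path)
```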

Here’s another use. If I send a message to someone, there is a special way of encoding a document so that the recipient knows it’s from me (it’s called public/private key encryption, and this particular use of it is known as a digital signature). Basically, I encode a document with my private key and the recipient decodes it with my public key (that I have given them via another channel). Because they can decode it with my public key they know that only I could have encoded it, so it has to be from me.

Applying this encoding takes a lot of computation, so encoding a large document or a video is not advised. But I can apply the encoding to something small like a hash value. If I take the hash of the big document I get a 64 character code, and that is small enough to easily have this public/private key encoding applied to it.

So I just send an ordinary document by ordinary email to the recipient, but I also send them this special public/private encoding of the hash of the document. They know that the hash is definitely from me (because of the public/private key encoding) and when they hash the document from the email they should see that the hash matches the one I sent them. They then know that the document in the email is definitely the one from me and they know that it hasn’t been tampered with in any way en route.
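
To make the steps concrete, here is a toy sketch in Python. It is emphatically not how you should sign anything real: the key pair is built from two well-known Mersenne primes that I picked for convenience, the key is far too small to be secure, and real systems add padding and use a proper cryptography library. All the names (sign, verify, hash_as_number) are my own. What it does show is the shape of the idea: hash the document, encode the hash with the private key, and let the recipient decode with the public key and compare.

```python
import hashlib

# Toy key pair, for illustration only (assumed values, not secure).
P = 2**61 - 1                      # a known Mersenne prime
Q = 2**89 - 1                      # another known Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # public exponent
# Private exponent: the inverse of E modulo (P-1)(Q-1). Needs Python 3.8+.
D = pow(E, -1, (P - 1) * (Q - 1))


def hash_as_number(document: bytes) -> int:
    """SHA-256 of the document, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Sender: encode the hash of the document with the private key."""
    return pow(hash_as_number(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Recipient: decode with the public key and compare with their own hash."""
    return pow(signature, E, N) == hash_as_number(document)


if __name__ == "__main__":
    document = b"The ordinary document, sent by ordinary email."
    signature = sign(document)
    print(verify(document, signature))                 # untouched document
    print(verify(document + b" PS: pay Eve", signature))  # tampered en route
```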

Hashes can also solve the problem of hackers stealing files containing millions of passwords from companies. If I have a system that loads of users log in to, then somewhere I need to store all their passwords. If this file of passwords gets hacked then I have big problems.

It’s the sort of thing you see on the news: such-and-such company loses 5 million account passwords and asks everyone to choose a new password. So how can hashes help?

Well, it works like this: I hash all the passwords and store a file of the hashes; I don’t store the passwords themselves anywhere. When someone logs in they type their password, I hash it and then compare it to the hash I hold for them in my file. If the hashes match then I know the passwords match, so I log them in. If someone hacks in and steals the file with all the hashes in, they can’t simply log in with them: if they try to use a hash as a password, the system will just hash it to a completely different hash code, which won’t match.
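
Here is a minimal sketch of that login check in Python. The user name, the password and the stored_hashes dictionary are all made up for the example. It uses plain SHA256 to match the description above; systems in the wild usually also mix a random ‘salt’ into each password and use a deliberately slow hash function, which this sketch leaves out.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Hash a password with plain SHA-256 (no salt, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# The only thing kept on file: hashes, never the passwords themselves.
# (Hypothetical user and password.)
stored_hashes = {"alice": hash_password("correct horse battery staple")}


def check_login(user: str, attempt: str) -> bool:
    """Hash what was typed and compare it with the hash on file."""
    stored = stored_hashes.get(user)
    if stored is None:
        return False
    # compare_digest compares in constant time, which avoids leaking
    # information through how long the comparison takes.
    return hmac.compare_digest(stored, hash_password(attempt))


if __name__ == "__main__":
    print(check_login("alice", "correct horse battery staple"))  # right password
    print(check_login("alice", "wrong guess"))                   # wrong password
    # A thief who tries the stolen hash itself as the password gets nowhere,
    # because it is hashed again before the comparison.
    print(check_login("alice", stored_hashes["alice"]))
```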

The hash-daddy

Hashes are not new. They’ve been around since the idea was first worked out in 1953 by one of computing’s many unsung heroes, a guy called Hans Peter Luhn. That’s him in the photo at the top, looking a bit worried. He worked in the print industry, then in communications in the First World War. He then moved into textiles engineering, where he invented a range of devices before joining IBM as a research engineer. There he focussed on making number crunching faster and more efficient. If you look at the long number on your credit/debit card, that incorporates a special coding scheme that Hans Luhn came up with in that era.
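
That card-number scheme is usually known as the Luhn algorithm, and it is simple enough to sketch in a few lines of Python. It is a checksum for catching typing mistakes rather than a cryptographic hash like SHA256. The function name luhn_valid is my own; the sketch assumes the number arrives as text and may contain spaces or hyphens.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits of number pass the Luhn check."""
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Work from the rightmost digit, doubling every second one.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the result
        total += digit
    # A valid number gives a total that is a multiple of 10.
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("79927398713"))  # a commonly quoted valid example
    print(luhn_valid("79927398710"))  # last digit changed, so it fails
```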

So these ideas from a generation ago are now having a new lease of life in the world of cryptography and the blockchain. Computational hashes are strange, powerful and incredibly useful… a bit like the developers sitting opposite me in fact.

If you want to play around with hash codes you can do it online at passwordsgenerator.net/sha256-hash-generator: type or paste any text in and have a look at the hash code it generates.

If you’d like us to speak at an event, host a workshop or want to chat about a blockchain project or idea, get in touch with Simpleweb today.
