This is a short and, hopefully, easy-to-understand demonstration of hashing vs encryption.
What is hashing?
"Hashing" is performing a one-way process on a message or piece of information. There are many hashing algorithms to choose from; choosing the right one for you is outside the scope of this article. Let's look at an example.
If I take an email address such as
[email protected] I can hash it using the SHA256 hashing algorithm and it will produce
5cff28e4dd1f2281f8a02f001778b49bcdcd47a66b970791c9971326b28b3641. I can run this process as many times as I like on the same piece of information and will always receive the same hash value back. However, there is no way to use the hash value to get back to the original content. I can never take
5cff28e4dd1f2281f8a02f001778b49bcdcd47a66b970791c9971326b28b3641 and get back to
[email protected]. This is why we hash passwords in databases: we never want anyone to get the original values back.
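A minimal sketch of this in Python, assuming the standard library's hashlib module. The address alice@example.com is a hypothetical stand-in chosen for illustration, not the address used above, so its hash will differ from the one shown.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical example value, chosen only for illustration.
email = "alice@example.com"

first = sha256_hex(email)
second = sha256_hex(email)

print(first)
print(first == second)  # the same input always gives the same hash
print(len(first))       # SHA-256 is 256 bits, which is 64 hex characters
```

Note that hashlib offers no function to turn the digest back into the input; the only way "back" is to guess the input and hash the guess.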
The weaknesses of hashing
Knowing that a hash is one-way, you might now be wondering why we care about password leaks. If the organisation has hashed them properly, surely no one can figure out your password, right? It's true that no one could reverse the hash. However, someone could take your very common password, hash it for themselves and compare the resulting hash value. Remember, every time you hash the same input you get exactly the same result.
Many attackers will have a collection of pre-hashed common passwords, and it's then trivial to find every password in a leaked collection whose hash matches one of them.
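To make that concrete, here is a rough sketch of such a lookup. The password list, the user names and the "leaked" table are all made up for this example, and unsalted SHA256 is used only because it matches the example earlier in the article.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical list of common passwords an attacker has hashed in advance.
common_passwords = ["123456", "password", "qwerty", "letmein"]
precomputed = {sha256_hex(p): p for p in common_passwords}

# Hypothetical leaked table of unsalted password hashes, keyed by user.
leaked = {
    "user_a": sha256_hex("password"),
    "user_b": sha256_hex("correct horse battery staple"),
    "user_c": sha256_hex("qwerty"),
}

# One dictionary lookup per leaked hash recovers every common password.
cracked = {user: precomputed[h] for user, h in leaked.items() if h in precomputed}

print(cracked)  # {'user_a': 'password', 'user_c': 'qwerty'}
```

Nothing was reversed here. The attacker only compared hashes, which is why the uncommon password for user_b survives.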
To help prevent this we may choose to include a "salt". This is an extra bit of data we attach to the original data before hashing. Now instead of hashing
[email protected] we instead hash
[email protected]_secret_sauce which produces a different result. Now the attacker would need to know the salt we use and would need to hash all their common passwords with this salt. There are many options when salting a hash such as having one master salt which is stored separately, or having an individual salt per hash stored alongside the hash. Deciding on the best salt approach is not something we're going to discuss in this article.
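As an illustration of the per-hash option, the sketch below generates a random salt for each record and stores it next to the hash. The function name salted_hash and the sample password are assumptions made for this example, and appending the salt before a single SHA256 pass mirrors the example above rather than recommending it.

```python
import hashlib
import secrets

def salted_hash(value: str, salt: str) -> str:
    """Hash value with the salt appended, as in the example above."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()

value = "password"  # hypothetical common password

# Per-hash salts: random, and stored alongside the hash they belong to.
salt_1 = secrets.token_hex(16)
salt_2 = secrets.token_hex(16)

unsalted = hashlib.sha256(value.encode("utf-8")).hexdigest()
record_1 = (salt_1, salted_hash(value, salt_1))
record_2 = (salt_2, salted_hash(value, salt_2))

print(unsalted != record_1[1])     # a pre-hashed list no longer matches
print(record_1[1] != record_2[1])  # same password, different stored hashes
print(salted_hash(value, salt_1) == record_1[1])  # still repeatable with the salt
```

With a different salt per record, the attacker has to redo the hashing work for every record instead of reusing one pre-hashed list.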
What is encryption?
We encrypt data when we want to be able to reverse the process later and get back to the original data from the encrypted result. This is why we use encryption rather than hashing on data at rest. If you hashed your hard drive, you would never be able to recover the data from the resulting hash.
With encryption we usually have some sort of "key". The encryption algorithm works in a different way to a hash, and the resulting encrypted data can only be reversed if you can present the key it was encrypted for.
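Python's standard library does not ship a modern cipher, so the sketch below swaps in a one-time pad (XOR with a random key as long as the message) purely to show the idea of a key that reverses the process. The function name xor_bytes and the sample message are assumptions for this example; real systems would use a vetted encryption library instead.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of key."""
    if len(data) != len(key):
        raise ValueError("key must be the same length as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"meet me at noon"

# The key is random, as long as the message, and must only be used once.
key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, key)    # encrypt
recovered = xor_bytes(ciphertext, key)  # decrypt with the same key

# Anyone holding a different key gets unrelated bytes back.
wrong_key = secrets.token_bytes(len(message))
garbage = xor_bytes(ciphertext, wrong_key)

print(recovered == message)  # True
print(garbage == message)    # False, barring an astronomically unlikely key match
```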
We often do this with the content of an email. If I wish to send you a secret message, I could encrypt the message for your key. Once it's encrypted, the only person who can reverse that encryption is the person who holds the key.
There are many different encryption algorithms and methods. You'll often hear the term "key-pair" thrown around, which is worth a brief explanation. A key-pair is where you have a public key, which you can and should share, and a private key, which should be kept very secure and only be accessed by you. Someone can encrypt a message using your public key and it will only be reversible using your private key. No one can decrypt the message unless they have your private key; not even your public key can decrypt it. This is just one approach to encryption which is popular for communication between two people, but there are many others which are suitable for other situations.
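The key-pair idea can be sketched with textbook RSA on deliberately tiny numbers. This is a toy for illustration only and is not secure: real key-pairs use very large primes plus padding, and come from an established library. The primes, the message value 65 and the helper name apply_key are all assumptions chosen for this example.

```python
from typing import Tuple

# Toy textbook RSA with tiny primes. Illustration only: NOT secure.
p, q = 61, 53
n = p * q                # 3233, shared by both keys
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent via modular inverse (Python 3.8+)

public_key = (e, n)   # safe to share
private_key = (d, n)  # must be kept secret

def apply_key(value: int, key: Tuple[int, int]) -> int:
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65  # a message encoded as a number smaller than n

ciphertext = apply_key(message, public_key)     # anyone can encrypt
decrypted = apply_key(ciphertext, private_key)  # only the key holder can decrypt
with_public = apply_key(ciphertext, public_key) # the public key does not undo it

print(ciphertext)              # 2790
print(decrypted == message)    # True
print(with_public == message)  # False
```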
Weaknesses of encryption
As you might have already guessed, one big weakness of encryption is that someone can decrypt the data if they have the right key. In many cases, such as communications, this is completely intentional and required. However, it is not appropriate for storing data like passwords.
The big difference between hashing and encrypting is that encrypted data can be reversed (decrypted) in some way. With hashing that is not possible. Hashing is "one-way" and irreversible: the only way to get back is to guess the exact original message, hash it and see if it matches the hash value you've obtained.
If I can't reverse a hash on a password, how could I ever check that the password the user has entered on log-in is correct?
You can simply hash the password the user has given you in the log-in request (adding the same salt, if you use one) and then compare that to the hashed password you have in your database.
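A rough sketch of that check is below. PBKDF2 is used here only because it ships with Python's standard library, not as a recommendation; the iteration count, the function names register and verify, and the sample passwords are assumptions for this example.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 200_000  # illustrative value, not a recommendation

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password and salt using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def register(password: str) -> Tuple[bytes, bytes]:
    """Return the (salt, hash) pair that would be stored in the database."""
    salt = secrets.token_bytes(16)
    return salt, hash_password(password, salt)

def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the log-in attempt with the stored salt and compare the results."""
    candidate = hash_password(password, salt)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(candidate, stored_hash)

salt, stored = register("hunter2")

print(verify("hunter2", salt, stored))  # True
print(verify("hunter3", salt, stored))  # False
```

The original password is never stored or recovered at any point; only the salt and the hash are kept.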
What hash algorithm should I use for password storage?
I'm not a security expert, so I'm not going to try to suggest which algorithms you should use, but I will tell you that you should not under any circumstances use MD5 or SHA1. See OWASP for more information.
Can a hash really never be reversed?
If you're using a secure hashing algorithm then it will be impractical to try and reverse a hash, but that does not mean it's impossible. What we're really trying to achieve here is making it exceedingly unlikely you'd ever want to try, let alone actually accomplish it.
What about encryption, can that really never be reversed without the key?
Similar to hashing, vulnerabilities do exist in older, less secure algorithms, and as computers increase in power we need to up our game to ensure data remains secure. Once again, the aim of the game here is to make it close to impossible to reverse in a practical sense. If it will take someone 1 day to crack an encryption, they might be OK waiting that long, and then you're in trouble. On the other hand, if it will take 20 billion years to crack (after the Sun has engulfed Earth and then subsequently died), you're probably in the clear until a vulnerability is found.