Tuesday, July 20, 2010

What is MD5 Hash and How to Use it?


In this post I will explain you about one of my favorite and interesting cryptographic algorithm called MD5(Message-Digest algorithm 5). This algorithm is mainly used to perform file integrity checks under most circumstances. Here I will not jump into the technical aspects of this algorithm, rather will tell you about how to make use of this algorithm in your daily life. Before I tell you about how to use MD5, I would like to share one of my recent experience which made me start using MD5 algorithm.
Recently I made some significant changes and updates to my website and as obvious I generated a complete backup of the site on my server. I downloaded this backup onto my PC and deleted the original one on the server. But after a few days something went wrong and I wanted to restore the backup that I downloaded. When I tried to restore the backup I was shocked! The backup file that I used to restore was corrupted. That means, the backup file that I downloaded onto my PC wasn’t exactly the one that was on my server. The reason is that there occured some data loss during the download process. Yes, this data loss can happen often when a file is downloaded from the Internet. The file can be corrupted due to any of the following reasons.
  • Data loss during the download process, due to instability in the Internet connection/server
  • The file can be tampered due to virus infections or
  • Due to Hacker attacks
So whenever you download any valuable data from the Internet it is completely necessary that you check the integrity of the downloaded file. That is you need to ensure that the downloaded file is exactly the same as that of the original one. In this scenario the MD5 hash can become handy. All you have to do is generate MD5 hash (or MD5 check-sum) for the intended file on your server. After you download the file onto your PC, again generate MD5 hash for the downloaded file. Compare these two hashes and if it matches then it means that the file is downloaded perfectly without any data loss.
A MD5 hash is nothing but a 32 digit hexadicimal number which can be something as follows
A Sample MD5 Hash
e4d909c290d0fb1ca068ffaddf22cbd0
This hash is unique for every file irrespective of it’s size and type. That means two .exe files with the same size will not have the same MD5 hash even though they are of same type and size. So MD5 hash can be used to uniquely identify a file. 
 

How to use MD5 Hash to check the Integrity of Files?

 
Suppose you have a file called backup.tar on your server. Before you download, you need to generate MD5 hash for this file on your server. To do so use the following command.
For UNIX:
md5sum backup.tar
When you hit ENTER you’ll see something as follows
e4d909c290d0fb1ca068ffaddf22cbd0
 
This is the MD5 hash for the file backup.tar. After you download this file onto your PC, you can cross check it’s integrity by again re-generating MD5 hash for the downloaded file. If both the hash matches then it means that the file is perfect. Otherwise it means that the file is corrupt. To generate the MD5 hash for the downloaded file on your Windows PC use the following freeware tool
MD5 Summer (Click on the link to download)
I hope you like this post. For further doubts and clarifications please pass your comments. Cheers!