What is a hash in digital forensics?
A hash value is the result of a calculation (a hash algorithm) that
can be performed on a string of text, an electronic file, or the
entire contents of a hard drive. The result is also referred to as a
checksum, hash code, or simply a hash. Hash values are used to
identify and filter duplicate files (e.g., email, attachments, and
loose files) from an ESI collection, or to verify that a forensic
image or clone was captured successfully.
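As a rough illustration of the verification use case, the sketch below hashes a source file and its copy and compares the results. It assumes SHA-256 and Python's standard hashlib module; the function names file_sha256 and images_match and the chunk size are hypothetical choices for this example, not part of any particular forensic tool.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large images do not have to fit
    in memory. The 1 MiB chunk size is an arbitrary assumption.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(source_path, copy_path):
    """Return True if both files produce the same SHA-256 hash value."""
    return file_sha256(source_path) == file_sha256(copy_path)
```

If images_match returns False, at least one byte differs between the source and the copy, so the capture should not be treated as verified.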
Each hashing algorithm uses a specific number of bytes to store a
“thumbprint” of the contents. Different algorithms produce different
hash values for the same text file. Regardless of the amount of data
fed into a specific hash algorithm or checksum, it will return the
same number of characters. For example, an MD5 hash uses 32
hexadecimal characters for the thumbprint, whether the input is a
single character in a text file or an entire hard drive.
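The fixed-length property can be checked with a short sketch using Python's standard hashlib module. The two inputs here are made-up sample data: one is a single character, and the other is one million bytes standing in for a much larger source.

```python
import hashlib

# Hypothetical sample inputs: a single character and a much larger buffer.
one_char = b"a"
large_input = b"a" * 1_000_000

short_digest = hashlib.md5(one_char).hexdigest()
long_digest = hashlib.md5(large_input).hexdigest()

# Both digests have the same length even though the inputs differ in size.
print(len(short_digest), len(long_digest))  # 32 32

# The hash values themselves are different because the contents differ.
print(short_digest != long_digest)  # True
```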
There are also hashes of various lengths within a family (SHA-1, SHA-256, etc.). The most common hash algorithms are MD5, SHA-1, and SHA-256. Longer hash values require more time to calculate and are designed to reduce the probability of a collision, that is, two different inputs producing the same hash value.
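To make the length differences concrete, here is a small sketch that reports the digest size of the three algorithms named above for the same input. The helper name digest_lengths and the sample phone-number text are assumptions made for this example.

```python
import hashlib


def digest_lengths(data, algorithms=("md5", "sha1", "sha256")):
    """Map each algorithm name to (digest size in bytes, hex characters)."""
    lengths = {}
    for name in algorithms:
        h = hashlib.new(name, data)
        lengths[name] = (h.digest_size, len(h.hexdigest()))
    return lengths


# Hypothetical sample content: a text file holding one phone number.
for name, (size_bytes, hex_chars) in digest_lengths(b"555-0100\n").items():
    print(name, size_bytes, hex_chars)
# md5 16 32
# sha1 20 40
# sha256 32 64
```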
In short, hash values are a reliable, fast, and secure way to compare the contents of individual files and media. Whether the input is a single text file containing a phone number or five terabytes of data on a server, calculating hash values is an invaluable process for deduplication and evidence verification in electronic discovery and computer forensics.
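Deduplication by hash value can be sketched as grouping items whose contents produce the same hash. This is a minimal illustration, assuming SHA-256 and in-memory contents; the function name find_duplicates and the sample file names are hypothetical, and real e-discovery tools may apply additional rules.

```python
import hashlib
from collections import defaultdict


def find_duplicates(items):
    """Group item names by the SHA-256 hash of their contents.

    `items` maps a name to its contents as bytes. Returns a list of
    groups, each holding the names that share identical contents.
    """
    groups = defaultdict(list)
    for name, content in items.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical collection: two of the three items have the same contents.
collection = {
    "report.txt": b"quarterly figures",
    "report_copy.txt": b"quarterly figures",
    "notes.txt": b"call 555-0100",
}
print(find_duplicates(collection))  # [['report.txt', 'report_copy.txt']]
```

File names play no part in the comparison, so a renamed copy is still detected as a duplicate.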