What are hashes and hash functions?
A hash is a short digital fingerprint of data. A hash function is the algorithm that creates that fingerprint.
You can hash almost anything, like a password, a PDF, a photo, a contract, or a database export. The output is a fixed-size string that represents the input. People also call this output a hash value or a digest.
Hashes matter because they let you verify things quickly. Instead of comparing two huge files byte by byte, you compare their hashes. If the hashes match, and the hash function is a strong cryptographic one, the files are almost certainly identical.
Hash function explained in simple terms
A hash function takes data of any size and produces a fixed-length output.
Two properties make hashes especially useful:
- The same input always produces the same hash.
- A tiny change in the input produces a very different hash.
That second point is why hashes are great at spotting tampering. Changing a single character in a document usually changes the digest completely.
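Both properties are easy to see for yourself. The sketch below uses SHA-256 from Python's standard hashlib module; the two sample messages and the helper name sha256_hex are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"Pay Alice 100 dollars"
tampered = b"Pay Alice 900 dollars"  # a single character changed

digest_a = sha256_hex(original)
digest_b = sha256_hex(original)  # same input hashed a second time
digest_c = sha256_hex(tampered)

# Property 1: the same input always gives the same digest.
print("repeatable:", digest_a == digest_b)

# Property 2: a tiny change gives a very different digest.
# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(int(digest_a, 16) ^ int(digest_c, 16)).count("1")
print("bits that differ:", differing_bits, "of 256")

# The output length is fixed no matter how large the input is.
print(len(sha256_hex(b"")), len(sha256_hex(b"x" * 1_000_000)))
```

For a well-designed hash, roughly half of the output bits should flip when the input changes, even by one character.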
Hashing vs encryption
Hashing and encryption are not the same thing.
Encryption is reversible. With the right key, you can recover the original content.
Hashing is meant to be one-way. You do not normally “reverse” a hash to get the original input back. Instead, you verify by hashing again and comparing the result.
A quick way to think about it:
- Use encryption when you must retrieve the original data later.
- Use hashing when you only need a fingerprint for matching or integrity checks.
What makes a cryptographic hash function different?
Some hash functions are built for organization and speed. Cryptographic hash functions are built for security.
In security-focused contexts, a good cryptographic hash function is designed to make it impractical to:
- find an input that matches a chosen hash (a preimage)
- find two different inputs that produce the same hash (a collision)
This is why cryptographic hashes are used in digital signatures, secure identity systems, software distribution, and tamper-evident logs.
Common uses of hashes
File integrity checks
Software vendors often publish a digest next to a download. You hash the file you received and compare. Matching digests strongly suggests the file has not been altered.
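A check like this might look as follows. It is a minimal sketch that assumes the vendor published a SHA-256 digest as a hex string; the function names and the chunk size are illustrative choices, not part of any vendor's tooling.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large downloads never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the one published next to the download."""
    actual = file_sha256(path)
    expected = published_hex.strip().lower()  # tolerate whitespace and upper case
    return hmac.compare_digest(actual, expected)
```

You would call matches_published_digest with the downloaded file's path and the digest copied from the vendor's page. Keep in mind that this only helps if the published digest itself comes from a source you trust.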
Digital signatures
Many signature systems sign the digest of a message rather than the full message. This keeps verification fast even when the data is large.
Password storage
Passwords should not be stored in plain text. Instead, systems store derived values produced by specialized password-hashing methods designed to resist guessing attacks.
This is a special case: for passwords, you typically want algorithms that are intentionally slow and that include protections like salting.
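Here is a rough sketch of that idea using PBKDF2, which ships with Python's standard library. The iteration count, the 16-byte salt length, and the function names are assumptions for illustration; a real system should follow current guidance and may prefer Argon2id or bcrypt through a dedicated library.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune it to current recommendations


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage. The password itself is never stored."""
    salt = os.urandom(16)  # fresh random salt for every password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Verification never reverses anything. It repeats the same slow derivation with the stored salt and checks whether the result matches.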
Deduplication and content addressing
Storage systems can use hashes to detect identical files and avoid storing duplicates. Some systems also reference content by its digest.
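As a toy illustration, the class below keys every blob by its SHA-256 digest, so storing the same bytes twice keeps only one copy. ContentStore is a hypothetical in-memory example, not the design of any particular storage product.

```python
import hashlib


class ContentStore:
    """Hypothetical in-memory store that addresses each blob by its SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data if it is new and return the digest that identifies it."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)  # identical content is stored only once
        return key

    def get(self, key: str) -> bytes:
        """Fetch content by digest."""
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put(b"quarterly-report contents")
second = store.put(b"quarterly-report contents")  # duplicate upload
print(first == second, len(store))
```

Because the key is derived from the content, two uploads of the same file land on the same key regardless of their file names.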
Blockchains and audit logs
Hashes help link data together so that changes become obvious. If earlier data changes, later hashes no longer match what they should.
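The linking idea can be sketched in a few lines. In this simplified example each log entry's hash covers the previous entry's hash plus its own message; the entry layout, the all-zero starting value, and the function names are assumptions made for the illustration, not how any specific blockchain or logging system works.

```python
import hashlib
from typing import Optional

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, message: str) -> str:
    """Hash the previous entry's hash together with this entry's message."""
    return hashlib.sha256(f"{prev_hash}\n{message}".encode("utf-8")).hexdigest()


def build_chain(messages: list[str]) -> list[dict]:
    """Link each message to the one before it."""
    chain = []
    prev = GENESIS
    for message in messages:
        current = entry_hash(prev, message)
        chain.append({"message": message, "prev": prev, "hash": current})
        prev = current
    return chain


def first_broken_index(chain: list[dict]) -> Optional[int]:
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        expected = entry_hash(prev, entry["message"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return i
        prev = entry["hash"]
    return None


log = build_chain(["user created", "role changed", "user deleted"])
print(first_broken_index(log))  # None: the untouched log verifies
log[1]["message"] = "nothing happened"
print(first_broken_index(log))  # 1: the edited entry no longer matches its hash
```

If someone edits an entry and also recomputes its hash, the mismatch simply moves to the next entry, whose stored previous hash no longer lines up.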
Hashing algorithms you will see in the real world
Here are names you will run into often:
- SHA-256 and SHA-512
- SHA-3
- BLAKE2 and BLAKE3
- MD5 and SHA-1 in legacy systems
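If you are building something security-sensitive, check current best practices and standards before choosing an algorithm. MD5 and SHA-1 both have practical collision attacks and are no longer considered suitable for modern cryptographic protection, and NIST has announced that SHA-1 should be phased out by Dec. 31, 2030.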
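To get a feel for these names, you can ask Python's standard hashlib module for the output size of each one. This sketch covers only the algorithms hashlib provides out of the box; BLAKE3 needs a third-party package, and MD5 and SHA-1 are left out on purpose.

```python
import hashlib

# Algorithms from the list above that hashlib exposes by name.
ALGORITHMS = ["sha256", "sha512", "sha3_256", "blake2b", "blake2s"]


def digest_bits(name: str) -> int:
    """Return the default digest length of the named algorithm, in bits."""
    return hashlib.new(name).digest_size * 8


for name in ALGORITHMS:
    print(f"{name:10s} {digest_bits(name)} bits")
```

A longer digest is not automatically a better choice. What matters most is that the algorithm has no known practical attacks for your use case.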
Salted hashes: the explanation…
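A salt is a random value, unique to each stored entry, that is combined with the input before hashing. It is most commonly used for passwords. The salt does not need to be secret, and it is usually stored right next to the resulting digest.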
Why it matters:
If two people choose the same password, a salt helps make their stored digests different. That makes large-scale attacks harder and reduces the value of precomputed cracking tables.
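A small experiment makes this concrete. The sketch below derives stored values for two users who picked the same password, each with their own random salt. The derive helper, the sample password, and the iteration count are illustrative assumptions.

```python
import hashlib
import os


def derive(password: str, salt: bytes, iterations: int = 100_000) -> str:
    """Derive a hex value from a password and salt using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    ).hex()


password = "correct horse battery staple"  # both users chose the same password

alice_salt = os.urandom(16)
bob_salt = os.urandom(16)

alice_stored = derive(password, alice_salt)
bob_stored = derive(password, bob_salt)

# Different salts give different stored values for the same password.
print("stored values differ:", alice_stored != bob_stored)

# The same password with the same salt reproduces the value, so login checks still work.
print("alice verifies:", derive(password, alice_salt) == alice_stored)
```

An attacker who steals the table cannot tell that the two users share a password, and a precomputed table would have to be rebuilt for every individual salt.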
FAQ about hashes and hash functions
Can a hash be reversed?
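In general, no. Hashing is designed for verification, not for recovery. The exception is guessable input: if the original is short or predictable, an attacker can hash candidate guesses until one matches. That is why passwords need salting and deliberately slow algorithms.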
Can two different inputs have the same hash?
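In theory, yes. That is called a collision. Collisions must exist because there are unlimited possible inputs but only a fixed number of possible outputs. Cryptographic hash functions are designed so that finding collisions is impractical in real systems.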
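You can watch a collision happen by shrinking the output on purpose. The sketch below keeps only the first two bytes of a SHA-256 digest, which leaves just 65,536 possible outputs, and searches until two different inputs land on the same one. The truncation and the function names are purely for illustration; nobody should use a 16-bit hash for security.

```python
import hashlib


def truncated_digest(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_truncated_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Try inputs "0", "1", "2", ... until two share the same truncated digest."""
    seen: dict[bytes, bytes] = {}
    counter = 0
    while True:
        candidate = str(counter).encode("ascii")
        short = truncated_digest(candidate, n_bytes)
        if short in seen:
            return seen[short], candidate
        seen[short] = candidate
        counter += 1


first, second = find_truncated_collision()
print(first, second, truncated_digest(first).hex())
```

With two bytes the search finishes almost instantly. Each extra byte of output makes it far more expensive, which is why the full 256-bit digest is out of reach for this kind of brute-force search.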
Is a checksum the same as a hash?
Not really. Checksums are often designed to catch accidental corruption. Cryptographic hashes are designed to resist intentional tampering as well.
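One way to see the difference is to look at how predictable a checksum is. The sketch below uses CRC-32 as the example checksum (an assumption, since the text above does not name a specific one). For three messages of equal length, the CRC-32 of their XOR equals the XOR of their CRC-32 values. That kind of algebraic structure is what lets an attacker adjust data without changing the checksum. SHA-256 shows no such relationship.

```python
import hashlib
import zlib


def xor_bytes(*parts: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(parts[0]))
    for part in parts:
        for i, value in enumerate(part):
            out[i] ^= value
    return bytes(out)


# Three sample messages of the same length (made up for this example).
a = b"message-A"
b = b"message-B"
c = b"message-C"
combined = xor_bytes(a, b, c)

# CRC-32: the checksum of the XOR is predictable from the individual checksums.
crc_of_xor = zlib.crc32(combined)
xor_of_crcs = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
print("CRC-32 relation holds:", crc_of_xor == xor_of_crcs)

# SHA-256: the same trick does not work.
sha_of_xor = hashlib.sha256(combined).digest()
xor_of_shas = xor_bytes(
    hashlib.sha256(a).digest(),
    hashlib.sha256(b).digest(),
    hashlib.sha256(c).digest(),
)
print("SHA-256 relation holds:", sha_of_xor == xor_of_shas)
```

CRC-32 is still a fine tool for catching accidental corruption such as a flipped bit on disk. It just was not built to stop someone who is changing the data on purpose.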
Why is password hashing different from file hashing?
File hashing is often optimized for speed. Password hashing is intentionally slower to make guessing attacks expensive.
Quotes & Sources
“A function that maps a bit string of arbitrary length to a fixed-length bit string.” (csrc.nist.gov)
“The result of applying a cryptographic hash function to data … Also known as a message digest.” (csrc.nist.gov)
“Modern hashing algorithms such as Argon2id, bcrypt, and PBKDF2 automatically salt the passwords.” (cheatsheetseries.owasp.org)
“NIST is announcing that SHA-1 should be phased out by Dec. 31, 2030.” (nist.gov)
“Given a randomly chosen message digest … it is computationally infeasible to find a preimage.” (csrc.nist.gov)