Hash Functions Explained: MD5, SHA-1, SHA-256, and SHA-512

Learn what hash functions are, how they work, and which algorithm to use. Includes examples, comparisons, and common pitfalls.

The Quick Answer

A hash function takes any input and produces a fixed-length string of characters (the "hash"). The same input always gives the same hash, but even a one-character change produces a completely different output. Hash functions are one-way — you cannot reverse a hash to recover the original data.

Which algorithm should you use? SHA-256 for most purposes. Avoid MD5 and SHA-1 for anything security-related.

What Hash Functions Actually Do

Think of a hash function as a fingerprint machine for data. You feed it any input — a single character, a paragraph, an entire file — and it produces a fixed-length "fingerprint." Two key properties make this useful:

  1. Deterministic: The same input always produces the same hash
  2. Avalanche effect: Changing even one bit of input changes roughly half the output bits

Here's what that looks like in practice:

Input                    SHA-256 hash (first 16 hex characters)
hello                    2cf24dba5fb0a30e...
Hello                    185f8db32271fe25...
hello (trailing space)   98ea6e4f216f2fb4...

The three outputs have nothing recognizable in common, even though the inputs are nearly identical.
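To reproduce the comparison yourself, here is a minimal sketch using Python's standard hashlib module. The helper names (sha256_hex, differing_bits) are made up for this example, and the inputs are assumed to be UTF-8 text.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 encoded string as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(a: str, b: str) -> int:
    """Count how many of the 256 output bits differ between two inputs."""
    xor = int(sha256_hex(a), 16) ^ int(sha256_hex(b), 16)
    return bin(xor).count("1")

for text in ("hello", "Hello", "hello "):
    print(repr(text), sha256_hex(text)[:16] + "...")

# The avalanche effect predicts a count somewhere near 128.
print(differing_bits("hello", "Hello"), "of 256 bits differ")
```

Running it repeatedly gives the same hashes every time (the deterministic property), and the bit count shows the avalanche effect as a number rather than by eye.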

The Four Algorithms Compared

MD5 (1992)

  • Output: 128 bits (32 hex characters)
  • Status: ❌ Cryptographically broken since 2004
  • Speed: Very fast
  • Still useful for: Non-security checksums, cache keys, deduplication
  • Do not use for: Passwords, certificates, digital signatures

MD5 was designed by Ronald Rivest and was the standard for years. In 2004, researchers demonstrated practical collision attacks — generating two different files with the same MD5 hash. Today, collisions can be produced in seconds on a laptop.

SHA-1 (1995)

  • Output: 160 bits (40 hex characters)
  • Status: ❌ Broken (collision attack published in 2005; first practical collision in 2017, the SHAttered attack by CWI Amsterdam and Google)
  • Speed: Fast
  • Still used in: Git commit identifiers (legacy reasons)
  • Do not use for: Security applications

Google and CWI Amsterdam demonstrated the first practical SHA-1 collision in February 2017 by creating two different PDF files with identical SHA-1 hashes. Major browsers had already started rejecting SHA-1 TLS certificates shortly before the announcement.

SHA-256 (2001)

  • Output: 256 bits (64 hex characters)
  • Status: ✅ Secure — no known practical attacks
  • Speed: Moderate (optimized on modern CPUs)
  • Used for: SSL/TLS certificates, Bitcoin blockchain, code signing, file integrity, digital signatures
  • Recommended for: Most general-purpose hashing needs

SHA-256 is part of the SHA-2 family, designed by the NSA and published by NIST. Despite its origin, it has been extensively analyzed by the global cryptographic community, and no practical attack on the full algorithm has been found.

SHA-512 (2001)

  • Output: 512 bits (128 hex characters)
  • Status: ✅ Secure — no known practical attacks
  • Speed: Often faster than SHA-256 on 64-bit processors without SHA-256 hardware acceleration
  • Used for: High-security applications, some password hashing schemes
  • Choose over SHA-256 when: You need a wider security margin or are on 64-bit hardware

SHA-512 works on 64-bit words and 1024-bit blocks, so on 64-bit CPUs it often gets through long inputs faster than SHA-256. That advantage usually disappears on CPUs with dedicated SHA-256 instructions. The larger output provides a higher security margin against theoretical future attacks.
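The output sizes listed for the four algorithms can be checked directly. This is a small sketch using Python's standard hashlib module; the describe helper is a name invented for this example, and it assumes your Python build still exposes MD5 and SHA-1 (some hardened builds disable them).

```python
import hashlib

def describe(name: str) -> tuple:
    """Return (output bits, hex characters, block size in bits) for an algorithm."""
    h = hashlib.new(name, b"hello")
    return h.digest_size * 8, len(h.hexdigest()), h.block_size * 8

for name in ("md5", "sha1", "sha256", "sha512"):
    bits, hex_chars, block_bits = describe(name)
    print(f"{name:7} {bits:4} bits  {hex_chars:4} hex chars  {block_bits:5}-bit blocks")
```

Each hex character encodes 4 bits, which is why the hex length is always the bit length divided by four.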

How Hashing Works (Step by Step)

Here's a simplified view of what happens inside a hash function like SHA-256:

1. Padding

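The input message is padded so that its length is a multiple of the block size (512 bits for SHA-256). The padding consists of a single 1 bit, then as many 0 bits as needed, then the original message length as a 64-bit integer. Including the length means two messages of different lengths can never end up with the same padded blocks.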
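As a rough illustration of the padding step, the sketch below pads a byte string the way SHA-256 does. The function name sha256_pad is invented for this example, and it only handles whole-byte messages, which is what almost every real use involves.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message to a multiple of 64 bytes (512 bits), SHA-256 style."""
    bit_length = len(message) * 8
    # 0x80 is a single 1 bit followed by seven 0 bits.
    padded = message + b"\x80"
    # Add zero bytes until the length is 8 bytes short of a full block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Finish with the original length in bits as a 64-bit big-endian integer.
    padded += bit_length.to_bytes(8, "big")
    return padded

padded = sha256_pad(b"abc")
print(len(padded), "bytes:", padded.hex())
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second block because the 1 bit and the length field no longer fit.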

2. Block Processing

The padded message is split into 512-bit blocks. Each block goes through 64 rounds of operations involving:

  • Bitwise rotations and shifts — rearrange bits within 32-bit words
  • Logical functions (AND, OR, XOR, NOT) — combine bits from multiple words
  • Modular addition — add values, wrapping around at 2³²
  • Constants — derived from the fractional parts of cube roots of prime numbers
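The building blocks in this list are small enough to sketch directly. The code below derives the 64 round constants from the cube roots of the first 64 primes using exact integer arithmetic, and shows a 32-bit rotation and modular addition. All function names are invented for this example; it illustrates the pieces, not a full SHA-256 round.

```python
def first_primes(count: int) -> list:
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def integer_cube_root(n: int) -> int:
    """Largest integer x with x**3 <= n, found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# First 32 bits of the fractional part of each prime's cube root.
# Scaling by 2**96 before taking the root avoids floating-point rounding.
K = [integer_cube_root(p << 96) & 0xFFFFFFFF for p in first_primes(64)]

def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def add32(a: int, b: int) -> int:
    """Addition that wraps around at 2**32."""
    return (a + b) & 0xFFFFFFFF

print([hex(k) for k in K[:4]])
```

Deriving the constants from a public formula like this is a way of showing that nobody picked them to hide a weakness.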

3. Chaining

Each block's processing starts with the result of the previous block. The first block starts with fixed initial values. Because of this chaining, a change in one block carries forward into every later block. The mixing rounds inside each block are what spread a one-bit change across the whole internal state, producing the avalanche effect.
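Chaining is also why a message can be hashed piece by piece: the running state after one block is all the function needs before it reads the next. A minimal sketch with Python's standard hashlib module, using a made-up two-block message:

```python
import hashlib

data = b"a" * 64 + b"b" * 64          # exactly two 512-bit blocks

one_shot = hashlib.sha256(data).hexdigest()

streamed_hash = hashlib.sha256()
streamed_hash.update(data[:64])        # process the first block
streamed_hash.update(data[64:])        # continue from the carried-over state
streamed = streamed_hash.hexdigest()

# Changing only the first block still changes the final result.
altered = hashlib.sha256(b"c" + data[1:]).hexdigest()

print(one_shot == streamed, one_shot == altered)
```

The same pattern lets you hash files far larger than memory by feeding them in chunks.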

4. Output

After all blocks are processed, the internal state (eight 32-bit words for SHA-256) is concatenated to produce the final 256-bit hash, typically displayed as 64 hexadecimal characters.
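The relationship between the eight state words and the familiar hex string can be seen by unpacking a digest. This sketch uses Python's standard hashlib and struct modules; it inspects the finished output rather than the internal state during processing.

```python
import hashlib
import struct

digest = hashlib.sha256(b"hello").digest()          # 32 raw bytes
words = struct.unpack(">8I", digest)                # eight big-endian 32-bit words
hex_string = "".join(f"{w:08x}" for w in words)     # 8 hex characters per word

print(len(digest), "bytes,", len(words), "words")
print(hex_string)
```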

Security Properties

A secure hash function must have three properties:

Pre-image Resistance

Given a hash h, it should be computationally infeasible to find any input m such that hash(m) = h. This is what makes hashing "one-way."

Second Pre-image Resistance

Given an input m1, it should be infeasible to find a different input m2 such that hash(m1) = hash(m2). This ensures you can't forge a different file that matches a known hash.

Collision Resistance

It should be infeasible to find any two different inputs m1 and m2 such that hash(m1) = hash(m2). MD5 and SHA-1 fail this property: practical collisions have been produced for both. No collision is known for SHA-256 or SHA-512.
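Collision resistance depends heavily on output length. By the birthday bound, a generic search needs on the order of 2^(n/2) attempts for an n-bit hash, which is why a 256-bit output is considered comfortable and a short one is not. The sketch below makes that visible by deliberately truncating SHA-256 to 24 bits and searching for a collision; the function names and the counter-based messages are choices made for this example.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, n_bytes: int = 3) -> bytes:
    """Keep only the first n_bytes of a SHA-256 digest (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 3):
    """Hash '0', '1', '2', ... until two messages share a truncated hash."""
    seen = {}
    for i in count():
        message = str(i).encode("ascii")
        tag = truncated_hash(message, n_bytes)
        if tag in seen:
            return seen[tag], message, i + 1
        seen[tag] = message

first, second, attempts = find_collision()
print(first, second, "collide after", attempts, "attempts")
```

With 24 bits the search usually finishes after a few thousand attempts, far fewer than the 2^24 possible outputs. The full 256-bit digests of the two messages still differ.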

Practical Uses

File Integrity Verification

When you download software, the publisher often provides a SHA-256 checksum. After downloading, you compute the hash of your downloaded file and compare:

Expected:  e3b0c44298fc1c14...
Your file: e3b0c44298fc1c14...  ✓ Match — file is intact

If even one byte was corrupted or tampered with during download, the hashes will not match.
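A check like this takes only a few lines. The sketch below reads the file in chunks so large downloads do not need to fit in memory; the function names and the 1 MiB chunk size are choices made for this example.

```python
import hashlib
import hmac

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_download(path: str, expected_hex: str) -> bool:
    """Compare a file's hash with the publisher's checksum."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_file(path), expected)
```

A call would look like verify_download("installer.bin", published_checksum), where both the filename and the variable are placeholders. The check is only as trustworthy as the place the expected checksum came from.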

Password Storage

Websites should never store your password in plain text. Instead, they hash it and store the hash. When you log in, the system hashes your entered password and compares the result to the stored hash.

Important: General-purpose hash functions like SHA-256 are too fast for passwords. Attackers can try billions of guesses per second. Use specialized password hashing algorithms (bcrypt, scrypt, Argon2) that are intentionally slow and add random "salt" to each password.
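Here is a minimal sketch of salted password hashing using scrypt from Python's standard hashlib module (available when Python is built against a recent OpenSSL). The cost parameters and function names are assumptions chosen for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Illustrative cost settings (an assumption); tune them for your own hardware.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)

def hash_password(password: str) -> tuple:
    """Return (salt, derived_key) for storage. A fresh random salt is used each time."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key

def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected_key)

salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))
```

Because each user gets a different salt, two users with the same password end up with different stored values.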

Digital Signatures

To sign a document:

  1. Hash the document (producing a small, fixed-size digest)
  2. Sign the digest with the signer's private key
  3. Attach the result as the "signature"

To verify:

  1. Hash the document independently
  2. Check the signature against that digest using the signer's public key
  3. If the check passes, the document hasn't been altered since it was signed

Hashing the document first is essential because asymmetric operations are slow and only accept small inputs. Hashing reduces a large document to a small digest that can be signed quickly. Signing is often described as "encrypting the hash with the private key." That picture loosely fits RSA, but it does not describe schemes such as ECDSA.
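The hash-then-sign flow can be sketched with textbook RSA and nothing but the standard library. Everything about the key here is an assumption made for illustration: the modulus is built from two small, publicly known Mersenne primes, there is no padding, and the digest is simply reduced modulo N. It is a toy to show the flow, not something to deploy; real systems use a vetted library with a padded scheme.

```python
import hashlib

# TOY KEY, NOT SECURE: two well-known Mersenne primes, about 150 bits combined.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                                  # public modulus
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (needs Python 3.8+)

def digest_as_int(document: bytes) -> int:
    """Hash the document and squeeze the digest into the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N

def sign(document: bytes) -> int:
    """Apply the private key to the document's digest."""
    return pow(digest_as_int(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Undo the signature with the public key and compare with a fresh digest."""
    return pow(signature, E, N) == digest_as_int(document)

signature = sign(b"Pay Alice 10 coins")
print(verify(b"Pay Alice 10 coins", signature))   # original document
print(verify(b"Pay Alice 99 coins", signature))   # altered document
```

However large the document is, only its 32-byte digest ever goes through the slow public-key arithmetic.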

Blockchain

In Bitcoin and similar systems:

  • Each transaction is hashed to create a transaction ID
  • Each block contains the hash of the previous block
  • Miners repeatedly hash block data (with a changing "nonce") until the hash meets a difficulty target

This design means changing any transaction in any past block would change that block's hash, which would change the next block's hash, and so on — making tampering immediately detectable.
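The linking and mining ideas fit in a short toy model. The block layout, the "|" separator, and the three-zero difficulty prefix below are all inventions for this sketch; Bitcoin's real block format and difficulty rules are different and considerably more involved.

```python
import hashlib

GENESIS_PREV = "0" * 64   # placeholder "previous hash" for the first block

def block_hash(prev_hash: str, data: str, nonce: int) -> str:
    """Hash a block's contents together with the previous block's hash."""
    return hashlib.sha256(f"{prev_hash}|{data}|{nonce}".encode("utf-8")).hexdigest()

def mine(prev_hash: str, data: str, prefix: str = "000"):
    """Try nonces until the block hash starts with the required prefix."""
    nonce = 0
    while True:
        h = block_hash(prev_hash, data, nonce)
        if h.startswith(prefix):
            return nonce, h
        nonce += 1

def build_chain(entries, prefix: str = "000") -> list:
    chain, prev = [], GENESIS_PREV
    for data in entries:
        nonce, h = mine(prev, data, prefix)
        chain.append({"prev": prev, "data": data, "nonce": nonce, "hash": h})
        prev = h
    return chain

def is_valid(chain, prefix: str = "000") -> bool:
    """Recompute every hash and check each block points at the one before it."""
    prev = GENESIS_PREV
    for block in chain:
        h = block_hash(block["prev"], block["data"], block["nonce"])
        if block["prev"] != prev or h != block["hash"] or not h.startswith(prefix):
            return False
        prev = block["hash"]
    return True

chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(is_valid(chain))
chain[0]["data"] = "A pays B 500"      # tamper with an old block
print(is_valid(chain))
```

To hide the tampering, an attacker would have to re-mine the altered block and every block after it, which is what the mining cost is meant to make impractical.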

Git Version Control

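Git uses SHA-1 (with a transition to SHA-256 under way) to identify every object:

  • Commits are identified by hashing the root tree hash, the parent commit hash(es), the author and committer with their timestamps, and the commit message
  • Files (blobs) are identified by hashing their contents, prefixed with a short type-and-size header
  • Trees are identified by hashing the list of entry names, file modes, and the hashes of the blobs and subtrees they point to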

This content-addressing means identical files are automatically deduplicated, and any change to the repository history changes all downstream hashes.
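You can compute a blob ID by hand to see content-addressing at work. Git hashes the bytes "blob", a space, the content length in decimal, a zero byte, and then the content itself. The function name below is made up for this sketch, and it assumes a repository using the default SHA-1 object format.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a file's contents."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b"hello world\n"))
```

The result should match what `git hash-object` prints for a file with the same contents. The filename never enters the hash, which is why two identical files anywhere in a repository share one stored object.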

Common Pitfalls

Confusing Hashing with Encryption

Hashing is one-way: input → hash. There is no key, and you cannot get the input back.

Encryption is two-way: input + key → ciphertext → input. You can decrypt with the correct key.

They serve different purposes. Use hashing to verify data. Use encryption to protect data you need to retrieve later.

Using Fast Hashes for Passwords

A modern GPU can compute billions of SHA-256 hashes per second. That's excellent for file checksums but terrible for passwords: an attacker can try an enormous number of guesses very quickly.

Password-specific algorithms like bcrypt deliberately slow down hashing (typically 100-1000ms per hash) and add random salt to each password, making brute-force and precomputed attacks impractical.

Trusting MD5 for Security

MD5 collisions are trivial to produce. In 2008, researchers used an MD5 collision to create a rogue certificate authority (CA) certificate that all major browsers trusted. If you're using MD5 for anything security-related, migrate to SHA-256.

Ignoring Encoding

Hash functions operate on bytes, not characters, so the hash of a string depends on how it's encoded. The text "hello" encoded as UTF-8 produces a different hash than the same text encoded as UTF-16. When comparing hashes across systems, ensure both sides use the same encoding (UTF-8 is the usual choice).
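A quick way to see the encoding pitfall, sketched with Python's standard hashlib module. The choice of "utf-16-le" (UTF-16 without a byte-order mark) is just one of several UTF-16 variants, picked for this example.

```python
import hashlib

text = "hello"

utf8_bytes = text.encode("utf-8")          # 5 bytes
utf16_bytes = text.encode("utf-16-le")     # 10 bytes: each letter takes two

utf8_hash = hashlib.sha256(utf8_bytes).hexdigest()
utf16_hash = hashlib.sha256(utf16_bytes).hexdigest()

print(utf8_hash[:16] + "...", "(UTF-8)")
print(utf16_hash[:16] + "...", "(UTF-16)")
print("same hash?", utf8_hash == utf16_hash)
```

Python makes the choice explicit because hashlib only accepts bytes. Languages that hash strings directly pick an encoding for you, and that is where mismatches tend to come from.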

When to Use What

Scenario                        Recommended algorithm
File download verification      SHA-256
Password storage                bcrypt, scrypt, or Argon2 (not plain SHA-256)
Digital signatures              SHA-256 or SHA-512
Blockchain / cryptocurrency     SHA-256
Non-security checksums          MD5 is acceptable; SHA-256 is better
Cache keys / deduplication      MD5 or SHA-256
HMAC (message authentication)   SHA-256
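For the HMAC row, here is a minimal sketch using Python's standard hmac module. HMAC mixes a secret key into the hash so that only someone holding the key can produce a valid tag. The function names, key, and message are placeholders invented for this example.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for a message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def check_tag(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag_hex)

key = b"shared-secret-key"                  # placeholder; use a random key in practice
tag = make_tag(key, b"transfer 10 to alice")

print(check_tag(key, b"transfer 10 to alice", tag))
print(check_tag(key, b"transfer 99 to alice", tag))
```

A plain hash lets anyone recompute a matching value after altering the message. The key is what turns an integrity check into an authenticity check.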

Try It Yourself

Use the hash generator to hash any text and see MD5, SHA-1, SHA-256, and SHA-512 output side by side. Try changing a single character and watch how every hash changes completely.

For password hashing specifically, use the bcrypt generator which includes salting and configurable cost factors.
