Encryption Explained: How Your Data Stays Private

In June 2012, LinkedIn suffered a breach that exposed 6.5 million password hashes, or so it seemed at the time. Four years later, in 2016, it emerged that the breach had actually included 117 million passwords, all hashed with SHA-1 and no salt. Within 72 hours of the full dataset appearing on underground forums, crackers had recovered approximately 90% of the passwords using GPU-accelerated dictionary attacks.

In January 2019, the collection known as "Collection #1" appeared: 773 million unique email addresses and 21 million unique plaintext passwords. Many of those plaintext passwords came from services that had used MD5 hashing with no salt, or in some cases had stored passwords in plaintext entirely.

In 2016, Dropbox disclosed that a 2012 breach had exposed 68 million password hashes. Unlike LinkedIn, Dropbox had been moving to bcrypt, a deliberately slow, properly salted hashing algorithm, and roughly half of the exposed hashes used it (the remainder were salted SHA-1). For the bcrypt hashes, cracking progress was measured in weeks rather than hours, and a substantial portion remained uncracked due to the computational cost.

These three incidents in the same approximate time window demonstrate that the difference between correct and incorrect implementation of one cryptographic primitive — password hashing — determines whether a breach exposes millions of users' passwords or merely hashes that resist cracking. The algorithm matters. The implementation matters more. Understanding both is the prerequisite for building or evaluating any secure system.

What Encryption Is (And What It Is Not)

Encryption is the process of transforming readable data (plaintext) into an unreadable form (ciphertext) using a mathematical algorithm and a key. The key reverses the process. Without the key, the ciphertext is computationally intractable to reverse.

Plaintext:  "Transfer $50,000 from account 9876543 to account 1234567"
    +
Algorithm (AES-256-GCM)
    +
Key (256-bit random value: a3f8c2b1e9d74056...etc)
    =
Ciphertext: "7f3a9c2b14e8d05f6a1c4b9e3d2f8a0c8b4e9f2c1a7d3e6..."

Ciphertext + Same Key + Same Algorithm = Original Plaintext
Ciphertext + Wrong Key + Same Algorithm = Garbage

Kerckhoffs's principle: The security of a well-designed encryption system rests entirely on the secrecy of the key, not the secrecy of the algorithm. AES, RSA, and ChaCha20 are publicly documented in detail — any cryptographer can read the exact algorithm. The security comes from the key, which has 2^256 possible values for AES-256. An exhaustive search of all possible keys would require more energy than exists in the observable universe.
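
To give a sense of scale for that key space, the following back-of-the-envelope sketch estimates how long an exhaustive search would take. The guess rate of 10^18 keys per second is an assumed, deliberately generous figure, not a measurement of any real attacker.

```python
# Rough brute-force estimate for a 256-bit key space.
# ASSUMPTION: the attacker tests 10**18 keys per second (hypothetical, very generous).
KEY_BITS = 256
GUESSES_PER_SECOND = 10**18
SECONDS_PER_YEAR = 365 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13_800_000_000  # approximate

keyspace = 2**KEY_BITS

# On average the right key turns up after searching half the space.
expected_years = (keyspace // 2) // GUESSES_PER_SECOND // SECONDS_PER_YEAR

print(f"Possible keys: {keyspace:.3e}")
print(f"Expected search time: {expected_years:.3e} years")
print(f"Ages of the universe: {expected_years / AGE_OF_UNIVERSE_YEARS:.3e}")
```

Even with the assumed rate raised by many orders of magnitude, the result stays far beyond any practical time frame, which is why attacks target key handling rather than the key space.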

This principle matters in practice: "security through obscurity" — hiding how the algorithm works — fails the moment the algorithm leaks, which it always eventually does. Systems designed around secret algorithms have been systematically broken throughout history. Systems designed around public algorithms with secret keys have not.

What encryption is not:

Compression. Compression reduces size by finding patterns. Encryption hides patterns. Compressed data does not hide the structure of the original — encrypted data does.
Hashing. Hashing is one-way. You can verify a hash but cannot recover the original. Encryption is reversible with the key. This distinction has directly caused breaches when developers used hashing where encryption was needed or vice versa.
Authentication. Encryption provides confidentiality. It does not prove who you are talking to. Authentication requires certificates, signatures, or key agreement protocols. A message can be encrypted but come from an impersonator — encryption does not help you here.
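
The first two distinctions can be checked with the standard library alone. In this sketch, `os.urandom` output stands in for ciphertext (an assumption made for illustration, since good ciphertext is indistinguishable from random bytes), and the sample plaintext is made up.

```python
import hashlib
import os
import zlib

structured = b"ACCOUNT=1234567;" * 256      # highly repetitive plaintext (4096 bytes)
random_like = os.urandom(len(structured))   # stand-in for ciphertext

# Compression exploits patterns, so repetitive data shrinks dramatically...
compressed_structured = zlib.compress(structured)
# ...while pattern-free data does not shrink at all.
compressed_random = zlib.compress(random_like)

# Compression is reversible by anyone; no key is involved.
assert zlib.decompress(compressed_structured) == structured

# Hashing is one-way: the digest has a fixed size regardless of input length.
digest = hashlib.sha256(structured).digest()

print(len(structured), len(compressed_structured), len(compressed_random), len(digest))
```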

Symmetric Encryption: One Key, Both Operations

Symmetric encryption uses the same key for both encryption and decryption. The sender and recipient both need this key, and it must remain secret.

Key: K
Encrypt: Plaintext + K → Ciphertext
Decrypt: Ciphertext + K → Plaintext

If the key is compromised, all messages encrypted with it are readable.

Symmetric encryption is extremely fast — modern CPUs execute AES-256-GCM at multi-gigabit speeds using hardware acceleration (the AES-NI instruction set present in virtually every x86 CPU since 2010). This makes it appropriate for bulk data encryption: files, disk contents, network streams, database fields.

AES: How a Block Cipher Works

AES (Advanced Encryption Standard) was standardized by NIST in 2001 after an open international competition. It operates on 128-bit blocks of data and supports 128, 192, or 256-bit key sizes. AES-256 is the current standard choice for high-security applications.

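The algorithm applies a series of mathematical operations in rounds (10 rounds for AES-128, 12 for AES-192, 14 for AES-256). An initial AddRoundKey step precedes the first round, and each round then applies the following four steps (the final round omits MixColumns):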

SubBytes: Each byte of the data block is replaced with a corresponding byte from a fixed substitution table (the S-box). This adds non-linearity — the mathematical property that prevents the cipher from being broken with linear algebra.

ShiftRows: Rows of the 4×4 byte matrix are cyclically shifted. Row 0 is not shifted. Row 1 shifts left by 1. Row 2 shifts left by 2. Row 3 shifts left by 3. This spreads bytes across different columns.

MixColumns: Each column of four bytes is treated as a polynomial over a Galois field and multiplied by a fixed matrix. This provides diffusion — one byte of input affects all four bytes in the output column.

AddRoundKey: Each byte of the block is XOR'd with the corresponding byte of the current round key (derived from the main key via the key schedule). This is where the key actually enters the computation.
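
Three of these steps are small enough to sketch in pure Python. This is an illustration of the individual transformations only, not a usable AES implementation: SubBytes and the key schedule are omitted, and the state is represented as a list of four rows for readability. The function names are chosen here for illustration.

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def xtime(b):
    """Multiply a byte by 2 in GF(2^8), reducing by the AES polynomial 0x11B."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b & 0xFF

def mul3(b):
    """Multiply a byte by 3 in GF(2^8): (2 * b) XOR b."""
    return xtime(b) ^ b

def mix_column(col):
    """Multiply one 4-byte column by the fixed MixColumns matrix."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]

def add_round_key(state, round_key):
    """XOR every state byte with the matching round-key byte."""
    return [[s ^ k for s, k in zip(srow, krow)]
            for srow, krow in zip(state, round_key)]

state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
shifted = shift_rows(state)
mixed = mix_column([0xDB, 0x13, 0x53, 0x45])  # widely used MixColumns test column

print(shifted)
print([hex(b) for b in mixed])
```

Because AddRoundKey is a plain XOR, applying it twice with the same round key returns the original state, which is part of why decryption can reuse the same key schedule in reverse.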

After all rounds, the result is the ciphertext. A single bit change in the key or plaintext changes, on average, about half the bits of the ciphertext, in a pattern that looks random. This is called the avalanche effect.
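
The avalanche effect is easy to observe. Python's standard library does not include AES, so this sketch uses SHA-256 as a stand-in (an assumption for illustration; it is a hash, not a cipher, but it is designed to show the same property). The sample message is made up.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"Transfer $50,000 from account 9876543"
flipped = bytearray(msg1)
flipped[0] ^= 0x01            # flip exactly one input bit
msg2 = bytes(flipped)

d1 = hashlib.sha256(msg1).digest()
d2 = hashlib.sha256(msg2).digest()
changed = bit_difference(d1, d2)

print(f"Input bits changed: {bit_difference(msg1, msg2)}")
print(f"Output bits changed: {changed} of {len(d1) * 8}")
```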

AES Modes: The Critical Choice After the Cipher

AES is a block cipher — it encrypts exactly 128 bits at a time. Real data is longer. The mode of operation determines how multiple blocks are processed together. The mode matters as much as the cipher.

ECB (Electronic Codebook) — Never Use:

ECB encrypts each 128-bit block independently with the same key. The critical flaw: identical plaintext blocks produce identical ciphertext blocks. The structure of the original data leaks into the ciphertext.

# ECB's fatal flaw visualized
# Image encrypted with ECB still shows the original structure
# because repeated pixel patterns produce repeated ciphertext blocks
 
from Crypto.Cipher import AES
from PIL import Image
 
def encrypt_ecb_demo(image_path):
    key = b'this-is-16-bytes'  # 128-bit demo key (never hard-code a real key)
    cipher = AES.new(key, AES.MODE_ECB)
    img = Image.open(image_path).convert('RGB')
    data = img.tobytes()
    # Pad to AES block size
    padded = data + b'\x00' * (-len(data) % 16)
    # Encrypt each block independently: identical pixel patterns repeat
    encrypted = cipher.encrypt(padded)
    # Reinterpret the ciphertext as pixels; the image still reveals the original shapes
    return Image.frombytes('RGB', img.size, encrypted[:len(data)])

The famous demonstration: encrypt the Linux penguin mascot Tux with ECB mode. The outline and shape are completely recognizable in the "encrypted" image.

CBC (Cipher Block Chaining) — Acceptable but Superseded:

CBC XORs each plaintext block with the previous ciphertext block before encrypting. This breaks the pattern problem. Requires an Initialization Vector (IV) — a random value for the first block.

from Crypto.Cipher import AES
import os
 
key = os.urandom(32)  # 256-bit key
iv = os.urandom(16)   # Random IV (must be unique per message)
 
cipher = AES.new(key, AES.MODE_CBC, iv)
# Pad plaintext to block boundary
plaintext = b"Secret message  "  # Must be multiple of 16 bytes
ciphertext = cipher.encrypt(plaintext)
 
# Decrypt
cipher_dec = AES.new(key, AES.MODE_CBC, iv)
plaintext_recovered = cipher_dec.decrypt(ciphertext)
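
The example above sidesteps padding by hand-picking a 16-byte message. Real messages rarely fit, so CBC is normally paired with a padding scheme such as PKCS#7. The following is a minimal sketch of that scheme; the function names are chosen here for illustration, and in practice a library helper (for example `pad` and `unpad` in PyCryptodome's `Crypto.Util.Padding`) is the safer choice.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)  # always 1..block_size
    return data + bytes([pad_len]) * pad_len

def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Validate and strip PKCS#7 padding."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]

padded = pkcs7_pad(b"Secret message")   # 14 bytes -> 16 bytes
print(padded)
print(pkcs7_unpad(padded))
```

A message that is already block-aligned still gains a full block of padding, which is what makes unpadding unambiguous. How a system reports unpadding failures matters as well, since distinguishable padding errors are the basis of padding oracle attacks.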

CBC vulnerabilities: BEAST (Browser Exploit Against SSL/TLS), POODLE, and padding oracle attacks have all exploited CBC in TLS implementations. CBC is acceptable for file encryption but deprecated for network protocols.

GCM (Galois/Counter Mode) — The Current Standard:

GCM combines CTR mode (which turns AES into a stream cipher, allowing arbitrary-length data encryption) with Galois field authentication. The result is Authenticated Encryption with Associated Data (AEAD) — it simultaneously provides:

Confidentiality: the ciphertext cannot be decrypted without the key
Integrity: any modification to the ciphertext is detected — decryption fails with an authentication tag mismatch

from Crypto.Cipher import AES
import os
 
key = os.urandom(32)  # 256-bit key
nonce = os.urandom(12)  # 96-bit nonce (MUST be unique per message with same key)
# If you reuse a nonce with the same key, AES-GCM's security breaks completely
 
plaintext = b"Sensitive financial transaction data"
associated_data = b"transaction-id-12345"  # Authenticated but not encrypted
 
# Encrypt and authenticate
cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
cipher.update(associated_data)  # Authenticate but don't encrypt this
ciphertext, auth_tag = cipher.encrypt_and_digest(plaintext)
 
# Stored/transmitted: nonce + ciphertext + auth_tag
# (The nonce does not need to be secret, but must be unique)
 
# Decrypt and verify
cipher_dec = AES.new(key, AES.MODE_GCM, nonce=nonce)
cipher_dec.update(associated_data)
try:
    plaintext_recovered = cipher_dec.decrypt_and_verify(ciphertext, auth_tag)
    print("Authentic and decrypted:", plaintext_recovered)
except ValueError:
    print("AUTHENTICATION FAILED — data was tampered with")

GCM is the correct default for all new symmetric encryption. If you are using a cryptography library and see options for mode, choose GCM. If the library handles the mode internally (like libsodium's crypto_secretbox), trust its defaults.

| Mode | Pattern Hiding | Authentication | Parallelizable | Use Case |
|---|---|---|---|---|
| ECB | No | No | Yes | Never. Not ever. |
| CBC | Yes | No | Decrypt only | Legacy file encryption |
| CTR | Yes | No | Yes | Streaming, but no authentication |
| GCM | Yes | Yes | Yes | Everything: the standard choice |
| ChaCha20-Poly1305 | Yes | Yes | Yes | Where AES-NI is unavailable (mobile) |

The Nonce Reuse Catastrophe

AES-GCM requires a unique nonce (Number used ONCE) for every message encrypted with the same key. If you reuse a nonce with the same key, catastrophe:

# CATASTROPHIC NONCE REUSE
import os
from Crypto.Cipher import AES
 
key = os.urandom(32)
nonce = b'\x00' * 12  # Same nonce used twice with same key
 
cipher1 = AES.new(key, AES.MODE_GCM, nonce=nonce)
ct1, _ = cipher1.encrypt_and_digest(b"message one here!")
 
cipher2 = AES.new(key, AES.MODE_GCM, nonce=nonce)
ct2, _ = cipher2.encrypt_and_digest(b"different message")
 
# An attacker who has both ciphertexts can XOR them:
# ct1 XOR ct2 = plaintext1 XOR plaintext2
# With known-plaintext or crib-dragging, both messages are recoverable
 
# The Sony PlayStation 3 reused the same nonce for every ECDSA signature
# This allowed recovery of the private key from just two signatures
# A different algorithm, but the same class of mistake
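
The XOR relationship can be reproduced without any AES library. The sketch below uses a hypothetical toy keystream built from SHA-256 in counter style (an assumption for illustration only, not how GCM derives its keystream) to show why reusing a key and nonce pair leaks the XOR of the plaintexts.

```python
import hashlib

def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Hypothetical counter-style keystream from SHA-256. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"k" * 32
nonce = b"\x00" * 12           # reused for both messages
p1 = b"message one here!"
p2 = b"different message"

c1 = xor_bytes(p1, toy_keystream(key, nonce, len(p1)))
c2 = xor_bytes(p2, toy_keystream(key, nonce, len(p2)))

# The identical keystream cancels out, leaving only the two plaintexts.
leaked = xor_bytes(c1, c2)

# An attacker who knows (or guesses) p1 recovers p2 without ever touching the key.
recovered_p2 = xor_bytes(leaked, p1)
print(recovered_p2)
```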

The Sony PS3 hack in 2010 by the fail0verflow team exploited the same class of mistake in a different algorithm. Sony's ECDSA signature implementation used the same fixed value for the supposedly random nonce k in every signature. Two signatures made with the same k are sufficient to algebraically recover the private key. The entire security of the PS3's DRM was defeated by this single implementation mistake.
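
The algebra behind that key recovery is short. As a sketch using the standard ECDSA signing equation, let z_1 and z_2 be the two message hashes, (r, s_1) and (r, s_2) the two signatures, d the private key, and n the group order (symbols introduced here for illustration). Both signatures share the same r because r depends only on k.

```latex
\begin{aligned}
s_1 &\equiv k^{-1}(z_1 + r d) \pmod{n}, \qquad
s_2 \equiv k^{-1}(z_2 + r d) \pmod{n} \\
s_1 - s_2 &\equiv k^{-1}(z_1 - z_2) \pmod{n} \\
k &\equiv (z_1 - z_2)\,(s_1 - s_2)^{-1} \pmod{n} \\
d &\equiv (s_1 k - z_1)\, r^{-1} \pmod{n}
\end{aligned}
```

Every quantity on the right-hand side is public once two signatures are available, so the private key d follows directly.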

The lesson: nonce generation must use a cryptographically secure random number generator, and nonce uniqueness must be architecturally guaranteed — not assumed.

Asymmetric Encryption: The Key Distribution Solution

Symmetric encryption has one fundamental problem: how do two parties who have never communicated establish a shared key over a public channel without an eavesdropper capturing it?

If Alice wants to send Bob an encrypted message but they have never met, she cannot send him the key — an eavesdropper would capture it. She cannot encrypt the key without already having a key. This is the key distribution problem.

Asymmetric encryption solves it by using mathematically linked key pairs:

A public key that can be freely distributed to anyone
A private key that must be kept secret

Bob has:    bob_public_key (published on keyserver, email signature, website)
            bob_private_key (never shared, never transmitted)

Alice wants to send Bob an encrypted message:
1. Alice retrieves bob_public_key (from anywhere — no secrecy needed)
2. Alice encrypts her message with bob_public_key
3. Ciphertext is transmitted over untrusted network
4. Bob decrypts with bob_private_key

Only Bob can decrypt — because only Bob has bob_private_key.
The eavesdropper captured bob_public_key and the ciphertext — useless without bob_private_key.

Asymmetric encryption also enables digital signatures (the reverse operation):

Bob signs a message:
1. Bob signs with bob_private_key
2. Signature sent alongside message

Anyone verifies with bob_public_key:
- Valid signature → message came from whoever has bob_private_key
  AND the message has not been modified since signing
- Invalid signature → forgery or tampering

RSA: The Mathematics of Factoring

RSA (Rivest–Shamir–Adleman, 1977) is based on the integer factorization problem. Multiplying two large prime numbers is fast. Factoring the result back into its prime components is computationally infeasible at sufficient key sizes.

Key generation:

Choose two large random primes p and q (each ~1024 bits for RSA-2048)
Compute n = p × q (the modulus, part of both keys)
Compute Euler's totient: φ(n) = (p-1)(q-1)
Choose public exponent e = 65537 (standard choice — prime, has desirable properties)
Compute private exponent d such that e × d ≡ 1 (mod φ(n))

Public key: (n, e), the modulus and public exponent
Private key: (n, d), the modulus and private exponent

Encryption: ciphertext = plaintext^e mod n
Decryption: plaintext = ciphertext^d mod n
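
The key generation steps and the two formulas can be traced with deliberately tiny numbers. This is textbook RSA for illustration only: the primes 61 and 53, the exponent 17, and the message 65 are arbitrary small values chosen here, and real RSA requires large random primes plus a padding scheme such as OAEP.

```python
# Toy RSA with tiny primes. Illustration only, trivially breakable.
p, q = 61, 53
n = p * q                      # modulus: 3233
phi = (p - 1) * (q - 1)        # Euler's totient: 3120

e = 17                         # small public exponent coprime to phi (real keys use 65537)
d = pow(e, -1, phi)            # modular inverse of e mod phi (Python 3.8+)

message = 65                   # must be an integer smaller than n
ciphertext = pow(message, e, n)     # plaintext^e mod n
recovered = pow(ciphertext, d, n)   # ciphertext^d mod n

print(f"n={n}, phi={phi}, d={d}")
print(f"ciphertext={ciphertext}, recovered={recovered}")
```

With n this small, anyone can factor 3233 back into 61 and 53 and recompute d, which is exactly the attack that large primes make infeasible.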

The security: computing d from n and e requires knowing φ(n), which requires factoring n into p and q. For RSA-2048, this is computationally infeasible with classical computers. NIST rates RSA-2048 at roughly 112 bits of security, meaning the best known classical factoring algorithms need on the order of 2^112 operations, far beyond all computing power in existence.

# RSA encryption/decryption (Python, using cryptography library)
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes, serialization
 
# Generate RSA-2048 key pair
private_key = rsa.generate_private_key(
    public_exponent=65537,
    key_size=2048  # 4096 for long-term secrets, 2048 is minimum acceptable
)
public_key = private_key.public_key()
 
# Encrypt with public key (OAEP padding is required — raw RSA is insecure)
message = b"Encrypted with recipient's public key"
ciphertext = public_key.encrypt(
    message,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)
 
# Decrypt with private key
plaintext = private_key.decrypt(
    ciphertext,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)
assert plaintext == message
 
# Digital signature
from cryptography.hazmat.primitives.asymmetric import padding as sig_padding
signature = private_key.sign(
    message,
    sig_padding.PSS(
        mgf=sig_padding.MGF1(hashes.SHA256()),
        salt_length=sig_padding.PSS.MAX_LENGTH
    ),
    hashes.SHA256()
)
 
# Verify signature
try:
    public_key.verify(
        signature,
        message,
        sig_padding.PSS(
            mgf=sig_padding.MGF1(hashes.SHA256()),
            salt_length=sig_padding.PSS.MAX_LENGTH
        ),
        hashes.SHA256()
    )
    print("Signature valid")
except Exception:
    print("Signature invalid — message was tampered with or wrong key")

RSA key sizes and their security margins:

| Key Size | Status | Notes |
|---|---|---|
| RSA-512 | Broken | Factored in 1999; crackable by anyone today |
| RSA-768 | Broken | Factored in 2009; crackable with moderate resources |
| RSA-1024 | Weak | Not yet factored, but within range of well-resourced attackers |
| RSA-2048 | Current minimum | Safe through approximately 2030 by NIST estimates |
| RSA-3072 | Recommended | Comparable to 128-bit symmetric security |
| RSA-4096 | Future-proofing | For data that must remain secure beyond 2030 |

RSA's practical limitation: RSA operations are orders of magnitude slower than AES. You should never encrypt bulk data with RSA directly. Instead, RSA encrypts a randomly generated symmetric key, which then encrypts the actual data. This is the hybrid encryption approach that TLS uses.

Elliptic Curve Cryptography: Smaller, Faster, Equivalent Security

ECC provides equivalent cryptographic security to RSA at dramatically smaller key sizes, which means faster operations and smaller data overhead:

| ECC Key Size | RSA Equivalent | Use Case |
|---|---|---|
| 256-bit | RSA-3072 | Standard for TLS, code signing, JWT |
| 384-bit | RSA-7680 | High-security applications, government |
| 521-bit | RSA-15360 | Maximum security, rarely necessary |

The mathematics: instead of integer factorization, ECC's security rests on the discrete logarithm problem in an elliptic curve group. Given a point P on a curve and a point Q = k×P (P added to itself k times), finding k is computationally infeasible.
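
Point addition and the k×P operation can be sketched on a toy curve. The curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1) is a common textbook example chosen here for illustration; its group has only 19 points, so the discrete logarithm can be found by trying every k, which is precisely what 256-bit curves prevent.

```python
# Toy elliptic curve y^2 = x^3 + A*x + B over GF(P_MOD). Illustration only.
P_MOD = 17
A, B = 2, 2
G = (5, 1)            # base point; None represents the point at infinity

def point_add(p1, p2):
    """Add two curve points using the chord-and-tangent rule."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # P + (-P) = infinity
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k x point with double-and-add."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

secret_k = 13
Q = scalar_mult(secret_k, G)

# Brute-force discrete log: only feasible because the group is tiny.
found_k = next(k for k in range(1, 19) if scalar_mult(k, G) == Q)
print(Q, found_k)
```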

# ECDSA (Elliptic Curve Digital Signature Algorithm) — standard for TLS, SSH, Bitcoin
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives import hashes
 
# Generate ECDSA key pair using NIST P-256 curve (most common)
private_key = ec.generate_private_key(ec.SECP256R1())  # P-256
public_key = private_key.public_key()
 
# Sign
message = b"Message to sign"
signature = private_key.sign(message, ec.ECDSA(hashes.SHA256()))
 
# Verify
public_key.verify(signature, message, ec.ECDSA(hashes.SHA256()))
 
# ECDH key exchange (like Diffie-Hellman but on elliptic curve)
# Both parties generate ephemeral key pairs
# They exchange public keys and derive the same shared secret
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
 
alice_private = X25519PrivateKey.generate()
alice_public = alice_private.public_key()
 
bob_private = X25519PrivateKey.generate()
bob_public = bob_private.public_key()
 
# Each party derives the same shared secret from their private key + other's public key
alice_shared = alice_private.exchange(bob_public)
bob_shared = bob_private.exchange(alice_public)
 
assert alice_shared == bob_shared  # Same shared secret, never transmitted

X25519 (Curve25519 in Diffie-Hellman mode) is the recommended curve for key exchange in new systems. It has a simpler implementation than NIST curves, is faster, and has stronger security guarantees against implementation side-channels. WireGuard, Signal, TLS 1.3, and SSH all use X25519.
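
The idea behind the exchange is easier to see in classic finite-field Diffie-Hellman, which X25519 mirrors on an elliptic curve. The sketch below uses deliberately tiny, insecure numbers chosen here for illustration (modulus 23, generator 5, private values 6 and 15).

```python
# Toy finite-field Diffie-Hellman. Illustration only, trivially breakable.
p = 23          # public prime modulus
g = 5           # public generator

alice_private = 6
bob_private = 15

# Each side publishes g^private mod p.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side raises the other's public value to its own private value.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

print(alice_public, bob_public, alice_shared, bob_shared)
```

An eavesdropper sees p, g, and both public values but would need to solve a discrete logarithm to obtain either private value. That is easy with a modulus of 23 and infeasible at real parameter sizes.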

TLS: Symmetric and Asymmetric Working Together

TLS (Transport Layer Security) is the protocol that puts the S in HTTPS. It uses both symmetric and asymmetric cryptography together — asymmetric for the key exchange (authenticated key establishment), symmetric for bulk data encryption (fast).

TLS 1.3 Handshake: A Complete Walk-through

TLS 1.3 (standardized in RFC 8446, August 2018) removed the features of earlier TLS versions that had proven vulnerable and simplified the handshake. It is the version to prefer for new deployments, with TLS 1.2 kept only as a fallback for older clients.

Client                                Server
  |                                      |
  |-- ClientHello ---------------------->|
  |   (TLS 1.3, supported cipher suites, |
  |    Client's ephemeral key share)     |
  |                                      |
  |<-- ServerHello --------------------- |
  |    (selected cipher suite,           |
  |     Server's ephemeral key share)    |
  |                                      |
  |  [Both sides now compute the same    |
  |   shared secret from ephemeral keys] |
  |                                      |
  |<-- {Certificate} ------------------ |  (encrypted with handshake key)
  |    (Server's public key + cert chain)|
  |<-- {CertificateVerify} ------------ |  (server proves it has cert's private key)
  |<-- {Finished} --------------------- |  (handshake verification)
  |                                      |
  |-- {Finished} ---------------------->|
  |                                      |
  |  [Application data now encrypted    |
  |   with application traffic keys]    |
  |<====== HTTPS application data =====>|

Key properties:

Forward secrecy by default. TLS 1.3 mandates ephemeral Diffie-Hellman (ECDHE or DHE). Even if the server's certificate private key is compromised later, past sessions cannot be decrypted.
No RSA key exchange. Previous TLS versions allowed "RSA key exchange" where the client encrypted the session key with the server's public key. If the private key was ever compromised (even years later), all past sessions could be decrypted. TLS 1.3 eliminates this option entirely.
Removed insecure options. TLS 1.3 removed all the cipher suites that have been attacked over the years: RC4, DES, 3DES, null ciphers, export-grade ciphers, anonymous key exchange.
0-RTT resumption. Returning sessions can send data immediately without a full handshake, at the cost of replay attack vulnerability for the 0-RTT data (mitigated by idempotency requirements).
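
The handshake keys and application traffic keys in the walk-through above are not the raw Diffie-Hellman output. TLS 1.3 derives them from the shared secret with HKDF (RFC 5869). The sketch below implements HKDF with SHA-256 from the standard library. The label strings, the placeholder secret, and the transcript value are simplified assumptions and do not reproduce the exact TLS 1.3 key schedule.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: concentrate input keying material into a pseudorandom key."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch the pseudorandom key into `length` bytes bound to `info`."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Placeholders: a real handshake uses the ECDHE output and the hash of all handshake messages.
shared_secret = bytes(range(32))
transcript_hash = hashlib.sha256(b"ClientHello|ServerHello").digest()

prk = hkdf_extract(salt=b"\x00" * HASH_LEN, ikm=shared_secret)

# Hypothetical labels: separate keys per direction, all tied to this handshake's transcript.
client_key = hkdf_expand(prk, b"client traffic" + transcript_hash, 32)
server_key = hkdf_expand(prk, b"server traffic" + transcript_hash, 32)

print(client_key.hex())
print(server_key.hex())
```

Binding the derivation to a label and to the transcript means each direction gets an independent key, and any tampering with the handshake messages yields different keys on the two sides.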

Configuring TLS correctly for nginx:

# /etc/nginx/nginx.conf or server block
server {
    listen 443 ssl http2;
    server_name example.com;
 
    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
 
    # TLS versions: prefer 1.3, allow 1.2 as a fallback for older clients
    ssl_protocols TLSv1.2 TLSv1.3;
 
    # Cipher suites for TLS 1.2 (TLS 1.3 cipher suites are automatic)
    ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305';
    ssl_prefer_server_ciphers off;  # TLS 1.3 doesn't use this; off is correct
 
    # ECDH curve — X25519 is the preferred choice
    ssl_ecdh_curve X25519:prime256v1:secp384r1;
 
    # Session tickets (disable for perfect forward secrecy compliance)
    ssl_session_tickets off;
 
    # HSTS — force HTTPS for 1 year, include subdomains
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
 
    # OCSP stapling — server provides certificate validity proof, reduces handshake overhead
    ssl_stapling on;
    ssl_stapling_verify on;
    resolver 1.1.1.1 8.8.8.8 valid=300s;
 
    # Verify your configuration with SSL Labs
    # https://www.ssllabs.com/ssltest/ should show A+
}

Testing TLS configuration:

# Test TLS configuration with nmap
nmap --script ssl-enum-ciphers -p 443 example.com
 
# Check which TLS versions are supported
nmap --script ssl-enum-ciphers -p 443 example.com | grep -E "TLS|SSL"
# You should see only TLSv1.3 and TLSv1.2
# TLSv1.0, TLSv1.1, SSLv3, SSLv2 should be ABSENT
 
# Check for certificate validity and chain
openssl s_client -connect example.com:443 -tls1_3 </dev/null
# Shows: certificate chain, cipher suite, TLS version
 
# Verify perfect forward secrecy (ECDHE should be in cipher suite)
openssl s_client -connect example.com:443 </dev/null 2>&1 | grep "Cipher"
# Should show something like: Cipher is TLS_AES_256_GCM_SHA384 (TLS 1.3)
# or ECDHE-RSA-AES256-GCM-SHA384 (TLS 1.2)
 
# Test HSTS
curl -sI https://example.com | grep -i "strict-transport"
# Should show: Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
 
# Full automated scan
# testssl.sh is comprehensive and free
docker run --rm drwetter/testssl.sh example.com
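
The same policy can be enforced from the client side. The following sketch uses Python's standard `ssl` module to build a context that refuses anything older than TLS 1.2 and to report what a server actually negotiates. The function names are chosen here for illustration, and `example.com` is a placeholder host.

```python
import socket
import ssl

def make_strict_context() -> ssl.SSLContext:
    """Client context with certificate and hostname checks, TLS 1.2 as the floor."""
    ctx = ssl.create_default_context()            # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS 1.1 and older
    return ctx

def negotiated_tls(host: str, port: int = 443, timeout: float = 5.0):
    """Connect to host and return (protocol version, cipher suite name)."""
    ctx = make_strict_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]

if __name__ == "__main__":
    # Requires network access; the host is a placeholder.
    print(negotiated_tls("example.com"))
```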

Certificate Infrastructure: The Trust Problem

A TLS certificate binds a domain name to a public key and is signed by a Certificate Authority (CA). Your browser ships with approximately 150 trusted root CAs — organizations that have been trusted by operating system vendors to issue certificates for any domain.

The certificate chain:

Root CA (self-signed, in browser trust store)
    └── Intermediate CA (signed by Root CA)
            └── Server Certificate (example.com, signed by Intermediate CA)

When you visit example.com, your browser:

Receives the server's certificate (contains domain name, public key, issuing CA, validity dates)
Verifies the certificate is signed by a trusted intermediate CA
Verifies the intermediate CA is signed by a trusted root CA
Verifies the domain name matches the certificate's Subject Alternative Name (modern browsers no longer fall back to the Common Name)
Checks the certificate has not expired or been revoked (via OCSP or CRL)

The PKI (Public Key Infrastructure) model has known weaknesses:

Any of the ~150 trusted root CAs can issue a certificate for any domain
DigiNotar (Netherlands) was compromised in 2011 and fraudulent certificates were issued for google.com, which were used to surveil Iranian internet users. DigiNotar was subsequently distrusted and ceased operations.
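Let's Encrypt made TLS certificates free in 2015, eliminating the economic barrier to HTTPS. The flip side is that a valid certificate for a look-alike phishing domain is trivially obtainable: a certificate proves control of a domain, not the legitimacy of its owner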

Certificate Transparency (CT) mitigates rogue certificate issuance: all publicly trusted CAs are required to submit every issued certificate to public CT logs. Any certificate issued for your domain that you did not request is detectable:

# Monitor for unauthorized certificates on your domain
# crt.sh is a public CT log search interface
curl -s "https://crt.sh/?q=%25.example.com&output=json" | \
  jq -r '.[].name_value' | sort -u
 
# This lists the names on every logged certificate for any subdomain of example.com
# (%25 is the URL-encoded % wildcard). Review for any unexpected entries
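
Reviewing the output by eye does not scale, so the comparison is worth automating. This sketch assumes the JSON has already been fetched and parsed into a list of dictionaries carrying the `name_value` field used above. The sample entries, the allowlist, and the function name are hypothetical.

```python
def unexpected_names(ct_entries, expected):
    """Return certificate names found in CT entries that are not on the allowlist."""
    seen = set()
    for entry in ct_entries:
        # A single name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            seen.add(name.strip().lower())
    seen.discard("")
    return sorted(seen - {name.lower() for name in expected})

# Hypothetical sample data shaped like parsed crt.sh output.
sample_entries = [
    {"name_value": "www.example.com\nexample.com"},
    {"name_value": "mail.example.com"},
    {"name_value": "vpn-test.example.com"},
]
expected_hosts = {"example.com", "www.example.com", "mail.example.com"}

print(unexpected_names(sample_entries, expected_hosts))
```

Run on a schedule, a check like this turns Certificate Transparency from a passive record into an alert for certificates nobody on your team requested.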

End-to-End Encryption: Where the Server Can't Read Your Messages

HTTPS encrypts your traffic between your browser and the web server. The server decrypts everything it receives — it must, in order to process your request and serve a response. This means the server operator (and anyone with legal access to the server) can read what you send.

End-to-end encryption (E2EE) means only the two communicating parties can read the messages. The server relays encrypted messages but holds no keys. Even if the server is subpoenaed, hacked, or the operator is compelled to cooperate, message content is inaccessible.

The Signal Protocol

Signal Protocol (developed by Moxie Marlinspike and Trevor Perrin at Open Whisper Systems, open-sourced 2013) is the current standard for E2EE messaging. It is used by Signal, WhatsApp (end-to-end encryption layer), Facebook Messenger (secret conversations), and Google Messages (RCS encryption).

The protocol uses two mechanisms together:

X3DH (Extended Triple Diffie-Hellman): Establishes an initial shared secret between two users who may not be online simultaneously. Each user publishes a set of public keys to the server:

Identity key (long-term)
Signed prekey (medium-term, rotated)
One-time prekeys (used once, then discarded)

When Alice wants to message Bob for the first time, she fetches Bob's public keys from the server and computes a shared secret without Bob being online. Bob can compute the same secret when he comes online, using Alice's identity key and the ephemeral key she includes in her first message.

Double Ratchet Algorithm: After the initial X3DH setup, every message advances the key state:

[Session Key 0] → Message 1 key derivation → [Session Key 1] → Message 2 key derivation → ...

Each message uses a key derived from the previous state.
If a current key is compromised, past messages are safe (forward secrecy).
If a current key is compromised, future messages become safe after the ratchet
recovers through new Diffie-Hellman exchanges (break-in recovery).
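
The per-message key derivation can be sketched as a one-way chain. The code below covers only the symmetric half of the Double Ratchet, in the spirit of the published specification. The Diffie-Hellman ratchet that provides break-in recovery is omitted, and the starting secret and the constants 0x01 and 0x02 are assumptions made for illustration.

```python
import hashlib
import hmac

def ratchet_step(chain_key: bytes):
    """Derive one message key and the next chain key from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key

# Placeholder for the secret that X3DH would have established.
initial_chain_key = hashlib.sha256(b"shared secret from X3DH (placeholder)").digest()

chain_key = initial_chain_key
message_keys = []
for _ in range(3):
    chain_key, message_key = ratchet_step(chain_key)
    message_keys.append(message_key)   # used for one message, then deleted

for i, mk in enumerate(message_keys, start=1):
    print(f"message {i}: {mk.hex()[:16]}...")
```

Because HMAC cannot be run backwards, someone who steals the current chain key cannot recompute earlier chain keys or message keys, provided the old ones were deleted. That is the forward secrecy property described above.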

The practical implication: a server breach that exposes Signal's databases exposes metadata (who talked to whom, when) and ciphertext. The ciphertext cannot be decrypted because the keys were never on the server. A device seizure exposes messages on that device only, not messages on other devices in the conversation.

Alice's Device  ──[E2EE Message]──  Signal Server  ──[E2EE Message]──  Bob's Device
                                         ↑
                                  Sees only encrypted bytes
                                  Cannot decrypt — has no keys
                                  Even under legal compulsion: no plaintext

What E2EE does not protect:

Endpoint compromise. If Alice's device is hacked, the attacker reads messages there — already decrypted for Alice's use. E2EE protects the channel; it does not protect the endpoints.
Metadata. Signal knows Alice and Bob exchanged messages at a specific time. (Signal's Sealed Sender feature reduces even this.) WhatsApp knows contact graphs. Metadata can be as revealing as content.
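Backups. iCloud backups historically included iMessage content encrypted with keys that Apple held, so the messages were no longer end-to-end protected once backed up. Advanced Data Protection (iOS 16.2 and later) encrypts iCloud backups with keys you control, but requires explicit opt-in.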

Password Hashing: Why LinkedIn's Breach Was Catastrophic and Dropbox's Was Not

Password storage is the most common cryptography implementation that developers get wrong, with direct consequences for users.

The problem: Storing passwords in plaintext is an obvious disaster, since a database breach exposes all passwords. But encryption is also wrong: you would have to store the decryption key somewhere the application can reach, and an attacker who compromises the database can often reach the key too.

The correct approach: Use a one-way function that is deliberately slow to compute. Store the output (hash). On login, hash the provided password and compare.

Why regular hash functions (MD5, SHA-1, SHA-256) are wrong for passwords:

# SHA-256 runs at approximately 8 billion hashes per second on a modern GPU
# A GPU can check 8 billion password candidates per second against a SHA-256 hash
# Every common password is cracked in under a second
# 8-character random alphanumeric (62^8 = 218 trillion options): ~7 hours
 
# LinkedIn used unsalted SHA-1
# SHA-1 is even faster than SHA-256 — multiple billions per second on GPU
# Plus without salt, identical passwords produce identical hashes
# One rainbow table lookup cracks all instances of the same password simultaneously
 
# This is why 90% of the LinkedIn passwords were cracked within 72 hours
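
Both claims above, the effect of a missing salt and the brute-force arithmetic, can be checked with a few lines of standard-library Python. The password is a made-up example, and the rate of 8 billion guesses per second is the figure assumed in the text rather than a measurement. Salted SHA-256 appears here only to show what a salt changes; it is still far too fast for password storage.

```python
import hashlib
import os

def unsalted(password: bytes) -> str:
    return hashlib.sha256(password).hexdigest()

def salted(password: bytes, salt: bytes) -> str:
    return hashlib.sha256(salt + password).hexdigest()

# Two users pick the same password.
alice_unsalted = unsalted(b"hunter2")
bob_unsalted = unsalted(b"hunter2")        # identical: crack one, crack both

alice_salted = salted(b"hunter2", os.urandom(16))
bob_salted = salted(b"hunter2", os.urandom(16))   # different: each needs its own attack

print(alice_unsalted == bob_unsalted, alice_salted == bob_salted)

# Exhaustive search over 8-character alphanumeric passwords at the assumed GPU rate.
candidates = 62 ** 8
guesses_per_second = 8e9
hours = candidates / guesses_per_second / 3600
print(f"{candidates:,} candidates, about {hours:.1f} hours to try them all")
```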

Password-specific hashing algorithms are designed to be deliberately slow:

import bcrypt
import hashlib
import time
 
# Compare: SHA-256 (wrong for passwords)
start = time.time()
for _ in range(10000):
    hashlib.sha256(b"password123").digest()
sha_time = (time.time() - start) / 10000
print(f"SHA-256 per hash: {sha_time*1000:.4f}ms")  # ~0.001ms
 
# bcrypt (right for passwords — much slower by design)
start = time.time()
for _ in range(10):
    bcrypt.hashpw(b"password123", bcrypt.gensalt(rounds=12))
bcrypt_time = (time.time() - start) / 10
print(f"bcrypt per hash: {bcrypt_time*1000:.1f}ms")  # ~250ms
 
# At 250ms per hash:
# This machine can try about 4 guesses/second per core
# Compare to SHA-256: 8 billion/second on GPU
# Factor: ~2,000,000,000x. The comparison mixes CPU and GPU figures, and an
# attacker's GPU will beat 4/second, but bcrypt stays many orders of magnitude slower
 
# Dropbox used bcrypt — their breach was dramatically less damaging
# despite 68 million hashes being exposed

Argon2: The Current Best Practice

Argon2 won the Password Hashing Competition in 2015 and is the current recommended algorithm. It has three variants:

Argon2d: Resistant to GPU attacks; vulnerable to side-channel attacks. Use for server-side hashing where side-channels are not a concern.
Argon2i: Resistant to side-channel attacks. Use where side-channel attacks are a concern.
Argon2id: Hybrid — recommended for password hashing in most applications.

Argon2id allows tuning three parameters:

Memory cost (m): Amount of RAM required. More RAM = harder for GPU/ASIC attacks.
Time cost (t): Number of iterations.
Parallelism (p): Number of threads.

from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
 
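# Create hasher with parameters matching RFC 9106's low-memory recommendation
# (also the argon2-cffi defaults). Each hash requires 64 MB of RAM.
# Benchmark on your own hardware and raise time_cost as far as your login latency allows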
ph = PasswordHasher(
    time_cost=3,        # 3 iterations
    memory_cost=65536,  # 64 MB of RAM
    parallelism=4,      # 4 threads
    hash_len=32,        # 256-bit hash output
    salt_len=16         # 128-bit salt (generated automatically per password)
)
 
# Hash a password
password = "user-provided-password"
stored_hash = ph.hash(password)  # avoid naming this "hash", which shadows the builtin
# stored_hash looks like: $argon2id$v=19$m=65536,t=3,p=4$[salt]$[hash]
# Salt is stored in the hash string, so there is no need to store it separately
 
# Verify a password against a stored hash
try:
    ph.verify(stored_hash, password)
    print("Password correct")
    # Check if the hash parameters are outdated (rehash if needed)
    if ph.check_needs_rehash(stored_hash):
        new_hash = ph.hash(password)
        # Store new_hash in database
except VerifyMismatchError:
    print("Password incorrect")
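
The encoded string is self-describing, which is what makes later parameter upgrades possible. The sketch below reads the parameter segment using only the standard library. `parse_argon2_params` is a hypothetical helper written for illustration, the salt and digest in the example string are placeholders rather than real output, and the figure of 50 concurrent logins is an assumption.

```python
def parse_argon2_params(encoded: str) -> dict:
    """Extract variant, version and cost parameters from an encoded Argon2 hash.

    Expects the layout shown above:
    $argon2id$v=19$m=65536,t=3,p=4$<salt>$<hash>
    """
    parts = encoded.split("$")
    # parts[0] is empty because the string starts with "$"
    if len(parts) != 6 or parts[0] != "":
        raise ValueError("not an Argon2 encoded hash")
    variant, version_field, cost_field = parts[1], parts[2], parts[3]
    costs = dict(item.split("=") for item in cost_field.split(","))
    return {
        "variant": variant,
        "version": int(version_field.split("=")[1]),
        "memory_kib": int(costs["m"]),
        "time_cost": int(costs["t"]),
        "parallelism": int(costs["p"]),
    }

# Placeholder salt and digest; only the parameter segment matters here.
example = "$argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHQ$ZGlnZXN0"
params = parse_argon2_params(example)
print(params)

# Memory budget: every in-flight hash needs memory_kib of RAM.
concurrent_logins = 50  # assumed peak, for illustration only
peak_mib = params["memory_kib"] * concurrent_logins / 1024
print(f"Peak Argon2 memory at {concurrent_logins} concurrent logins: {peak_mib:.0f} MiB")
```

The memory calculation is worth doing before raising `memory_cost`: the same setting that makes GPU attacks expensive also bounds how many logins a server can verify at once.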

Password storage comparison (speeds are rough, illustrative orders of magnitude; real figures vary widely with hardware and cost parameters):

| Algorithm | Status | Approx. Cracking Speed | Use For |
|---|---|---|---|
| Plaintext | Never | Instant | Nothing |
| MD5 | Broken | ~50 billion/sec | Nothing |
| SHA-1 | Broken | ~8 billion/sec | Nothing |
| SHA-256 unsalted | Wrong | ~8 billion/sec | Nothing |
| SHA-256 salted | Wrong | ~8 billion/sec | Nothing |
| bcrypt (rounds=12) | Acceptable | ~4/sec | Legacy systems |
| scrypt | Good | ~0.5/sec | Acceptable |
| Argon2id | Best practice | under 0.1/sec | All new systems |

Hashing vs. Encryption: The Critical Distinction

LinkedIn was right to hash passwords rather than encrypt them; its mistake was choosing a fast, unsalted hash (SHA-1). The Dropbox breach was less damaging because someone chose bcrypt over SHA-1, which was also correct. Many other breaches have been made worse because developers confused hashing with encryption.

HASHING:
  Input → [Hash Function] → Fixed-length digest
  Digest → ??? → Input (impossible — one-way)

ENCRYPTION:
  Input + Key → [Cipher] → Ciphertext
  Ciphertext + Key → [Cipher] → Input (reversible — requires key)

Use hashing when you need one-way verification:

Password storage (never need to reverse — just compare hashes)
File integrity verification (download the file, hash it, compare to published hash)
Checksums for data corruption detection
Signing: hash the message, sign the hash (faster than signing the full message)

Use encryption when you need to recover the original value:

Files and messages you need to read later
Database fields storing sensitive data that must be read back in its original form
Encrypted sessions and secure channels
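
As a minimal sketch of the one-way verification case, the file-integrity check can be written with the standard library alone. The byte string below stands in for a downloaded file, and `matches_published_digest` is a hypothetical helper name.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()

def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Recompute the digest and compare it to the published value."""
    return hmac.compare_digest(sha256_hex(data), published_hex.lower())

download = b"example release archive contents"  # stand-in for real file bytes
published = sha256_hex(download)                # what the publisher would post

print(matches_published_digest(download, published))         # intact copy
print(matches_published_digest(download + b"!", published))  # altered copy
```

Nothing is ever reversed here: the check only recomputes the digest and compares. It detects corruption, but it proves authenticity only if the published digest itself arrives over a trusted channel.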

The dangerous confusion pattern:

Some developers store credit card numbers "hashed for security." Hashing a credit card number means:

You cannot recover the actual card number (useless for charging customers)
A determined attacker can crack the hash by iterating through all 16-digit card numbers (a finite, structured space)
MD5(4111111111111111) is crackable in milliseconds because credit card numbers follow known patterns

The correct solution: encrypt card numbers with AES-256-GCM using a key stored in a dedicated secrets manager (AWS KMS, Vault, etc.). The ciphertext is stored in the database. The decryption key is stored elsewhere. A database breach without the key exposes only useless ciphertext.
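
To see why hashing a card number offers so little protection, consider this sketch of the search. It assumes the attacker knows the six-digit issuer prefix and the last four digits (values often stored or displayed alongside the hash); that assumption, the helper names, and the well-known test number are all illustrative.

```python
import hashlib
from typing import Optional

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def recover_card(target_md5: str, bin_prefix: str, last4: str) -> Optional[str]:
    """Brute-force the unknown middle digits of a 16-digit card number."""
    unknown = 16 - len(bin_prefix) - len(last4)
    for middle in range(10 ** unknown):
        candidate = f"{bin_prefix}{middle:0{unknown}d}{last4}"
        if not luhn_valid(candidate):
            continue  # the checksum discards about 90% of candidates
        if hashlib.md5(candidate.encode()).hexdigest() == target_md5:
            return candidate
    return None

# The "protected" value as it might sit in a database
stored_hash = hashlib.md5(b"4111111111111111").hexdigest()

# Hypothetical attacker knowledge: issuer prefix and last four digits
recovered = recover_card(stored_hash, "411111", "1111")
print(recovered)
```

Even in pure Python the search space here is only a million candidates. A salt does not help much either, since the space stays small for every individual record.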

Common Cryptographic Failures in Real Systems

The Debian OpenSSL Weak Key Generation (2008)

In September 2006, a Debian developer modified the OpenSSL package to silence a memory error checker warning. The patch removed two lines that were critical for seeding the random number generator with entropy. The result: OpenSSL on Debian (and Ubuntu, which derives from Debian) generated cryptographic keys with only 32,767 possible values, regardless of the intended key size.

Every SSH key, TLS certificate, and OpenVPN key generated on affected Debian systems between September 2006 and May 2008 was one of only 32,767 keys. An attacker could enumerate all possible keys in a few hours. If you were using an affected key, your "2048-bit RSA key" provided the security of approximately 15 bits.

The flaw is tracked as CVE-2008-0166. Affected keys can be detected by generating all 32,767 possible keys of a given type and size and comparing against them.

The lesson: cryptographic security depends on randomness, and randomness must come from a cryptographically secure source. Never silence or modify entropy-critical code without expert review.
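
The effect of a tiny seed space can be shown with a toy model. This is not the OpenSSL code path: `weak_keygen` is a hypothetical generator whose only varying input is a process ID, and it deliberately uses Python's non-cryptographic `random` module to stand in for the broken seeding.

```python
import hashlib
import random

MAX_PID = 32767  # the only varying input in this toy model

def weak_keygen(pid: int) -> bytes:
    """Toy key generator: 256 bits of output, but only the PID as entropy."""
    rng = random.Random(pid)  # deliberately NOT a secure generator
    return rng.getrandbits(256).to_bytes(32, "big")

def fingerprint(key: bytes) -> str:
    """Short public identifier for a key, similar in spirit to an SSH fingerprint."""
    return hashlib.sha256(key).hexdigest()[:16]

# The victim generates what looks like a 256-bit key.
victim_key = weak_keygen(pid=12345)
victim_fingerprint = fingerprint(victim_key)

# The attacker precomputes every key the generator can ever produce.
table = {}
for pid in range(1, MAX_PID + 1):
    key = weak_keygen(pid)
    table[fingerprint(key)] = key

recovered = table[victim_fingerprint]
print(f"keys enumerated: {len(table)}, victim key recovered: {recovered == victim_key}")
```

The output length is irrelevant: a key is only as strong as the number of distinct values its generator can emit.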

RC4 in TLS/WEP

RC4 is a stream cipher that was once widely used in WEP WiFi encryption and as a TLS cipher. It has multiple statistical biases — certain byte values appear more often than they should in the keystream. These biases enable attacks:

WEP (uses RC4): crackable with 60,000-100,000 captured packets using aircrack-ng. The Fluhrer, Mantin, Shamir (FMS) attack from 2001 turned WEP from "maybe secure" to "completely insecure."
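RC4 in TLS: research published from 2013 onward, including the RC4 NOMORE attack (2015), showed that these biases let an attacker recover plaintext such as session cookies given enough captured traffic. Ironically, RC4 had been widely recommended as a workaround for the BEAST attack (2011), which targets CBC-mode ciphers rather than RC4.

RFC 7465 (February 2015) prohibited RC4 cipher suites in TLS. Any system still using RC4 is running cryptography that has been known broken for over a decade.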
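
One of these biases is easy to observe directly: the second keystream byte is zero about twice as often as it should be. The sketch below implements RC4 purely to measure that bias and must never be used to protect data. The trial count and fixed seed are arbitrary choices, and `random` is used here only to make the experiment repeatable.

```python
import random

def rc4_keystream(key: bytes, length: int) -> bytes:
    """Generate RC4 keystream bytes (for demonstration only)."""
    # Key-scheduling algorithm
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Pseudo-random generation algorithm
    output = bytearray()
    i = j = 0
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        output.append(state[(state[i] + state[j]) % 256])
    return bytes(output)

def second_byte_zero_count(trials: int, seed: int = 0) -> int:
    """Count how often the second keystream byte is 0 across random 128-bit keys."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    count = 0
    for _ in range(trials):
        key = rng.getrandbits(128).to_bytes(16, "big")
        if rc4_keystream(key, 2)[1] == 0:
            count += 1
    return count

TRIALS = 30_000
observed = second_byte_zero_count(TRIALS)
uniform_expectation = TRIALS / 256
print(f"second byte == 0: {observed} times "
      f"(an unbiased cipher would give about {uniform_expectation:.0f})")
```

A bias this visible in a few seconds of Python is the raw material for the plaintext-recovery attacks on RC4: when the same secret is encrypted under many keys, the skew accumulates into a recoverable signal.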

Timing Attacks: Side-Channel Vulnerabilities

Even when the algorithm is correctly implemented, the time it takes to execute can leak information.

# VULNERABLE: String comparison with early exit
def check_api_key(provided_key, stored_key):
    # This comparison exits at the first mismatch
    # An attacker timing the response can measure character-by-character
    # how far through the key their guess matched
    return provided_key == stored_key  # NOT constant time
 
# SAFE: Constant-time comparison
import hmac
def check_api_key_safe(provided_key, stored_key):
    # hmac.compare_digest always compares all bytes, regardless of where mismatch occurs
    # Same time whether the first character matches or none of them do
    return hmac.compare_digest(
        provided_key.encode('utf-8'),
        stored_key.encode('utf-8')
    )

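Timing side channels have been exploited in practice. Lucky Thirteen (2013) recovered plaintext from TLS by measuring small timing differences in how CBC-mode records were MAC-checked after padding removal, and many MAC verification bugs have leaked secrets through early-exit comparisons. Any comparison involving a secret value must be constant-time.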

Practical Cryptography Decision Guide

When building a system that needs cryptographic protection:

Storing passwords:

# Use Argon2id for all new systems
from argon2 import PasswordHasher
ph = PasswordHasher(time_cost=3, memory_cost=65536, parallelism=4)
user_password = "user-provided-password"  # placeholder for the submitted password
password_hash = ph.hash(user_password)

Encrypting data at rest (files, database fields, backups):

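# Use authenticated encryption (AEAD) with a unique nonce per message
# Option 1: Fernet from the cryptography library
# (AES-128-CBC + HMAC-SHA256 under the hood: acceptable)
from cryptography.fernet import Fernet
fernet = Fernet(Fernet.generate_key())
token = fernet.encrypt(b"sensitive data")
assert fernet.decrypt(token) == b"sensitive data"
 
# Option 2: libsodium via PyNaCl (XSalsa20-Poly1305; a random nonce is generated for you)
import nacl.secret
import nacl.utils
key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)  # 256-bit
box = nacl.secret.SecretBox(key)
encrypted = box.encrypt(b"sensitive data")
decrypted = box.decrypt(encrypted)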
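
Neither Fernet nor SecretBox uses AES-256-GCM itself, so here is a minimal sketch of that mode using the `AESGCM` class from the `cryptography` library. The `encrypt_field` and `decrypt_field` helpers, the nonce-prefix storage layout, and the `customer:42` context string are illustrative assumptions, not a prescribed format.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# In production the key would live in a secrets manager, not beside the data.
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

def encrypt_field(plaintext: bytes, context: bytes) -> bytes:
    """Encrypt one value; the 12-byte nonce is stored in front of the ciphertext."""
    nonce = os.urandom(12)  # 96-bit nonce, must never repeat under the same key
    return nonce + aesgcm.encrypt(nonce, plaintext, context)

def decrypt_field(blob: bytes, context: bytes) -> bytes:
    """Split off the nonce, then decrypt and authenticate the rest."""
    nonce, ciphertext = blob[:12], blob[12:]
    return aesgcm.decrypt(nonce, ciphertext, context)

# The context is authenticated but not encrypted; it binds the value to its row.
blob = encrypt_field(b"4111111111111111", b"customer:42")
print(decrypt_field(blob, b"customer:42"))

# Flipping a single bit makes decryption fail instead of returning garbage.
tampered = blob[:-1] + bytes([blob[-1] ^ 0x01])
try:
    decrypt_field(tampered, b"customer:42")
    tamper_detected = False
except InvalidTag:
    tamper_detected = True
print("tampering detected:", tamper_detected)
```

Random 96-bit nonces are safe only for a bounded number of messages per key, so long-lived keys should be rotated well before billions of encryptions.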

Generating cryptographic keys and tokens:

import secrets
# For API keys, session tokens, password reset tokens:
token = secrets.token_urlsafe(32)  # 256-bit entropy URL-safe string
api_key = secrets.token_hex(32)    # 256-bit hex string
 
# Never use random.random() for security — it is not cryptographically secure
import random
bad_token = str(random.random())  # WRONG — predictable

Verifying integrity (file checksums, HMAC for API authentication):

import hmac
import hashlib
 
# HMAC for API request authentication
secret_key = b"your-secret-key"
message = b"API request body"
mac = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
# Include mac in request; server recomputes and compares
# This proves the request came from someone with the secret key and was not modified
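
The server side of that exchange might look like the following sketch. The `sign` and `verify` helper names are hypothetical, and the key is a placeholder; the important detail is that the comparison uses `hmac.compare_digest` rather than `==`.

```python
import hashlib
import hmac

def sign(secret_key: bytes, message: bytes) -> str:
    """Compute the HMAC-SHA256 tag for a message as a hex string."""
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()

def verify(secret_key: bytes, message: bytes, received_mac: str) -> bool:
    """Recompute the tag and compare it to the received one in constant time."""
    expected = sign(secret_key, message)
    # Encode both sides so unexpected non-ASCII input cannot raise TypeError
    return hmac.compare_digest(expected.encode("utf-8"),
                               received_mac.encode("utf-8"))

secret_key = b"your-secret-key"  # placeholder; load from a secrets manager
body = b"API request body"
mac = sign(secret_key, body)

print(verify(secret_key, body, mac))                            # genuine request
print(verify(secret_key, b"API request body (modified)", mac))  # altered body
print(verify(b"wrong-key", body, mac))                          # wrong key
```

An HMAC alone does not stop an attacker from replaying a captured request unchanged; including a timestamp or nonce in the signed message is the usual countermeasure.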

Secure communication channels:

Use TLS 1.3. Let the TLS library handle everything.
Do not implement cryptographic protocols yourself.
Do not write your own AES implementation.
Use established libraries: OpenSSL, libsodium, the cryptography Python library, Java's JCA/JCE.
The hard part is not the algorithm — it is the protocol design, the key management,
and the many ways to incorrectly use a correct algorithm.
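
In Python, "let the TLS library handle everything" can be as small as the sketch below, which uses the standard `ssl` module. `make_tls13_client_context` is a hypothetical helper name, and it assumes the interpreter is linked against an OpenSSL build that supports TLS 1.3.

```python
import ssl

def make_tls13_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and refuses TLS < 1.3."""
    # create_default_context() enables certificate and hostname verification
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context

context = make_tls13_client_context()
print(context.minimum_version.name, context.verify_mode.name, context.check_hostname)
```

The context is then passed to whatever opens the connection, for example `context.wrap_socket(sock, server_hostname=host)`. Cipher selection, key exchange, and certificate checks stay inside the library.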

Warning

Never implement cryptographic primitives from scratch for production use. The algorithms are public; the security lies in the implementation details: constant-time comparisons, proper padding, correct nonce handling, secure random number generation, and dozens of other subtleties that have filled academic papers for decades. Use audited libraries. When in doubt, use libsodium, which makes the safe choice the default choice.

A Quick Reference for Common Scenarios

| Scenario | Right Tool | Wrong Tool |
|---|---|---|
| Storing passwords | Argon2id, bcrypt (rounds≥12) | MD5, SHA-256, plaintext |
| File encryption | AES-256-GCM | ECB mode, RC4, DES |
| Message authentication | HMAC-SHA256 | Non-keyed hash, CRC |
| Secure channel | TLS 1.3 | TLS 1.0/1.1, SSLv3 |
| Key exchange | X25519 (ECDH), ECDHE-P256 | RSA key exchange |
| Digital signatures | ECDSA/P256, Ed25519 | MD5withRSA, SHA1withRSA |
| Password reset tokens | secrets.token_urlsafe(32) | random.random(), timestamp |
| Data integrity check | SHA-256, SHA-3 | MD5, SHA-1 |
| Random encryption key | os.urandom(32) | Date-seeded random |

The cryptographic algorithms protecting your bank transfers, medical records, and private messages are public knowledge — AES, RSA, and ECDH are in textbooks and NIST publications. What keeps the data secure is the key you hold, the implementation that correctly uses these algorithms, and the key management practices that ensure the key stays secret. Get those three things right, and the mathematics guarantees the rest.