homeprojectswriting
The padding oracle attack, step by step
March 22, 2025

A major con of CBC is the so-called padding oracle attack.

Padding is often performed with PKCS#7 padding. If the end of your message requires bb bytes to fill the last block, PKCS#7 will append bb bytes of value bb to make everything whole.

This means that the receiver of a ciphertext coming from a CBC mode can throw an "invalid padding" error if it looks at the ciphertext's last byte bb and sees some byte in the last bb that doesn't also equal bb.

The padding oracle attack exploits malleability: the ability of an attacker to modify the ciphertext to a cryptosystem in a way that meaningfully changes the decrypted plaintext. CBC is malleable because changing ci1c_{i - 1} changes mi=Fk1(ci)ci1m_i = F_k^{-1}(c_i)\oplus c_{i - 1}.

As a result, a receiver that throws an invalid padding error (serves as a "padding oracle") in a CBC is prone to the padding oracle attack as follows:

Step 1: Recover the length of the padding.

Let c=c1c2clc = c_1\Vert c_2\Vert\cdots\Vert c_l. Change the first byte of cl1c_{l - 1}, which changes the first byte of mlm_l, and send the modified cc to the oracle O\mathcal{O}.

If an error occurs, we know we changed a byte (the first in the last block) that was part of the padding. We conclude that mlm_l consists entirely of padding. If no error occurs, the padding stayed intact despite our change, so change the next byte until we get an error. Once that happens, we know we have modified (hence, found) the first byte of padding.

Denote the distance of this byte to the end of the message by bb.

Part 2: Recover the last block.

We know the last bb bytes have value bb; we'll try to guess the last character before the padding. Pick a "trial byte" tt', a value used to determine the plaintext byte tt through oracle feedback. Let's query O\mathcal{O} on cc, replacing cl1c_{l - 1} with cl1c_{l - 1}', where

cl1=cl1(000tb~b~b~)b~=b(b+1). c_{l - 1}' = c_{l - 1}\oplus (00\ldots 0t'\tilde{b}\tilde{b}\ldots\tilde{b})\qquad \tilde{b} = b\oplus (b + 1).

So we replace the last non-padding byte with our guess tt' and each padding byte with b(b+1)b\oplus (b + 1).

Why? Decrypting the modified cc gives

mlml(000tb~b~b~)=Fk1(ci)cl1. m_l'\coloneqq m_l\oplus (00\ldots 0t'\tilde{b}\tilde{b}\ldots\tilde{b}) = F_k^{-1}(c_i)\oplus c_{l - 1}'.

Recall that the last bb bytes of mlm_l are bb. They evaluate to bb(b+1)=b+1b\oplus b\oplus (b + 1) = b + 1 by definition of b~\tilde{b}. Every other byte of mlm_l, except the last non-padding byte (call it tt), remains the same. That one becomes ttt\oplus t'.

Now recall the PKCS#7 invariant: the last byte's value is the number of suffix bytes with that value. So if mlm_l' has valid padding, its last b+1b + 1 bytes must be b+1b + 1, including ttt\oplus t'. Therefore, we can validate our guess tt' based on whether O\mathcal{O} gives a padding error. If it doesn't, then

tt=b+1t=(b+1)t.t\oplus t' = b + 1\Rightarrow t = (b + 1)\oplus t'.

So we recover tt. Of course, this strategy is not specific to b+1b + 1. We continue with b+2,b+3,b + 2, b + 3, \ldots, which gives us the bytes before tt and, therefore, the rest of block ll.

Part 3: Recover the rest of the blocks.

Recall that bb is at most the length of one block. To recover block l1l - 1, then, drop the last ciphertext block and change the last byte of cl1c_{l - 1} until there is no padding error.

Now that our ciphertext properly decrypts, repeat Part 1 to find the new padding of ml1m_{l - 1}, and repeat Part 2 to recover its bytes.

Repeat this until every block of the ciphertext is recovered.


Read more in my 475 notes.