CTF/ECW/Cryptography Breaking an OTP (BMPaaS)

For this second write-up on the CTF ECW 2023 edition, I address the resolution of a challenge in the crypto category “BMPaaS”, in which we will have to exploit an implementation of an OTP type algorithm (One-Time Pad ).

Code analysis

import base64
import os


FLAG = base64.b85encode(open("flag.bmp", "rb").read()).decode()

CHARSET = base64._b85alphabet.decode()
N = len(CHARSET)


def generate_key(length):
    random_stream = os.urandom(length)
    return "".join(CHARSET[x % N] for x in random_stream)


def encrypt(plaintext, key):
    ciphertext = ""
    for i in range(len(plaintext)):
        ciphertext += CHARSET[(CHARSET.find(plaintext[i]) + CHARSET.find(key[i])) % N]
    return ciphertext


print("[+] Welcome to BMPaaS!")
print("[+] We offer a military-grade BMP encryption algorithm, powered by one of the safest OTP mechanisms in the world.")
print("[+] For now, uploading new BMP files is disabled. However, you can challenge our security by requesting an encrypted flag ;)\n")

print("[1] Request a new encrypted flag")
print("[2] Exit\n")

while True:

    choice = input("[?] Enter your choice: ")

    if choice == "1":
        key = generate_key(len(FLAG))
        print(encrypt(FLAG, key))

    elif choice == "2":
        exit()

    else:
        print("[x] Invalid choice.")

This is an OTP (One Time Pad ) type cryptographic algorithm, meaning that we have an encryption key (also called a mask) of the size of the data to be encrypted. The data here is a BMP image. The algorithm has the equation

c_{i} = p_{i} + k_{i} [mod N]
  • p is the plaintext encrypted in base85 (therefore composed of characters belonging to the charset of this base).
  • k is the encryption key (mask). It is composed of random bytes generated by urandom which have been modulated by N=85 ( generate_key function ).

We note several things. The urandom function is a secure function whose randomness cannot be brought into play. The algorithm (addition followed by the modulo) is also rather secure because the modulo is a one-way operation.

The flaw in theory

The flaw: urandom generates one byte, therefore 256 possibilities. Only 85 is not a divisor of 256 (85*3=255), which means that there are 4 numbers between 0 and 255 which modulo 85 gives 0. And there is the vulnerability…

Let’s talk a little about probability. Let X be the random variable which gives the number output from modulo 85. Let Y be the random variable which gives the number generated by urandom. As we admit that the urandom function is safe, we also admit that Y follows a uniform law with parameter n=256.

P(X=0) = P(Y=0) + P(Y=85) + P(Y=165) + P(Y=255) 
P(X=0)= \frac{1}{256} + \frac{1}{256} + \frac{1}{256} + \frac{1}{256} = \frac{4}{256}

So P(X=0) = 4/256. However, for all other numbers other than 0, the probability that X is equal to this number is 3/256. It is therefore more likely to obtain a 0 in the key than another number < 85. On a certain number of samples, we can then assume that a majority of them will have a key composed of the character found at the index 0, i.e. the character “0”.

If I take 50,000 samples ( yes that’s a lot! ), I will then have approximately (4/256)*50000 = 784 keys having the character “0”, compared to approximately 588 for all other characters in the base85 charset. We have therefore bypassed the randomness of the generate_key function.

If we take the cryptographic equation:

c_{i} = p_{i} + k_{i} [mod N]

We will then most often have k = 0, that is to say:

c_{i} = p_{i} [mod N]

Now as p < N because p comes from base85 encoding and therefore contains characters belonging to the base85 charset, the equation becomes:

c_{i} = p_{i}

​which suits us well.

The flaw in practice

We first recover a large number (50,000 to be sure) of samples ( init function ). For each encrypted character in the flag we recover its most frequent value in our samples ( get_most_char function ). We then just need to decode the data obtained in base85.

from pwn import *
import colorama


def sorted_dict(dictionnaire):
    dictionnaire_trie = dict(sorted(dictionnaire.items(), key=lambda item: item[1]))
    return dictionnaire_trie

def get_max_key(dictionnaire):
    if not dictionnaire:
        return None 
    max_valeur = max(dictionnaire.values()) 
    for cle, valeur in dictionnaire.items():
        if valeur == max_valeur:
            return cle

HOST = "instances.challenge-ecw.fr"
PORT = 39345

p = remote(HOST,PORT)

def init():
    ALL_DATA = []
    for i in range(50000):
        p.sendlineafter(b"[?] Enter your choice:",b"1")
        data = p.recvuntil("\n")[1:-1]
        ALL_DATA.append(data)
        print(f"\r-> {i}/15000",end='')


    # f = open("bmp3.dump","w")
    # f.write(str(ALL_DATA))
    # f.close()

init()

# f = open("bmp3.dump","r")
# ALL_DATA = eval(f.read())
# f.close()

def print_num():
    letter_1 = {}
    for i in range(len(ALL_DATA)):
        n = list(ALL_DATA[i])[0]
        if n in letter_1:
            letter_1[n] += 1
        else:
            letter_1[n] = 1

    print(sorted_dict(letter_1))

def get_most_char():
    x = ""
    for chara in range(len(ALL_DATA[0])):
        letter = {}
        print(f"\r-> getting letter {chara}/{len(ALL_DATA[0])}",end='')
        for i in range(len(ALL_DATA)):
            n = list(ALL_DATA[i])[chara]
            if n in letter:
                letter[n] += 1
            else:
                letter[n] = 1
        x += chr(get_max_key(letter))
    return x

x = get_most_char()
# x = "LQUQV00000004>r004Xd003CM000F500p3178tkO002}500HvU000vU000000W00000000{{R60009300000050BTqQgZ+R0000000000000000000000000000000000000000000v000000000000006200000000000K000000}000m007000000960|NsC000000000000)R90|NsC0iNj900RR9000030|NsC0|NsC0|Ns90090000(R4000000000000RR90|}sC0|Nj6000000|NsC0~Nj6000x0000030|NsC0{{R3000000000000RR9000030|NsC0{{330|NsC0|Nj600RR90|Ns90000000RR90|NsC0|Nj60D000000030|NsC0{{R300000000960{{R30|NWC0|Nj6000000|NsC0|Nj60000000000000960{{R300a00000960|NsC000030|NsC0{{R300000000960{{n30i000000960|NsC000f000000000000|NsC0|Nj6<0RR9000030|Nj6000000|Ns90000000RR90|NsC0|NsC0{{R30|NsC0|NsC0|NsC0|Ns9000960{{R30|Ns9000960|NsC0|N39000960|NsC000030|NsC0{QR30|Nd9000960|NsC000030|NsC0|NsC0|Ns9000960|NsC0|NsC0|NbC0|NsC000030|NsC0{{R30|NsC0|Nj600RR9200030|NsC0{{R30|Ns9000960iNsC000030|Nj600RR90|Ns9000960{{R3Q|Ns90009?0|NsC000030|NsC0|NBC0|NsC0|NsC0{{R{0|NsC0|Nj<0VRR9000030|Nj600R^90|Ns9000960|NsC0|Ns90000000RR90|NsC0|NuC0|msC0|NsC0|NsC0|NsC!|NsC0|NsC0{{R30|Ns9000000000000i030|NsCj{{R3}|Nsi0|NsC0|NsC000030|asC0{{z30|NsC0|Nj600RR900000000960|NsC000030FNsD0{{R30|NsC0|Nj$000000|NsC0|NjO0000000Z050|NsC0|NsC0ENsC0|NsC0|NsC000030|NsC0{{R30|NsC0|Nj600RR9600030|NsC0{$R30|Ns9000960|NsC00m030|Nj600RR90<Ns9000960{{R30|Ns9000960|NsC0000&0|NsCr|NsC0|Ns`0|NsC0{{R30|NsC0|Nj600RR9000030|Nje00RR90|Ns9000960{{R3000030|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0|Nj600RR9000030|NsC0{{R3000003000000RR90|NsC0PNsC0{{R30|NsC0|NsC0|NsC000030|NsC08NsC0|NVC0|N|600RR90|Ns9000960wNsC000000000000RR90|Ns900n960|NsC0+0032|Nj600RP90|NsC0|NsC0|NsC0|NsC0|NsC0|NsC000000*00007RR900000000960|NsC0|Ns9!000C00RR90|NsN0|N?600000000030|Nj600RR90|N{904096({{R30|NsCR1Nj6000K00UNsC0|8sC8|NsC0|NsC0|NsC0{{R300000000960|NsC0|NsC!|NsC0{{R300y0000096W|LsC000000000000RR90|NsC0|NmCP|NsC0|NsC0|Nj600RR9020030|NRC0{{R30rNs9000000000000000000960|NsC0}0000000000RR9000030|rsC^|NvC0|NsC0|NJ600RR90|Ns90000000RR9000030|NsC0|NsC0|NsC0|Nj6000000|NsC0|Nj6000i000000000960|NsC0|Ns80|NsC0=NsC0|NsA0|NsC0|NsC0|NsCz|NPC0|NsC0^NsC0|NsCm|NsC0|NsC0|NsC0|NsC000030|NsC0|NsC0|NsC0|NsC0{{R3g|NsC0|NsC0|NsC0|8YC0|NsC0|NsC0|NsC0|$sC0{{R30|NsC0|NsC0|NsC000030|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0|NcC0|NsC0|NsC0|NsC0|NsC0|Nj600RR9000030|Nj6000000|Ns900n0"

f = open("flag.bmp","wb")
f.write(base64.b85decode(x.encode()))
f.close()

Posted

in

,

by

Tags: