Implementing and Breaking Repeating Key XOR in Go

8 min readDec 15, 2020

Since starting as a security engineer, I’ve been working my way through the Cryptopals challenges and writing about them as I go. Here we will be looking at implementing repeating key XOR, and then how to break it. These are challenges 5 and 6 if you’re following along.

Hide the data!

We’re going to start by using repeating key XOR to encrypt some plaintext. What does this mean? We’ve already gone over what it means to XOR something here, but I’ll give you a little refresher. When we XOR two bits, the result is 1 if one bit is 0 and the other bit is 1 , otherwise it’s 0.

BYTE                 BITS
A                    01000001
B                    01000010
XOR Result           00000011

With single character XOR, we XOR’d every byte of our plaintext against one single byte, but for repeating key XOR, we’re going to cycle through the bytes of a key to XOR against. Let’s take the first word of our example and do it by hand so you can see what I’m talking about. We have the word Burning, and we are using the key ICE.

B         u         r         n         i         n         g
01000010  01110101  01110101  01101110  01101001  01101110  01100111I         C         E         I         C         E         I
01001001  01000011  01000101  01001001  01000011  01000101  01001001RESULT:
00001011  00110110  00110000  00100111  00101010  00101011  00101110

That seems simple enough, now that we can do it by hand, let’s write some code in go to do it for us.

Encrypting with code

The code for this is going to be pretty straight forward, since Go has the handy ^ operator that will XOR two bytes. We just need to loop through our plaintext byte array, and XOR against the correct byte in our key. The code will look something like this.

func EncryptWithRepeatingKey(input, key []byte) []byte { 
    eb := make([]byte, len(input))
  
    for i := 0; i < len(input); i++ {  
        eb[i] = input[i] ^ key[i%len(key)] 
    } 
    return eb
}

The output of this probably won’t be human readable, so we’ll hex encode the output. Our full function (with the test input) will look like this:

package mainimport (
 "fmt"
 "encoding/hex"
)func main() {
   pt := []byte(`Burning 'em, if you ain't quick and nimble
I go crazy when I hear a cymbal`)
 
   res := EncryptWithRepeatingKey(pt, []byte("ICE"))
 
   ret := make([]byte, hex.EncodedLen(len(res)))
   hex.Encode(ret, res)
 
 
   fmt.Printf("%s", ret)
}func EncryptWithRepeatingKey(input, key []byte) []byte {
   eb := make([]byte, len(input))  for i := 0; i < len(input); i++ {
    eb[i] = input[i] ^ key[i%len(key)]
   }
   return eb
}

If you run that over in the Go Playground, then you will get the result

0b3637272a2b2e63622c2e69692a23693a2a3c6324202d623d63343c2a26226324272765272a282b2f20430a652e2c652a3124333a653e2b2027630c692b20283165286326302e27282f

which matches up with our desired outcome, but we’re only halfway there. Now that we know how to implement repeating key XOR, we need to figure out how to break it!

Break all the things

In Challenge 6 we’re given an input file that has been repeating key XOR encrypted with an unknown key, and then base64 encoded. The very first thing we need to do is decode our input from base64. We do this a lot throughout Cryptopals, so I’ve written a handy little helper function. If you want to read up more about base64 encoding, I’ve got a write up here!

func Decode (encodedBytes []byte) ([]byte, error) {
    retLength := base64.StdEncoding.DecodedLen(len(encodedBytes))
    ret := make([]byte,retLength)) 
    _, err := base64.StdEncoding.Decode(ret, encodedBytes) 
    if err != nil {  
        return nil, err 
    } 
    return ret, nil
}

Now that we’ve gotten that out of the way, we need to figure out the length of our key. They give us a hint on this.

Using the distance of your pigs to get the key length

The first hint that we’re given is that we’re going to need to find out the hamming distance between two inputs. What is a hamming distance you ask? The hamming distance is the number of bits that are different between two inputs. For example if want to take the hamming disance between t and e:

t     01110100
e     01100101

The hamming distance between those to characters is 2, because the 4th and 8th bits are different.

In the challenge they give us these strings

this is a testwokka wokka!!!

and tell us that the hamming distance should be 37. I could print out all the individual bits and we could count all the different ones, but we’re going to need to get code to do this for us. To do this, we are going to need to use two bitwise operators: &(and) and <<(left shift)

& returns 1 if both compared bits are 1, otherwise it returns a 0

<< is left shift, and it takes all the bits in a byte, shifts them over to the left, and tacks on 0 at the end to keep the byte 8 bits.

The code looks like this:

func getHammingDistance(input1, input2 []byte) int { 
    var d int  
    for i := 0; i < len(input1); i++ {  
        for j := 0; j < 8; j++ {   
            if input1[i]&(1<<uint(j)) != input2[i]&(1<<uint(j)) {
                d++   
            }  
        } 
    } 
    return d
}

Breaking it down, we’re going to loop through every byte in input 1 (and input 2, it would probably be a good idea to make sure they’re the same length, but we can fix that later). Let’s use our example of t and e again to illustrate exactly whats going on here.

Inputs:
input1 = [t] or [01110100]
input2 = [e] or [01100101]var distance intOuter Loop Iteration 1: 
i = 0Inner Loop Iteration 1:
j = 0//reminder that 1 = 00000001
input1[0] = 01110100
1 << 0    = 00000001
& result  = 00000000 // no bytes are both 1 input2[0] = 01100101
1 << 0    = 00000001
& result  = 00000001 // last position are both 100000000 != 00000001, so distance += 1Inner Loop Iteration 2:
j = 1input1[0] = 01110100
1 << 1    = 00000010
& result  = 00000000input2[0] = 01100101
1 << 1    = 00000010
& result  = 0000000000000000 == 00000000, so continue loop, don't increase distanceect...

So using that function, we can get the hamming distance between two blocks. How is this useful? Well, for each of our possible key lengths (remember, 2–40), we’re going to:

Take the first key length of bytes from our cypher text
Get the hamming distance between the that and the 2nd key length of bytes
Normalize the results dividing the hamming distance by the key length

So let’s start with a key length of 2. We’ll need 4 bytes (2 blocks of 2).

Block 1: 00011101 01000010 
Block 2: 00011111 01001101

Our hamming distance between those two blocks is going to be 5. Normalized for our key length is going to be 2.5

Wait, why does it matter how far apart my pigs are?

It all boils down that the ASCII characters make up a somewhat small range of bytes, and the fact that two letters XOR’d against the same character aren’t going to be that different. For example, lets take the string Hello, World and XOR encrypt it using the key ICE. We will get these bytes

00000001 00100110 00101001 00100101 00101100 01101001 01101001 00010100 00101010 00111011 00101111 00100001

We know our key length is 3, but let’s test out a couple key lengths.

Block Length 2
Block 1: 00000001 00100110
Block 2: 00101001 00100101
Normalized Hamming Distance: 2Block Length 3 // This is the one we used
Block 1: 00000001 00100110 00101001
Block 2: 00100101 00101100 01101001
Normalized Hamming Distance: 1.67Block Length 4
Block 1: 00000001 00100110 00101001 00100101
Block 2: 00101100 01101001 01101001 00010100
Normalized Hamming Distance: 2.75

Putting it together

We have code to get the hamming distance between two blocks, but we need a way to implement it. To get more accurate results, we’re going to get the hamming distance of the first 4 blocks instead of the first 2. We’ll store the results in an array of structs that stores the key length and hamming distance, and sort the results on hamming distance, shortest to longest.

type keyLength struct { 
    length int
    strength float64
}func getKeyLengths(input []byte) []keyLength { 
    kl := make([]keyLength, 0) 
    for len := 2; len < 40; len++ {  
        var hd float64  
        for i := 0; i < 4; i++ {   
            si := len * i   
            hd += float64(getHammingDistance(input[si:si+len], input[si+len:si+(len*2)]))  }              hd = hd / float64(len)  
            kl = append(kl, keyLength{strength: hd, length: len}) 
        } 
        sort.Slice(kl, func(i, j int) bool { 
             return kl[i].strength < kl[j].strength 
        }) 
    return kl
}

Are you the Key Master?

We’ve ranked our key lengths, but now what do we do with this information? Because we’re still kind of guessing the key length, I’m doing the following steps for our 5 strongest key lengths (those that have the lowest hamming distances).

Break the cypher text up so that all characters that were encrypted with the same character are together.
For each of those groups, use the code we have for decrypting single byte XOR.
Put all our strings back together.
Measure the character weight of the result
Highest character weight is our result

1. Break up our input

If we go back our example of Hello, World with the key ICE , we want all the characters that were encrypted using I in one chunk, all the characters that were encrypted using C in one chunk, and all the characters encrypted with E in another chunk.

So we would end up with the encrypted versions of

[H l  r] [e o ! l] [l , o d]

The code:

func makeByteChunks(keyLength int, encryptedBytes []byte) [][]byte { 
    chunks := make([][]byte, keyLength) 
    for i, c := range encryptedBytes {  
        chunks[i%keyLength] = append(chunks[i%keyLength], c) 
    } 
    return chunks
}

2. Use single byte XOR Decryption

We wrote code for doing this here. In a nutshell, we give the most commonly used letter in english a score based on how commonly it’s used. We XOR decrypt our cypher text with every character, and the one that produces the best score is our key.

3. Put the strings back together

Now that used our nifty cipher, we need to put our chunks of code back together in one string.

func rebuildString(unencChunks [][]byte) []byte { 
    r := make([]byte, 0) 
    for i := 0; i < len(unencChunks[0]); i++ {  
        for _, sl := range unencChunks {   
            if i < len(sl) {    
                r = append(r, sl[i])   
            }  
        } 
    } 
    return r
}

4. Measure our result

Since we’re trying this out with our 5 strongest key lengths, we need a way to know which one is actually our key length. To do this, we’re going to use the same character weight method we used in decrypting single byte XOR here.

5. Profit

After measuring the strength of each of our results, the one with the best score is our answer!

func Decrypt(encryptedBytes []byte) []byte {  
    kls := getKeyLengths(encryptedBytes)  
    var ret []byte 
    var retstre int 
    for kli := 0; kli < 5; kli++ {  
        kl := kls[kli].length  
        chunks := makeByteChunks(kl, encryptedBytes)   
        unencChunks := applyXorCipher(kl, chunks)  
        r := rebuildString(unencChunks)  
        var score int  
        for _, b := range r {   
            score += GetCharWeight(b)  
        }  
        if score > retstre {   
            ret = r   retstre = score  
        }  
    } 
    return ret
}

Eureka!

There you have it, you can now encrypt and decrypt things using repeating key XOR. It’s not exactly secure, but the concepts used here are going to become very important down the line. As always, you can check out my code on github, where there is actual test coverage and a nifty command line interface. You can also look ahead, it’s usually a couple challenges ahead of where the articles are.