Where the devil is in crypto - hands on bit flipping

Tags: #<Tag:0x00007f8a145f31d8> #<Tag:0x00007f8a145f3110> #<Tag:0x00007f8a145f3048> #<Tag:0x00007f8a145f2f80> #<Tag:0x00007f8a145f2eb8>

This wiki post (last content update mid 2017) focuses on practical analysis of cryptography for pentesters / red teamers etc… This isn’t an academic approach.

Rather than emphasizing math it about common implementation challenges.
Please verify the information independently. In case you spot an error, please send an eMail to marius - at - because-security.com.

Due to technical limitations, the rendering of the mathematical equations needs some seconds, even when using browsers.

If you see something like a [Math Processing Error], please reload the site and check your browser’s JavaScript console. I understand that controlling JavaScript via browser plugins is sometimes a sane idea.

Future topics:

  • Hash profiling
  • Cipher profiling
  • Hash length attacks
  • Cryptography Threat Models
  • Implementation-attacks (Markov-based Brute-Force attacks)
  • technical details about GPU-driven attacks
  • Rainbow-table generation
  • Key Management

Hands on. Brain on. Risks off.

I keep this short. The internet is a very handy way to communicate, but without proper cryptography a lot of the trust we inherently assume to be there can just go away.

The emphasis of this short wiki post is, to enable (code) reviews for crypto-focused projects, and to produce threat models and risk assessments with value. After the basics of crypto, the article focuses on popular attacks. Mathematical attacks are relevant, but they are by far less common, and they rarely become available to common attackers.

Symmetric, asymmetric, mixed - I thought this was more complex?

Symmetric cryptography procedures use the same key for encryption and decryption. If you share the same password (key) between the sender and the recipient, it’s probably symmetric cryptography.

In '78 Rivest, Shamir and Adleman pioneered asymmetric cryptography with RSA. RSA uses a public and a private key.
– Think of OpenSSH: in the .ssh user folder there is a file called id_rsa and a file called id_rsa.pub. The .pub is the public key, which you can share.

For Email encryption, PGP is very popular, where you also need to generate a public and a private key. The public key can be uploaded to one of the many PGP key servers, which are hosted by organizations such as the MIT. Modern services, like Keybase.io (as you can see I use this as well) encrypt the private key on a server symmetrically, with a user password. If you chose a reasonably long password, the private key cannot be decrypted without a lot of computation resources. Keybase.io is an example of a PGP based hybrid encryption service. This service hosts the private key to fix some usability problems, which are especially relevant to mobile devices. It is not as secure as pure PGP. This service is not actual hybrid cryptography though.

Asymmetric cryptography is slower than symmetric cryptography (in modern computer systems). Due to this, there is a compromise, which is actual hybrid cryptography. Many implementations change the key per transfer (per network packet), and encrypt the actual data symmetrically. A transfer gets its (hopefully) unique session key. The session key is exchanged with the recipient in an asymmetric way.

Diffie Helman - math is for... defenders

Diffie Helman is a standard, which was published in 1978. It’s used to exchange keys (for hybrid cryptography) between sender and receiver via a public channel. Since 1997 we know that James H. Ellis, Clifford Cocks and Malcolm J. Williamson of the British GCHQ pioneered key exchange methods via public channels, but they were unable to publish.

Let’s grab 2 prime numbers under 1000, N and G.

\begin{align} N & = 109\\\\ G & = 337 \end{align}

Obviously computers would work with larger prime numbers. Alice gets her own private value X_{\textrm{Alice}} = 11, and X_{\textrm{Bob}} = 22. These are confidential, in the asymmetric scheme.
We assume an attacker is eavesdropping and that the attacker has N and G. But we want to confidentially exchange a key. The attacker does not have the X keys.

For Alice:

\begin{align} Y_{\textrm{Alice}} & = G^{X_{\textrm{Alice}}} \mod N\\\\ & = 337^{11} \mod 109 \\\\ & = 72 \end{align}

The same for Bob:

\begin{align} Y_{\textrm{Bob}} & = G^{X_{\textrm{Bob}}} \mod N\\\\ & = 337^{22} \mod 109 \\\\ & = 61 \end{align}

Now these values get exchanged, Alice \overset{72} \longrightarrow Bob and Bob \overset{61} \longrightarrow Alice. That’s as straight forward as it gets.

What did Alice intend to use as a key? Math knows:

\begin{align} M_{\textrm{Alice}} &= Y_{\textrm{Bob}}^{~~~~~X_{\textrm{Alice}}} \mod N\\\\ & = 61^{11} \mod 109\\\\ & = 36 \end{align}

For Bob:

\begin{align} M_{\textrm{Bob}} &= Y_{\textrm{Alice}}^{~~~~~X_{\textrm{Bob}}} \mod N\\\\ & = 72^{22} \mod 109\\\\ & = 36 \end{align}

So, 36 is the key, which was exchanged. Which could be used for textbook RSA for example.

Summary

  • In asymmetric schemes there are public and private keys
    • Diffie Helman is a handy example to exchange a key asymmetrically
  • In cryptography modulo is a common operation, but generally larger prime numbers are used
  • We assume that there is eavesdropping and that someone will perform crypt-analysis

Stream ciphers

  • A Stream Cipher encrypts one bit at a time. The length of the ciphertext is the same as the plaintext.

  • Examples:

    • RC4 (TLS, WEP or WPA with TKIP, which is RC4 based)
    • A5/1, A5/2, (GSM)
    • E0 (Bluetooth)
  • A stream cipher generates a keystream, which often is XOR’ed with the plaintext to generate the ciphertext. So the keystream would often be of the same length as the ciphertext so that it can be XOR’ed.

Initialization vector and key rotation

In a Stream Cipher, you never use the same key twice. In a Stream Cipher implementation keystream data is XOR’ed with the ciphertext; or vice versa. The algorithms are deterministic and produce the same keystream for a given key. That’s the reason why keys should not get reused. That is the crux of Stream Ciphers.

It’s impractical to change the entire key for each individual transmission; to change it per packet or per frame. Therefore modern crypto implementations split the key into two portions: the Initialization Value (IV) and the Secret. The IV gets a static length and gets added to the transmitted ciphertext. It gets changed per unique transmission data set. The IV must never repeat if the Secret remains the same. Let’s say there is a 128 bit key, and 32 bits are reserved for the IV. That results in 2^{32} permutations until the Secret has to be changed.

  • If we make the IV very long, we will have more permutations, but a weaker key.
  • If we generate the IV sequentially (as it’s not a secret), system interruptions may lead to IV collisions. Either because the IV history got lost, or because of collisions.
  • if multiple devices use the same IV, the coordination needs to make sure, that the key is not re-used. Even within an IoT mesh for example.

Block ciphers

A Block Cipher generates data blocks., That means AES 256 will generate blocks of 256 bits each. If the message is shorter, the block will be filled up. This is called Padding.

A Block Cipher should not allow a cryptographer to determine the message length since unlike Stream Ciphers the Block Ciphers do not allow assertions on ciphertext length. Blocks will always be of the same size.

  • Examples:
    • AES
    • DES
    • 3DES
    • A5/3
    • Carmellia
    • Blowfish / Twofish

Block Modes Modes

The Block Cipher shortcuts are popular because every System Administrator sees them in VPNs, Load Balancers or even Web Server configs. What’s the difference between AES-CTR and AES-CBC?!

  • CTR stands for Counter
  • ECB stands for Electronic Codebook
  • CBC stands for Cipher Block Chaining

ECB is not the best option

Some modes are better than others. AES has weak modes, like ECB:

AES-ECB will generate the same ciphertext each time. Usually you want ciphertext to be indistinguishable from randomness (pseudo-randomness).

Other Block Cipher modes offer uniqueness to each encryption operation.

Thus: if you see an implementation, that uses ECB, you should reflect this within the risk assessment: what does this mean for the application / project?

image
(Source: Cryptography StackExchange)

References

{1} Demo: How to get the ECB Penguin with OpenSSL

Cipher Block Chaining

image
(Source: Cyprography StackExchange)

  • The \oplus here stands for XOR.
  • The IV gets XOR’ed with the Plaintext Block. That’s just to get the motor started, like on the old cars which needed a hand crank.
    • From there on a domino-like chain of re-using the prior Ciphertext Blocks continues, and the key is used for the encryption

Now… where do we get the IV for the first block from? Yes, the IV should not repeat: we know that from Stream Ciphers.

A repeating IV can be used as an attack vector in crypt-analysis. Assuming the key and IV remains the same, and the resulting block will be the same. CBC implementations with a static key and IV are weak!

Let's do it wrong with AES-CBC!

Check this out on a Linux command-line with OpenSSL. Let’s do everything wrong.

~ » key="123"                                                                                       
~ » iv="321"    

~ » echo "secret" > packet1
~ » echo "secret1" > packet2
~ » echo "secret" > packet3

~ » openssl enc -aes-192-cbc -in packet1 -K $key -iv $iv | xxd -Eu                                 
00000000: 3386 ac8b c8f4 cce6 ace9 a2c9 44a8 130f  .f..H4.W.ZsI.y..

~ » openssl enc -aes-192-cbc -in packet2 -K $key -iv $iv | xxd -Eu                                  
00000000: 445d 7e88 9d83 d4bd 2eae f3e3 8cb3 9dc0  .)=h.cM]..3T...{

~ » openssl enc -aes-192-cbc -in packet3 -K $key -iv $iv | xxd -Eu                               
00000000: 3386 ac8b c8f4 cce6 ace9 a2c9 44a8 130f  .f..H4.W.ZsI.y..

The openssl command here reads the files packet1, packet2 and packet3. The variables iv and key remain the same for each invocation. The result is that packet1 and packet3 can be correlated to be the same, due to the static IV and key. This is visible here in both ASCII and Hex.

Result: use a unique IV per encryption. Even if it’s a sequential change, it gets much harder to perform crypt-analysis.

CTR (Counter)

CTR mode shares a Stream Cipher characteristic. In opposite to CBC mode, it does not feed the previous block’s ciphertext into the next block.

image
(Source: Cryptography StackExchange)

The Nounce is the IV, which is concatenated with the fixed-length Counter. The IV isn’t as long as the block. The fixed-length counter gets added. The keystream output then is XOR’ed with the Plaintext input to generate the ciphertext.

  • with CTR mode the encrypting host can encrypt all blocks in parallel, concurrently in multiple operations. CBC is sequential because it always needs the prior block.
  • CTR mode prevents the ECB-effect by using unique IV and Counter combinations.

But CTR works a little like a Stream cipher because the plaintext and keystream data are getting XOR’ed. That means the crux of Stream Ciphers applies: the keys should not get reused with the same IV:

References

{1} Dangers of using CTR mode

Padding

Blocks Ciphers need to generate blocks of even length. In order to do this, they make use of Padding to fill up the missing bytes.

  • Commonly there are two Padding algorithms: PKCS#5 (for 8 byte blocks) and PKCS#7 (for 8 and 16 byte blocks).

image
(Source: blog.gdssecurity)

If there are 5 bytes left, it’s filled up with 0x05, for 6 0x06 and so on. If there are 0 bytes left, another block is added with padding values only, 0x08.

How do you identify an encryption algorithm in crypt-analysis?

There heuristics to do this:

  • the data size cannot be divided by 8 (evenly): probably RC4
  • the data size can be divided through 16: probably AES 128 (128 : 16 = 8), 16 byte block length
  • sometimes by 16, always by 8: (3)DES (with 8 byte block length - 64 bit)

Target systems will have a documentation (spec sheet, used components) or filings with the FCC or European national equivalents.

How do you identify a hashing algorithm in crypt-analysis>

Hashes have different lengths as well.

You can use a tool called hashID, which will list what hash type you may be looking at.

Okay, how do you identify whether it's encrypted at all?

Certain binary-type or proprietary protocols may look like encryption but aren’t.

In order to make an estimation here, whether it’s encrypted or not, entropy analysis is a quick initial start-point. Different alphabets (like ASCII) have an encoding entropy.

This can be done with a tool, called pcaphistogram.pl.

Properly encrypted data cannot be distinguished from randomness:

image
(Source Packetstan)

Plaintext data, due to the alphabet being used, shows different characteristics:

image
(Source Packetstan)

pcaphistogram iterates over each byte of the payload, and uses GNUplot.

  • if the byte frequency at the values around 64 bytes is high, this indicates ASCII cleartext
  • if the distribution is even, this indicates proper encryption

Estimations upon randomness

For a session-specific analysis, rather than a per-byte observation, the tcpick or Scapy in combination with GNU ent may be applied. GNU Ent was developed by John Walker, who founded Autodesk afaik.

ent can be used to evaluate pseudo-random number generators (like encryption algorithms). The GNU Crypto project may also have a tool to quantify randomness (better than ENT as they say).

  • Ent applies statistical analysis to identify entropy patterns in files. Entropy is scored. It should be close to 8. If it’s slower, the observation is that the encryption may be weak. It may just be obfuscation!
  • Ent works with the Payload. If you perform network analysis, the rest (Frame and Ethernet headers) need to be stripped. There must to be patterns anyways unless you are looking at low-level MAC layer encryption technology.

Summary: AES - checkbox - done = dumb

  • AES (Rijndael) is just a block cipher and needs to be used with the right mode to be efficient
    • think twice, before you greenlight ECB mode. It’s your job to assess whether it’s appropriate.
  • IVs should always be unique per operation, but the length is limited which means that the IV space can be enumerated
  • Don’t reuse a key, when you use a Stream Cipher
  • Encryption algorithms are just like PRNGs, like Pseudo-Random-Number-Generators. Their entropy can be measured, and visually estimated to avoid hardcore mathematics
  • Encryption is used in context with a protocol (like Ethernet RFC 894 etc.). Due to this, the randomness is limited to the payload. Traffic Analysis is still possible.
  • The encryption or hash algorithm can be narrowed down with heuristic patterns

Implementation attacks

CBC bit flips

CBC bit-flip attacks are possible if you can specify an encrypted input. You’d poke it, and replace select bytes.

Padding Oracle and POODLE

The preferences of a Padding Oracle attack are that

  • we have the IV (c_0),
  • the ciphertext (c_1),
  • and that we can heuristically determine the encryption; for example via the blocksize or other observations upon the randomness.

In literature about the Padding Oracle attack scheme you often read c_0c_1, which indicates that the IV is transmitted before the actual ciphertext in a concatenated fashion. Some people assume that this indicates a multiplication, in a Linear Algebra style notation where the * is omitted sometimes. Not here.

In most cases, Padding Oracle attacks are successful against CBC mode implementations, when the ciphertext is XOR’ed in a Stream Cipher fashion with the IV.

image
(Source: Rob Heaton)

  1. You can send an IV (c_0) of 0 's to the potentially vulnerable server. Then c_1.
  2. The app server will attempt decryption, and it will keep the keystream data confidential.
    2.1 But it will XOR the keystream data with c_0
    2.2 This results in invalid plaintext
  3. The last byte of the plaintext will be checked for the Padding Byte
    3.1. Given that this is most likely going to be invalid, the server will respond with an error code
    3.2 As the attacker we can know, that the last byte was invalid due to this
    3.3. As the attacker, we can continue to modify c_0 in a brute force fashion.
    3.4 Instead of 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 c_0 becomes 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01. The process is repeated until the server does not respond with an error code, that indicates invalid padding.
  4. We want to identify the keystream byte, once we have a valid IV. CBC behaves like a Stream Cipher, where the IV and plaintext are XOR’ed.
    4.1 We XOR the keystream byte with the valid IV byte. The common formula is m_i = x_{i-1} \oplus c_{i-1}. m_i is the message (the plaintext) at position i.
    4.2 Since we have recovered 1 byte of the plaintext, we know the padding size now, and the IV.
    4.3. Note that due to CBC, the prior ciphertext gets used to continue the attack.

In simple words: the Padding Oracle attack scheme abuses Padding as an attack vector to decrypt the messages and causes a chain-decryption due to CBC mode’s inherent reliance on the prior block.

Reference

{1} Cryptography StackExchange on Padding Oracle attacks

{2} The Padding Oracle Attack by Rob Heaton

Changelog

Q3 2017 - around that time this content was collected (state)
2021-05-06T22:00:00Z - published under public-wiki-drafts, typos fixed