Network Security slide set

## Security through obscurity
- The use of secrecy to hide the cryptographic system being used
  - Contrast this with algorithms such as RSA, which are publicly analyzed
- The problem is that somebody will figure out how it works ("many eyes make all bugs shallow" - Linus Torvalds)
  - And then, if there is a flaw (due to lack of peer review), the system is vulnerable
- [Reference](http://en.wikipedia.org/wiki/Security_by_obscurity)

<h2 class="r-fit-text">Block-ciphers vs. stream ciphers</h2>

- Block ciphers operate on fixed-size blocks of text (128 bits for AES, for example)
  - [Reference](http://en.wikipedia.org/wiki/Block_cipher)
- Stream ciphers encrypt data as it is provided, character-by-character
  - I prefer the name 'character cipher' over 'stream cipher'
  - [Reference](http://en.wikipedia.org/wiki/Stream_cipher)

## Cipher Taxonomy
![cipher taxonomy](images/encryption/cipher-taxonomy.webp)
- [Reference](http://en.wikipedia.org/wiki/Ciphers)

## One-time pad (OTP)
- A substitution cipher
- Take a *random* string that is as long as the plain text you want to encrypt
  - Use modular arithmetic (or XOR, or Vigenere) to determine the encrypted version
  - Plain text:    helloworld
  - One-time pad: zdxwhtsvtv
  - Encrypted:    hijiwqhnfz
- [Reference](http://en.wikipedia.org/wiki/One-time_pad)
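
The example above can be reproduced with a short sketch. It assumes letters are numbered a=1 through z=26 (so z acts as 0 modulo 26), which is the numbering that matches the ciphertext shown; the function names are illustrative, not from any library.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a'..'z'

def _num(ch: str) -> int:
    """Map a letter to 1..26 (assumed numbering: a=1, ..., z=26)."""
    return ALPHABET.index(ch) + 1

def _chr(n: int) -> str:
    """Map a residue mod 26 back to a letter (0 is treated as z)."""
    return ALPHABET[(n - 1) % 26]

def otp_encrypt(plain: str, pad: str) -> str:
    """Add pad letters to plaintext letters, modulo 26."""
    if len(pad) < len(plain):
        raise ValueError("pad must be at least as long as the plaintext")
    return "".join(_chr((_num(p) + _num(k)) % 26) for p, k in zip(plain, pad))

def otp_decrypt(cipher: str, pad: str) -> str:
    """Subtract pad letters from ciphertext letters, modulo 26."""
    return "".join(_chr((_num(c) - _num(k)) % 26) for c, k in zip(cipher, pad))

cipher = otp_encrypt("helloworld", "zdxwhtsvtv")
print(cipher)                             # hijiwqhnfz
print(otp_decrypt(cipher, "zdxwhtsvtv"))  # helloworld
```

A real pad would have to come from a secure random source rather than being typed in by hand.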

## One-time pad (OTP) analysis
- Pros
  - Proven to be perfectly secure if:
    - the pad is truly random
    - the pad is only used once
    - the pad is kept secret
  - Thus, it is the ONLY cryptosystem with perfect secrecy
  - It can be performed by hand
- Cons:
  - Good for short messages; it's hard to transport large pads (i.e. network communication)
  - Does not provide message authentication
  - How do you get the pad to the recipient?

## Re-using a one-time pad

Figure (images not recovered), three rows of the form image ⊕ image = image:
- Use an OTP
- Re-use the same OTP
- Extract the images

This example is from StackExchange
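
Why re-use is fatal can be sketched with byte-wise XOR. The messages and names below are made up for illustration; the point is only that XOR-ing two ciphertexts made with the same pad cancels the pad.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"attack at dawn!!"
m2 = b"retreat at noon!"
pad = secrets.token_bytes(len(m1))  # one random pad, wrongly used twice

c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

# The pad cancels: c1 XOR c2 equals m1 XOR m2, with no key material left.
leak = xor_bytes(c1, c2)

# Knowing (or guessing) one plaintext now reveals the other.
recovered_m2 = xor_bytes(leak, m1)
print(recovered_m2)
```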

## Secret key cryptography
- The function and/or key to encrypt/decrypt is a secret
 - (Hopefully) only known to the sender and recipient
- The same key encrypts and decrypts
- How do you get the key to the recipient?
- Also called symmetric cryptography
  - As it's the same key that both encrypts and decrypts
- Example algorithms: AES, DES

## Public key cryptography
- Everybody has a key that encrypts and a separate key that decrypts
 - They are not interchangeable!
- The encryption key is made public
- The decryption key is kept private
- Example algorithms: RSA, ECDSA

## Public key cryptography goals
- Key generation should be relatively easy
- Encryption should be easy (polynomial time)
- Decryption should be easy (polynomial time)
  - With the right key!
- Cracking should be very hard (exponential time)
- Mathematical concepts need to be used where the operations to do the first three are "easy" and the operation to perform the last one is "hard"

## Elliptic curves
<img src='images/encryption/secp256k1-e-4points-line.svg' class='stretch'>

## Hard versus Easy Math
- Integer factorization
  - Generating a prime number is (relatively) easy
  - Factoring an integer, when there are not small prime factors, is hard
- Discrete logs
  - Computing $x = a^b\text{ mod }c$ is (relatively) easy
  - Computing $b$ when given $x$, $a$, and $c$, and you know $x = a^b\text{ mod }c$, is hard
- Elliptic curves
  - Adding points together is (relatively) easy
    - Repeated efficient adding yields multiplication of a point by a scalar: $P=d \otimes G$
  - Computing $d$ when you are given $G$ and $P$ and you know $P=d \otimes G$, is hard
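
The asymmetry for discrete logs can be sketched as follows. The small prime 1019 and the base 5 are arbitrary choices for illustration; real systems use numbers thousands of bits long, where the brute-force loop below is hopeless.

```python
def discrete_log_bruteforce(x: int, a: int, c: int):
    """Find the smallest b >= 0 with a**b % c == x by trying every exponent."""
    value = 1  # a**0
    for b in range(c):
        if value == x:
            return b
        value = (value * a) % c
    return None  # x is not a power of a modulo c

# Easy direction: Python's three-argument pow uses fast modular exponentiation.
x = pow(5, 117, 1019)

# Hard direction: the loop is linear in c, hence exponential in c's bit length.
b = discrete_log_bruteforce(x, 5, 1019)
print(b, pow(5, b, 1019) == x)
```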

## Review: RSA
- Uses prime numbers and modular exponentiation; its security rests on the hardness of integer factorization
- Key generation:
 - Generate two large primes $p$ and $q$ and compute $n=p \ast q$
 - Choose $1 < e < n$ which is relatively prime to $(p-1)(q-1)$ (65537 is almost always chosen)
 - Compute $d$ such that $d * e \equiv 1 \pmod{(p-1)(q-1)}$
 - That's a modular inverse, found with the extended Euclidean algorithm
- Encryption:
 - Put into blocks, padding as necessary
 - Encryption formula is $c = m^e\text{ mod }n$
- Decryption
 - Decryption formula is $m = c^d\text{ mod }n$
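
The formulas above can be tried with tiny textbook numbers. This is a toy sketch only: the primes 61 and 53, the exponent 17, and the message 65 are illustrative assumptions, and there is no padding, so it is nowhere near secure.

```python
# Key generation with toy primes (real keys use primes of 1024+ bits each).
p, q = 61, 53
n = p * q                 # 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, relatively prime to phi
d = pow(e, -1, phi)       # modular inverse of e (Python 3.8+); 2753

# Encryption: c = m^e mod n
m = 65
c = pow(m, e, n)

# Decryption: m = c^d mod n
decrypted = pow(c, d, n)
print(n, d, c, decrypted)
```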

<h2><a href="http://xkcd.com/538">Security</a></h2>
<img class="stretch" src="http://imgs.xkcd.com/comics/security.png" title="Actual actual reality: nobody cares about his secrets. (Also, I would be hard-pressed to find that wrench for $5.)" alt="Security">

## Review: Signatures
[![](https://upload.wikimedia.org/wikipedia/commons/d/d5/JohnHancocksSignature.svg)](https://commons.wikimedia.org/wiki/File:JohnHancocksSignature.svg)
- A signature is not to hide (encrypt) the message
 - Instead, it's to assure someone that the message is what you say it is
- To "sign" a message:
 1. Write a message, and determine the SHA-256 (or similar) hash
 2. Encrypt the hash with your private (decryption) key
 3. Anybody can verify that you created the message because ONLY the public (encryption) key can decrypt the hash
 4. The hash is then verified against the message
- Formally: given message $m$, hash function $h()$, and public-key operations $e_{pub}()$ and $e_{pri}()$, the signature is $e_{pri}(h(m))$
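
The signing steps can be sketched with a toy RSA key. The tiny primes and the helper names (`toy_hash`, `sign`, `verify`) are assumptions for illustration; reducing the hash modulo $n$ is only needed because the toy modulus is so small, and real signature schemes add padding.

```python
import hashlib

# Toy RSA key built from tiny textbook primes (far too small for real use).
p, q = 61, 53
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

def toy_hash(message: bytes) -> int:
    """SHA-256 of the message, reduced mod n so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Apply the private-key operation to the hash."""
    return pow(toy_hash(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Undo the signature with the public key and compare against the hash."""
    return pow(signature, e, n) == toy_hash(message)

signature = sign(b"hello")
print(verify(b"hello", signature))
```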

## Review: Randomness: LCG
[![](https://upload.wikimedia.org/wikipedia/commons/1/1c/6sided_dice_%28cropped%29.jpg)](https://commons.wikimedia.org/wiki/File:6sided_dice_(cropped).jpg)
- [Linear congruential generator (LCG)](https://en.wikipedia.org/wiki/Linear_congruential_generator)
- $X_{n+1}=(a*X_n+c) \text{ mod } m$
 - $m$ is the modulus; must be positive
 - $a$ is the multiplier: $0 < a < m$
 - $c$ is the increment: $0 < c < m$
 - $X_0$ is the seed value
- Let $m=9$, $a=4$, $c=7$
- We'll arbitrarily decide to start the sequence at 1: $X_0=1$
- $X_{n+1}=(a*X_n+c) \text{ mod } m$
 - $X_0 = 1$
 - $X_1 = (4*1+7) \text{ mod } 9 = 2$
 - $X_2 = (4*2+7) \text{ mod } 9 = 6$
 - Rest of the sequence: 4, 5, 0, 7, 8, 3, and then back to 1
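
The recurrence is short enough to run directly. The generator below is a minimal sketch using the parameters from this slide; the function name is arbitrary.

```python
from itertools import islice

def lcg(seed: int, a: int, c: int, m: int):
    """Yield X_0, X_1, ... where X_{n+1} = (a * X_n + c) mod m."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m

sequence = list(islice(lcg(seed=1, a=4, c=7, m=9), 10))
print(sequence)  # [1, 2, 6, 4, 5, 0, 7, 8, 3, 1]
```

The output is fully determined by the seed and repeats after at most $m$ values, which is why an LCG is not suitable for generating keys.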

## Review: Randomness
[![](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d0/Black_Polyhedral_Game_Dice.jpg/1024px-Black_Polyhedral_Game_Dice.jpg)](https://commons.wikimedia.org/wiki/File:Black_Polyhedral_Game_Dice.jpg)
- To create secure keys, we need a good random number generator
 - Cryptographically Secure Pseudo Random Number Generator (CSPRNG)
- It requires:
 - A good algorithm
 - A good source of randomness (entropy): environmental sensors, timings, etc.
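
In Python, for instance, the standard `secrets` module draws from the operating system's CSPRNG. The sizes below are illustrative choices, not requirements.

```python
import secrets

key = secrets.token_bytes(32)     # 256 bits of key material
nonce_hex = secrets.token_hex(12) # 12 random bytes, hex-encoded
pick = secrets.randbelow(2**128)  # uniform integer in [0, 2**128)

print(len(key), nonce_hex)
```

The `random` module, by contrast, is a deterministic generator intended for simulation and should not be used for keys.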

## Key Exchange
[![](https://upload.wikimedia.org/wikipedia/commons/thumb/9/98/Keys_%284602593426%29.jpg/1024px-Keys_%284602593426%29.jpg)](https://commons.wikimedia.org/wiki/File:Keys_(4602593426).jpg)
- If you want to communicate with another party, how do you agree on a shared secret key?
 - One solution: Key stores, but only if you both have generated public/private key sets
 - Not always practical for all security protocols
- Another solution: [Diffie–Hellman key exchange](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange)
 - This version uses discrete logs
 - A variant called [ECDH](https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman) uses elliptic curves

## [Diffie–Hellman key exchange](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange)
[![](https://upload.wikimedia.org/wikipedia/commons/8/88/Diffie_and_Hellman.jpg)](https://commons.wikimedia.org/wiki/File:Diffie_and_Hellman.jpg)
- Alice and Bob agree on public values *g* (generator) and *p* (prime or modulus)
- Alice picks a private *a* and sends to Bob $A=g^a \mod p$
- Bob picks a private *b* and sends to Alice $B=g^b \mod p$
- Because $g^{ab} \mod p = g^{ba} \mod p$, both can compute the secret key
 - Alice received $g^b \mod p$ and computes $(g^b \mod p)^a \mod p = g^{ab} \mod p$
 - Similarly for Bob

## DH key exchange example
From [Wikipedia](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange#Cryptographic_explanation); blue are public, red are secret
<ol style="font-size:80%;line-height:110%;">
<li>Alice and Bob agree to use a modulus p = 23 and base g = 5 (which is a primitive root modulo 23).</li>
<li>Alice chooses a secret integer a = 4, then sends Bob A = g<sup>a</sup> mod p
<ul><li>A = 5<sup>4</sup> mod 23 = 4</li></ul></li>
<li>Bob chooses a secret integer b = 3, then sends Alice B = g<sup>b</sup> mod p
<ul><li>B = 5<sup>3</sup> mod 23 = 10</li></ul></li>
<li>Alice computes s = B<sup>a</sup> mod p
<ul><li>s = 10<sup>4</sup> mod 23 = 18</li></ul></li>
<li>Bob computes s = A<sup>b</sup> mod p
<ul><li>s = 4<sup>3</sup> mod 23 = 18</li></ul></li>
<li>Alice and Bob now share a secret (the number 18)</li>
</ol>
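
The same exchange can be checked in a few lines. This sketch reuses the toy values p = 23 and g = 5 from the example; the helper names are illustrative, and real exchanges use very large primes and randomly chosen secrets.

```python
p, g = 23, 5  # public modulus and base

def dh_public(private: int) -> int:
    """Value sent over the wire: g^private mod p."""
    return pow(g, private, p)

def dh_shared(their_public: int, my_private: int) -> int:
    """Shared secret: (their public value)^my_private mod p."""
    return pow(their_public, my_private, p)

a, b = 4, 3                        # Alice's and Bob's secrets
A, B = dh_public(a), dh_public(b)  # 4 and 10, sent in the clear

s_alice = dh_shared(B, a)
s_bob = dh_shared(A, b)
print(A, B, s_alice, s_bob)  # 4 10 18 18
```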

## DH key exchange example as colors
[![DH key exchange as colors](images/encryption/Diffie-Hellman_Key_Exchange.svg)](https://commons.wikimedia.org/wiki/File:Diffie-Hellman_Key_Exchange.svg)

## Colors in video (at 2:43)
<iframe width="720" height="540" src="https://www.youtube.com/embed/YEBfamv-_do?t=163&rel=0" frameborder="0" allowfullscreen></iframe>

If that doesn't work, try the [direct YouTube link](http://youtu.be/YEBfamv-_do?t=163), but start at 2:43

<h2 class="r-fit-text">DH key exchange in practical use</h2>

- Used in many cryptographic protocols
  - Including ones we will see shortly
- Uses *very* large numbers created with a very good pseudo-random number generator

## MITM versus DH key exchange?
- Possible if Eve actively controls the communication medium
- If s/he is only a *passive* listener, then it is not possible
- How to avoid an active compromiser:
  - Post on a public forum
  - Use communication methods s/he can't compromise, such as shortwave radio
- Recap:
  - It is secure against an *eavesdropper*
    - Even if transmitted in plaintext
  - But *not* against somebody who can modify the messages

## Forward Secrecy (FS)
[![](https://upload.wikimedia.org/wikipedia/commons/4/43/Das_Geheimnis_-_Le_secret.jpg)](https://commons.wikimedia.org/wiki/File:Das_Geheimnis_-_Le_secret.jpg)
- Also known as perfect forward secrecy (PFS)
- The concept that a future compromise of keys...
 - ... including the server's private keys...
- Will *not* allow decryption of past sessions
- Solution: generate a unique *session key* for each encrypted connection
 - And discard once done
- Communicating the session key via public key cryptography is *not* forward secret
- But DHE, even in plaintext, is

## Definition
- A certificate is:
  - A public key
  - Information about the key (algorithm used, version number, etc.)
  - An identity (hostname, individual, etc.)
  - Other meta-data (expiration date, etc.)
  - A *signature* from a *certificate authority*
- Most common format is [X.509](https://en.wikipedia.org/wiki/X.509)

## Diagram of an X.509 cert
![x.509 certificate](images/encryption/x.509-cert.svg)

## Certificate usage
[![https certificate](https://upload.wikimedia.org/wikipedia/commons/4/4b/ServerCertificate.png)](https://commons.wikimedia.org/wiki/File:ServerCertificate.png)
- All websites that support https provide such a certificate to any client
- Browsers can display certificate information, such as shown here
- We can view one via our web browser, such as [www.cs.virginia.edu](https://www.cs.virginia.edu/computing)

## Certificate validity
- All certificates are *signed* with the (private) key of a certificate authority (CA)
  - Their public keys are well known, and are hard-coded into browsers
  - Or determined by key exchange protocols discussed in this slide set
- One can verify such a certificate by the (known) public key of the CA
- One can "self sign" a certificate, but modern browsers typically won't allow viewing without manual exceptions

## Certificate Chains

An entity (such as virginia.edu) can hold an *intermediate* certificate, which allows it to sign for entities in its division (such as news.virginia.edu)

## Certificate Authorities (CAs)
- By definition, they have a so-called *root* certificate
  - Although it may not be trusted by anybody else
- Any CA can "vouch" (i.e., sign) for *any* certificate
  - For any website or any (likely new) CA
- Thus, the security is based on the weakest CA
- What happens if a CA is compromised or goes rogue?

## CA Revocation
- This has happened!  In February 2018.
  - [Google to kill Symantec certs in Chrome 66, due in early 2018](https://www.theregister.co.uk/2017/09/12/chrome_66_to_reject_symantec_certs/)
  - Firefox also: [Distrust of Symantec TLS Certificates](https://blog.mozilla.org/security/2018/03/12/distrust-symantec-tls-certificates/)
- The reason: a site called [Trustico](https://www.trustico.com/) resold Symantec-signed certificates
  - The 23,000 certificates Trustico sold were "compromised"
    - They still had the private keys (a big no-no)
  - They were sold under the umbrella of Symantec, so Symantec was partly at fault as well

## Trusting Trust
- A server's certificate is signed by a Certificate Authority
  - Perhaps through an intermediary certificate
- So we trust the server's certificate because the CA vouches for it
- And we trust the CA because... everybody else does?
- At some point, we have to trust that the system is well designed and that it works

<h2 class="r-fit-text">Root CA hack: DigiNotar (Sept 3, 2011)</h2>

- Dutch certificate authority [DigiNotar](https://en.wikipedia.org/wiki/DigiNotar) suffered a security breach
- Result: ~500 fraudulent certificates issued
- Target: 300k Iranian Gmail users
  - Gmail users then compromised via MITM attack
- Whodunit?
  - Dutch government's investigation stated the Iranian gov't was behind it
  - [Bruce Schneier](https://en.wikipedia.org/wiki/Bruce_Schneier) claims it was "either the work of the NSA, or exploited by the NSA" ([ref](https://www.schneier.com/blog/archives/2013/09/new_nsa_leak_sh.html)); others dispute this
- Result: all DigiNotar certs were blacklisted; company closed a week later

## Root CA Abuse: CNNIC (Apr 2015)
- [China Internet Network Information Center (CNNIC (sic))](https://en.wikipedia.org/wiki/China_Internet_Network_Information_Center) handles domain registry and certificates in China
  - Applied for a trusted root cert to Mozilla in 2009; later to Microsoft
- In 2015, an intermediate CA issued by CNNIC was found to have issued fraudulent certs for Google domains
  - An Egypt-based firm used it to impersonate some Google domains
- Google revoked CNNIC's root certificate
- April 2015: Google & Mozilla announced they were no longer trusting CNNIC's root cert

## Root CA Misuse: WoSign and StartCom (2016/2017)
- WoSign was (is?) China's largest certificate issuer, owned by [Qihoo 360](https://en.wikipedia.org/wiki/Qihoo_360)
  - [StartCom](https://en.wikipedia.org/wiki/StartCom), an Israeli certificate authority, was acquired in secret by WoSign
- WoSign and StartCom issued "hundreds of certificates with the same serial number in just five days"
  - Including one for GitHub
- Chrome and Mozilla revoked their root certificate recognition in 2017

## Trusted root cert count
- For [Apple devices](https://support.apple.com/en-us/105116): 159 as of Feb 21, 2024
- For [Mozilla / Firefox](https://wiki.mozilla.org/CA/Included_Certificates): 170 as of Feb 21, 2024
- For [Microsoft](https://ccadb.my.salesforce-sites.com/microsoft/IncludedCACertificateReportForMSFT): 487 as of Feb 21, 2024
- Other platforms (Chrome, Android, etc.) only seem to list how to find it on the browser (Chrome) or device (Android)

## SSL & TLS
- They are cryptographic protocols used to encrypt network communication
- SSL: Secure Sockets Layer
  - v1.0 never released due to flaws
  - v2.0 released in 1995; broken and not used
  - v3.0 released in 1996; broken and not used
- TLS: Transport Layer Security, the successor to SSL
  - v1.0 released Jan 1999; currently deprecated
  - v1.1 released Apr 2006; currently deprecated
  - v1.2 released Aug 2008; in widespread use
  - v1.3 released Aug 2018; currently being adopted

## TLS Usage
- TLS can carry just about any application protocol
- Most common use: HTTP carried over TLS, which yields HTTPS
- Also used for encrypting mail (SMTP)
- A programmer can use it for any other purpose
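
As one illustration of programmatic use, Python's standard `ssl` module can wrap an ordinary TCP socket. The sketch below only builds the client configuration; the commented lines show how it might be applied, assuming network access and the host used elsewhere in these slides.

```python
import ssl

# Default client settings: certificate verification and hostname checking on.
context = ssl.create_default_context()

# Refuse SSL and the deprecated TLS 1.0 / 1.1 versions.
context.minimum_version = ssl.TLSVersion.TLSv1_2

# To use it (requires network access), wrap a plain TCP socket:
#   import socket
#   host = "www.cs.virginia.edu"
#   with socket.create_connection((host, 443)) as sock:
#       with context.wrap_socket(sock, server_hostname=host) as tls:
#           print(tls.version())
```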

## Public vs. Symmetric Cryptography
- Public key cryptosystems (such as RSA) are:
  - Very secure: one message will take all the computers in the world longer than the life of the universe to crack
  - Very slow: not feasible for real-time data transmission (downloading a large file, streaming, etc.)
- They are typically used for key exchange, and then symmetric key cryptography is used for the rest of the communication

## TLS 1.2 Handshake, part 1
- Client opens a network connection to server
- Client sends *hello*, including a random number $R_c$ and supported ciphers
- Server responds with *hello*, including a random number $R_s$, selection of which client-supported cipher to use, its certificate, and requests client certificate
- Client checks server certificate, and sends its own cert, and a signature of all previous messages sent
  - Signed with the client certificate's private key
- Server checks client certificate and signature

## TLS 1.2 Handshake, part 2
- Client generates $PMS$ (Pre-Master Secret) -- a session key -- and sends it to server, encrypted with server's public key
- Both sides compute $MS$ (Master Secret) from $PMS$, $R_s$, and $R_c$
- Client sends a message to switch over to using the $MS$ for symmetric encryption
  - Server confirms switching over to using the $MS$ for encryption
- Both sides terminate the TLS handshake, and use $MS$ as the symmetric encryption key
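
How both sides arrive at the same $MS$ can be sketched as below. This follows my reading of the TLS 1.2 pseudo-random function in RFC 5246 (HMAC-SHA256 expansion with the label "master secret"); treat it as an illustration rather than a conforming implementation. The random inputs are placeholders.

```python
import hashlib
import hmac
import secrets

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """HMAC-based expansion: concatenate HMAC(secret, A(i) + seed) blocks."""
    out = b""
    a = seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]

def master_secret(pms: bytes, r_c: bytes, r_s: bytes) -> bytes:
    """Derive the 48-byte MS from PMS and the two hello randoms."""
    return p_sha256(pms, b"master secret" + r_c + r_s, 48)

r_c = secrets.token_bytes(32)  # client random
r_s = secrets.token_bytes(32)  # server random
pms = secrets.token_bytes(48)  # pre-master secret chosen by the client

ms_client = master_secret(pms, r_c, r_s)
ms_server = master_secret(pms, r_c, r_s)
print(ms_client == ms_server, len(ms_client))
```

Because $R_c$ and $R_s$ are fresh for every connection, the same $PMS$ would still give a different $MS$ each time.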


## TLS 1.2 pcap
- See [https-tls-1.2.pcap](pcaps/https-tls-1.2.pcap)
- It was a request to [https://www.cs.virginia.edu/computing](https://www.cs.virginia.edu/computing)
- Packets:
    - Client Hello is packet 8
    - Server Hello is packet 10
    - Server certificates in packet 12
    - Client certs and change to AES in packet 14
    - Server switch to AES is packet 15
    - HTTPS GET in packet 16

## Is TLS 1.2 forward-secret?

## Is TLS 1.2 forward-secret?
- No, it is not (generally)
- Depending on the ciphers used, if the keys are later made available, then it is *NOT* forward-secret
- Let's review the [TLS 1.2 protocol](https://upload.wikimedia.org/wikipedia/commons/a/ae/SSL_handshake_with_two_way_authentication_with_certificates.svg)
- What would make it forward-secret?
  - Adding in Diffie-Hellman Key Exchange (DHE)
- *Some* crypto protocols used are FS, but not all

## TLS 1.3
- Cryptographic improvement:
  - Only allow public encryption protocols that are forward-secret (meaning: they use DHE)
- Other improvements:
  - Send more of the information as a single message, so there is less back-and-forth to the handshake
- Speed improvements:
  - Requires fewer messages

<h2>TLS 1.2 versus TLS 1.3</h2>
<a href='images/graphs/tls-1.2.webp'><img src="images/graphs/tls-1.2.webp" style="float:left;width:300px;margin-right:60px"></a>
<a href='images/graphs/tls-1.3.webp'><img src="images/graphs/tls-1.3.webp" style="float:right;width:300px;margin-left:40px"></a>

- Fewer msgs
- Speed diff: perhaps 200 ms instead of 300 ms

## TLS 1.3 pcap
- See [https-tls-1.3.pcap](pcaps/https-tls-1.3.pcap)
- It was a request to [https://google.com/](https://google.com/)
- Packets:
    - Client Hello is packet 4
    - Server Hello is packet 6
    - "Finished" and HTTPS GET in packet 8
    - HTTPS response starts in packet 10

## TLS Ciphers Supported
- In version 1.3, for key exchange, only forward-secret public key systems are supported
  - Various public key cryptosystems, but all with a secure DH key exchange
  - Version 1.2 allows many other algorithms
- Once the channel is configured, multiple algorithms are supported, the most common being AES

## [TLS key exchange protocols](https://en.wikipedia.org/wiki/Transport_Layer_Security#Key_exchange_or_key_agreement)

## [TLS session protocols](https://en.wikipedia.org/wiki/Transport_Layer_Security#Key_exchange_or_key_agreement)

<h2 class="r-fit-text">Message authentication code (MAC)</h2>

- It's "a short piece of information used to authenticate a message" ([Wikipedia](https://en.wikipedia.org/wiki/Message_authentication_code))
- Like a digital signature, but:
  - uses a shared *secret* key, so it is faster
- Like a hash, but:
  - provides authentication, not just a message digest
  - includes the secret key, so two different connections' MAC of the same text will be different due to different secret keys
- We'll see these in detail later in this slide set

## MAC uses
- Prevents replay attacks
  - A MAC includes a signature (hash, etc.), and combined with a sequence number, will prevent replaying an encrypted packet
  - But why does the sequence number alone not work?
  - Replay attacks may not be possible, depending on the IV method used
- Used for non-encrypted verification
  - To verify a large downloaded ISO file was correct

## A possible MAC
- Given a message $m$ and a secret key $k$:
  - Create a new value $x = m \cdot k$
    - That's string concatenation
  - Hash that message: $h = hash(x)$
    - Use a good hash function!
  - Encrypt that hash with $k$: $mac = encrypt(h,k)$
    - Likely AES (or similar) with the shared secret key
- In one line: $mac = encrypt(hash(m \cdot k),k)$
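
The construction above needs a block cipher, which Python's standard library does not include. The sketch below therefore swaps in HMAC-SHA256 (from the standard `hmac` module), a different but standardized MAC, purely to show how a MAC is created and checked; the helper names and the message are illustrative.

```python
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)  # shared secret key k

def make_mac(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message (not the slide's scheme)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_mac(message: bytes, key: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_mac(message, key), tag)

tag = make_mac(b"transfer $10 to Bob", key)
print(check_mac(b"transfer $10 to Bob", key, tag))    # True
print(check_mac(b"transfer $1000 to Eve", key, tag))  # False
```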

## MAC properties
- Does not encrypt the message!
  - If the connection is not encrypted, then the message is eavesdroppable
- Cannot be forged without the secret key
  - Any eavesdropper must know the secret key in order to create a valid MAC
- It is "unforgeable under chosen-message attacks"
  - Even if Eve can get Bob to compute MACs for messages of her choosing, she needs the (unknown to her) secret key to generate a valid MAC for any new message
- Provides authentication (since only the other party has the secret key)

## Authenticated Encryption with Associated Data (AEAD)
- MACs do not *encrypt* the data, and we want to do so
- If we combine encryption and a MAC, we get an AEAD
  - Assuming a few properties about the quality of the encryption and MAC function
- The "associated data" is non-encrypted data
- Usage: network packets
  - The payload is encrypted (via authenticated encryption)
  - The packet headers, which cannot always be encrypted, are the "associated data"

## [TLS MAC protocols](https://en.wikipedia.org/wiki/Transport_Layer_Security#Key_exchange_or_key_agreement)
![TLS MAC protocols](images/encryption/tls-protocols-3.webp)
Note that even if the MAC doesn't support encryption (in the first 3 rows), the cipher (likely AES) still does

<h2 class="r-fit-text">Data Encryption Standard (<a href='http://en.wikipedia.org/wiki/Data_Encryption_Standard'>DES</a>)</h2>

- Adopted in 1976, it's a private-key encryption/decryption block cipher
- Briefly, it does a lot of (invertible) bit-shifting in rounds to encrypt/decrypt a message
- How to crack?
  - It is susceptible to brute force attacks ($2^{56} \approx 7.2 \times 10^{16}$ keys)
- Solution: use DES three times => "Triple DES"
  - Use a 56*3 = 168 bit key, and encrypt the message three times, once with each key

<h2 class="r-fit-text">Cracking Triple DES encryption</h2>

- Brute-force attacks:
  - $2^{168}$ keys, but due to various mathematical properties, this ends up being $2^{112}$ different keys
- "The best attack known on 3-key TDES requires around $2^{32}$ known plaintexts, $2^{113}$ steps, $2^{90}$ single DES encryptions, and $2^{88}$ memory"
- NIST (National Institute of Standards and Technology) once considered it acceptable through 2030, but has since deprecated it and disallows it for encryption after 2023

## DES conspiracy theories
- NSA was involved with DES' creation
  - They convinced IBM to lower the key length from 128 to 64, and then to 56
  - And kept many of the details secret
  - Many well-respected people criticized the NSA for "improper interference" with the algorithm
- In 1977, Diffie and Hellman (major names in cryptography) proposed a $20 million machine that could crack a DES message in a single day
- It's known that the NSA had the budget for such a machine.  But did they build it?

<h2 class="r-fit-text">Advanced Encryption Standard (AES)</h2>

- The successor to DES
- Has three possible key lengths: 128, 192, and 256 bits
- NSA approved this standard, and kept the process open
- Like DES, it's a series of (invertible) bit-shifting in rounds to encrypt/decrypt a message
- Many worry about the security of the standard
  - ... that somebody may figure a way to crack it mathematically, in particular
- [Reference](http://en.wikipedia.org/wiki/Advanced_Encryption_Standard)
- More details in the next slide column

## [Confusion and Diffusion](https://en.wikipedia.org/wiki/Confusion_and_diffusion)
- Two properties of ciphers (Shannon, 1945)
- *Confusion* hides the relationship between the plaintext and the ciphertext
  - Ideally it makes the ciphertext appear random or uniform
  - Confusion-only ciphers: OTP (high confusion), Caesar cipher (low confusion)
- *Diffusion* means a small change in the plaintext has a large change in the ciphertext
  - Changing one bit in the plaintext should change about half the bits in the ciphertext, and vice versa
  - Diffusion-only ciphers: transposition cipher
- Any "reasonable" block cipher has both

## Confusion and diffusion
- One time pad, stream ciphers: high confusion, no diffusion
  - High confusion: the pad makes it impossible to crack the plaintext
  - No diffusion: Changing one bit in the plaintext changes only one bit in the ciphertext
- Transposition cipher: no confusion, moderate diffusion
  - No confusion: the bytes are the same in the plaintext and ciphertext, just out of order
  - Moderate diffusion: this cipher moves things around
- Caesar cipher: moderate confusion, no diffusion
  - Moderate confusion: each letter is obscured
  - No diffusion: a change to one letter in the plaintext changes only one letter in the ciphertext
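
The "no diffusion" claim for a one-time pad is easy to check. In this sketch (message and names are arbitrary), one plaintext bit is flipped and the number of differing ciphertext bits is counted.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which a and b differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

pad = secrets.token_bytes(8)
m1 = b"ATTACK!!"
m2 = bytes([m1[0] ^ 0x01]) + m1[1:]  # same message with one bit flipped

c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

changed = bit_difference(c1, c2)
print(changed)  # 1
```

A block cipher with good diffusion would instead change roughly half of the bits in the affected block.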

## Initialization Vectors (IVs)

[![AES-CBC encryption](https://upload.wikimedia.org/wikipedia/commons/8/80/CBC_encryption.svg)](https://en.wikipedia.org/wiki/File:CBC_encryption.svg)
- Each successive encrypted block depends on the previous one, so that two encrypted blocks that are the same plaintext will have different ciphertext
- IVs are typically random, and sometimes transmitted in plaintext
- But how to incorporate that into the encryption?  So many ways...

## Symmetric ECB encryption
[![AES-ECB encryption](https://upload.wikimedia.org/wikipedia/commons/d/d6/ECB_encryption.svg)](https://commons.wikimedia.org/wiki/File:ECB_encryption.svg)
- ECB = electronic codebook
- Formally: $c_x = AES_e(m_x)$
- With no initialization vector, the same plaintext encrypts the same each time
  - No diffusion!  And susceptible to replay attacks.

## ECB Encryption Issues

- With small block sizes, and large areas of uniform color:
  - The original image (left)
  - Can be viewed as the middle image
  - Whereas other modes make it look more random like the right image

## Cipher block chaining (CBC) mode encryption
[![AES-CBC encryption](https://upload.wikimedia.org/wikipedia/commons/8/80/CBC_encryption.svg)](https://en.wikipedia.org/wiki/File:CBC_encryption.svg)
- CBC = Cipher Block Chaining
- Formally:
  - $c_0 = AES_e(IV \oplus m_0)$
  - $c_1 = AES_e(c_0 \oplus m_1)$
  - $c_2 = AES_e(c_1 \oplus m_2)$

## Cipher block chaining (CBC) mode decryption
[![AES-CBC decryption](https://upload.wikimedia.org/wikipedia/commons/2/2a/CBC_decryption.svg)](https://en.wikipedia.org/wiki/File:CBC_decryption.svg)
- CBC = Cipher Block Chaining
- Formally:
  - $m_0 = IV \oplus AES_d(c_0)$
  - $m_1 = c_0 \oplus AES_d(c_1)$
  - $m_2 = c_1 \oplus AES_d(c_2)$
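
The chaining formulas can be exercised without a real AES implementation. The sketch below substitutes a deliberately trivial, insecure 4-byte "block cipher" for $AES_e$ and $AES_d$ (an assumption made only so the example is self-contained) and compares ECB with CBC on two identical plaintext blocks.

```python
BLOCK = 4  # toy block size in bytes (AES uses 16)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in for AES_e: XOR with the key, rotate left one byte. NOT secure."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]

def toy_decrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in for AES_d: undo the rotation, then XOR with the key."""
    unrotated = block[-1:] + block[:-1]
    return xor_bytes(unrotated, key)

def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(blocks: list, key: bytes) -> list:
    """c_x = E(m_x): every block is encrypted independently."""
    return [toy_encrypt(m, key) for m in blocks]

def cbc_encrypt(blocks: list, key: bytes, iv: bytes) -> list:
    """c_0 = E(IV xor m_0), c_x = E(c_{x-1} xor m_x)."""
    out, prev = [], iv
    for m in blocks:
        prev = toy_encrypt(xor_bytes(prev, m), key)
        out.append(prev)
    return out

def cbc_decrypt(blocks: list, key: bytes, iv: bytes) -> list:
    """m_0 = IV xor D(c_0), m_x = c_{x-1} xor D(c_x)."""
    out, prev = [], iv
    for c in blocks:
        out.append(xor_bytes(prev, toy_decrypt(c, key)))
        prev = c
    return out

key = bytes([1, 2, 3, 4])
iv = bytes(BLOCK)                 # all-zero IV for repeatability; use a random IV in practice
plain = split_blocks(b"AAAAAAAA")  # two identical plaintext blocks

ecb = ecb_encrypt(plain, key)
cbc = cbc_encrypt(plain, key, iv)
print(ecb[0] == ecb[1])  # True: ECB leaks the repetition
print(cbc[0] == cbc[1])  # False: chaining hides it
```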

## Cipher block chaining (CBC) mode details
- Encryption cannot be parallelized
  - Each block depends on the previous one
- But decryption can
  - As each plaintext block depends only on ciphertext blocks, which are all already available
- Provides only confidentiality, not integrity or authenticity
  - Confidentiality: because it's encrypted
  - No integrity: there is no hash or checksum
  - No authenticity: a replay attack could occur
    - The plaintext would be scrambled (different IV) but that might not make it "invalid"

## Counter (CTR) mode encryption
[![AES-CTR encryption](https://upload.wikimedia.org/wikipedia/commons/4/4d/CTR_encryption_2.svg)](https://en.wikipedia.org/wiki/File:CTR_encryption_2.svg)
- CTR = Counter
- Nonce is the same as the IV
- Formally: $c_x = AES_e(IV \cdot counter) \oplus m_x$
- This is now essentially a *stream cipher*
  - As the cipher encrypts the IV and counter (under the *key*) to make a keystream, and the result is xor'ed with the plaintext

## Counter (CTR) mode decryption
[![AES-CTR decryption](https://upload.wikimedia.org/wikipedia/commons/3/3c/CTR_decryption_2.svg)](https://en.wikipedia.org/wiki/File:CTR_decryption_2.svg)
- CTR = Counter
- Nonce is the same as the IV
- Formally: $m_x = c_x \oplus AES_e(IV \cdot counter)$
  - Decryption also uses $AES_e$: the same keystream is regenerated and xor'ed out
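
Counter mode can be sketched with any keyed function in place of the block cipher. Since the standard library has no AES, the example below uses SHA-256 over key, nonce, and counter as a stand-in keystream generator; that substitution, the 32-byte block size, and the function names are all assumptions for illustration.

```python
import hashlib
import secrets

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for AES_e(IV . counter): hash of key, nonce, and counter."""
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()

def ctr_crypt(data: bytes, key: bytes, nonce: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for counter, start in enumerate(range(0, len(data), 32)):
        block = data[start:start + 32]
        ks = keystream_block(key, nonce, counter)
        out += bytes(x ^ y for x, y in zip(block, ks))
    return bytes(out)

key = secrets.token_bytes(32)
nonce = secrets.token_bytes(12)
message = b"counter mode turns a block cipher into a stream cipher"

ciphertext = ctr_crypt(message, key, nonce)
roundtrip = ctr_crypt(ciphertext, key, nonce)
print(roundtrip == message)
```

Each keystream block depends only on the counter value, which is why the blocks can be computed in parallel.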

## Counter (CTR) mode details
- Both encryption and decryption can be parallelized
- Provides only confidentiality, not integrity or authenticity
  - Confidentiality: because it's encrypted
  - No integrity: there is no hash or checksum
  - No authenticity: a replay attack could occur
    - The plaintext would be scrambled (different counter) but that might not make it "invalid"
- This mode is also known as integer counter mode (ICM) and segmented integer counter (SIC) mode

## Galois/Counter (GCM) mode
[![](https://upload.wikimedia.org/wikipedia/commons/2/25/GCM-Galois_Counter_Mode_with_IV.svg)](https://commons.wikimedia.org/wiki/File:GCM-Galois_Counter_Mode_with_IV.svg)
- Pronounced gal-wa
- A Galois field is a finite field
- Based on the Counter (CTR) mode
- "Counter x" includes the IV as well as the counter
- This is also essentially a stream cipher
- The "len(A) || len(C)" is essentially a hash

<h2 class="r-fit-text">Galois/Counter (GCM) mode details</h2>

[![](https://upload.wikimedia.org/wikipedia/commons/2/25/GCM-Galois_Counter_Mode_with_IV.svg)](https://commons.wikimedia.org/wiki/File:GCM-Galois_Counter_Mode_with_IV.svg)
- Provides confidentiality: because it's encrypted
- Provides integrity: there is a hash involved
  - The "len(A) || len(C)" part
- Provides authentication: "Auth Data 1" is authentication data
  - If it's incorrect, the hash will not match

## CCM mode
- CCM = counter with cipher block chaining message authentication code
- A merging of the counter (CTR) and cipher block chaining (CBC) modes
- Provides the same assurances as GCM: confidentiality, integrity, authentication

## [TLS session protocols](https://en.wikipedia.org/wiki/Transport_Layer_Security#Key_exchange_or_key_agreement)
- Notice the two allowed AES modes for TLS 1.3: GCM & CCM
- And the one that was allowed previously: CBC

## So many encryption modes
- Confidentiality-only modes
  - [Propagating cipher block chaining (PCBC)](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Propagating_cipher_block_chaining_(PCBC))
  - Electronic codebook (ECB)
  - Cipher block chaining (CBC)
  - Cipher feedback (CFB): [Full-block CFB](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Full-block_CFB) and [CFB-1, CFB-8, CFB-64, CFB-128, etc.](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#CFB-1,_CFB-8,_CFB-64,_CFB-128,_etc.)
  - [Output feedback (OFB)](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Output_feedback_(OFB))
  - Counter (CTR)
- Authenticated encryption with associated data (AEAD) modes
  - Galois/counter (GCM)
  - Counter with cipher block chaining message authentication code (CCM)
  - [Synthetic initialization vector (SIV)](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Synthetic_initialization_vector_(SIV))
  - [AES-GCM-SIV](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#AES-GCM-SIV)

<h2 class="r-fit-text">Advanced Encryption Standard (AES)</h2>

## AES Overview
1. Determine number of rounds, $r$:
   - 10, 12, or 14 rounds for 128-bit, 192-bit, and 256-bit keys
2. KeyExpansion: derive round keys from the AES Key Round Schedule
3. Initial round key addition: AddRoundKey (XOR with round key)
4. Rounds 1 through $r-1$:
   1. SubBytes: replace bytes as per lookup table
   2. ShiftRows: cyclically shift the rows
   3. MixColumns: matrix operation on the columns
   4. AddRoundKey
5. Final round $r$:
   1. SubBytes
   2. ShiftRows
   3. AddRoundKey
## Key Round Schedule
[![](https://upload.wikimedia.org/wikipedia/commons/b/be/AES-Key_Schedule_128-bit_key.svg)](https://commons.wikimedia.org/wiki/File:AES-Key_Schedule_128-bit_key.svg)
- Let:
 - $N$: key length in 32-bit words (4, 6, or 8)
 - $K_0, K_1, \ldots K_{N-1}$: 32-bit words of the key
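 - `RotWord`: a one-byte left rotation of a 32-bit word
 - `SubWord`: an S-box lookup applied to each of the word's four bytes
 - $R_{con}$: a round constant that is fixed for each round (it does not depend on the key)
- Then repeat the pattern in the image until there are enough words for all of the round keys
- The schedule depends only on the key, so it only has to be computed once per key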
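
A minimal sketch of the key schedule in Python, assuming the standard Rijndael S-box and the FIPS 197 word recurrence. The helper names (`gf_mul`, `build_sbox`, `key_expansion`) are made up for this slide, and the S-box is computed rather than hard-coded only to keep the block self-contained.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def build_sbox():
    """Compute the S-box: inverse in GF(2^8), then an affine transformation."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse of x (and 0 maps to 0)
            inv = gf_mul(inv, x)
        y = inv
        for n in (1, 2, 3, 4):        # XOR with four left rotations of inv
            y ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox

SBOX = build_sbox()

def key_expansion(key):
    """Expand a 16-, 24-, or 32-byte key into a list of 16-byte round keys."""
    n = len(key) // 4                 # N: key length in 32-bit words
    rounds = n + 6                    # 10, 12, or 14
    w = [list(key[4 * i:4 * i + 4]) for i in range(n)]
    rcon = 1
    for i in range(n, 4 * (rounds + 1)):
        t = list(w[i - 1])
        if i % n == 0:
            t = t[1:] + t[:1]         # RotWord
            t = [SBOX[b] for b in t]  # SubWord
            t[0] ^= rcon              # round constant
            rcon = gf_mul(rcon, 2)
        elif n > 6 and i % n == 4:    # extra SubWord for 256-bit keys
            t = [SBOX[b] for b in t]
        w.append([a ^ b for a, b in zip(w[i - n], t)])
    return [bytes(sum(w[4 * r:4 * r + 4], [])) for r in range(rounds + 1)]

key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")  # FIPS 197 example key
round_keys = key_expansion(key)
print(len(round_keys), round_keys[1].hex())
```

The first round key is the cipher key itself; every later word is derived from the word before it and the word $N$ positions back.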

## AES: SubBytes step

- Each byte is replaced based on a fixed substitution table
  - Fixed for all of AES
- The table maps the input to its multiplicative inverse in the finite field $GF(2^8)$, followed by a fixed affine transformation
- Called the [Rijndael S-box](https://en.wikipedia.org/wiki/Rijndael_S-box)
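
A sketch of how the table can be derived, assuming the standard construction (inverse in $GF(2^8)$, then the affine transformation). Real implementations typically hard-code the 256 entries; the names `gf_mul`, `rotl8`, and `sub_bytes` are illustrative only.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def rotl8(v, n):
    """Rotate an 8-bit value left by n bits."""
    return ((v << n) | (v >> (8 - n))) & 0xFF

def sbox_entry(x):
    # Multiplicative inverse: x^254 in GF(2^8); 0 has no inverse and maps to 0
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, x)
    # Affine transformation: XOR of inv with four rotations, plus constant 0x63
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]

def sub_bytes(state):
    """Apply the S-box to every byte of the state."""
    return bytes(SBOX[b] for b in state)

print(hex(SBOX[0x53]))
```

The table is a permutation of the 256 byte values, which is what makes the step invertible for decryption.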

## AES: ShiftRows step
[![](https://upload.wikimedia.org/wikipedia/commons/6/66/AES-ShiftRows.svg)](https://commons.wikimedia.org/wiki/File:AES-ShiftRows.svg)
- Row $i$ is shifted cyclically to the left by $i$ bytes (so by 0, 1, 2, and 3)
- This provides *diffusion*
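
As a small sketch, assuming the state is stored as four rows of four bytes (the function name `shift_rows` is just for illustration):

```python
def shift_rows(state):
    """Rotate row i of a 4x4 state left by i positions."""
    return [row[i:] + row[:i] for i, row in enumerate(state)]

state = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
for row in shift_rows(state):
    print(row)
```

After this step, each column contains one byte from each of the four original columns.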

## AES: MixColumns step
[![](https://upload.wikimedia.org/wikipedia/commons/7/76/AES-MixColumns.svg)](https://commons.wikimedia.org/wiki/File:AES-MixColumns.svg)
- Each column is transformed according to the following matrix operation, with arithmetic in $GF(2^8)$
- This also provides *diffusion*

$\begin{bmatrix} b_{0,j} \\\ b_{1,j} \\\ b_{2,j} \\\ b_{3,j} \end{bmatrix} =
\begin{bmatrix} 2&3&1&1 \\\ 1&2&3&1 \\\ 1&1&2&3 \\\ 3&1&1&2 \end{bmatrix}
\begin{bmatrix} a_{0,j} \\\ a_{1,j} \\\ a_{2,j} \\\ a_{3,j} \end{bmatrix}
\text{ for }0 \le j \le 3
$
</div>
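
A sketch of the matrix operation on one column. In $GF(2^8)$, addition is XOR, multiplying by 2 is a shift with a conditional reduction, and multiplying by 3 is "times 2, XOR the original". The names `xtime` and `mix_column` are illustrative.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a

def mul3(a):
    """Multiply a byte by 3 in GF(2^8): (2 * a) XOR a."""
    return xtime(a) ^ a

def mix_column(col):
    """Apply the MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,   # row 2 3 1 1
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,   # row 1 2 3 1
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),   # row 1 1 2 3
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),   # row 3 1 1 2
    ]

print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```

The full step applies `mix_column` to each of the four columns of the state.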

## AES: AddRoundKey step
[![](https://upload.wikimedia.org/wikipedia/commons/a/ad/AES-AddRoundKey.svg)](https://commons.wikimedia.org/wiki/File:AES-AddRoundKey.svg)
- Derive a subkey from the main key via the AES key schedule
- Subkey is then XOR'ed with the state
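
This step is just a byte-wise XOR. A sketch, assuming the state and the round key are both 16-byte strings (the name `add_round_key` is illustrative); the values are the FIPS 197 example input and cipher key:

```python
def add_round_key(state, round_key):
    """XOR each byte of the state with the matching byte of the round key."""
    return bytes(s ^ k for s, k in zip(state, round_key))

state = bytes.fromhex("3243f6a8885a308d313198a2e0370734")
round_key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
print(add_round_key(state, round_key).hex())
```

Because XOR is its own inverse, applying the same round key again undoes the step, which is how decryption reverses it.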

<h2 class="r-fit-text">Message authentication code (MAC)</h2>

## MAC uses
- Prevents replay attacks
  - A MAC is a keyed tag (hash-based, etc.) over the message; when it also covers a sequence number, it prevents replaying an encrypted packet
  - But why does the sequence number alone not work?
- Used for non-encrypted verification
  - To verify that a large downloaded ISO file is correct
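
A sketch of replay protection, assuming HMAC-SHA256 as the MAC and an 8-byte sequence number in front of the payload. The packet layout, `seal`, and `Receiver` are invented for this example and are not any real protocol's format.

```python
import hashlib
import hmac
import struct

def seal(key, seq, payload):
    """Build a packet: 8-byte sequence number, payload, 32-byte MAC over both."""
    header = struct.pack(">Q", seq)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag

class Receiver:
    def __init__(self, key):
        self.key = key
        self.last_seq = -1   # highest sequence number accepted so far

    def accept(self, packet):
        """Return the payload, or None if the packet is forged or replayed."""
        header, payload, tag = packet[:8], packet[8:-32], packet[-32:]
        expected = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None      # MAC mismatch: header or payload was altered
        (seq,) = struct.unpack(">Q", header)
        if seq <= self.last_seq:
            return None      # valid MAC but an old sequence number: a replay
        self.last_seq = seq
        return payload

key = b"k" * 32
rx = Receiver(key)
pkt = seal(key, 1, b"pay $5")
print(rx.accept(pkt), rx.accept(pkt))
```

Since the sequence number is covered by the MAC, an attacker who rewrites it to a fresh value invalidates the tag.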

## A possible MAC
- Given a message $m$ and a secret key $k$:
  - Create a new value $x = m \cdot k$
  - That's string concatenation
  - Hash that message: $h = hash(x)$
  - Use a good hash function!
  - Encrypt that hash with $k$: $mac = encrypt(h,k)$
    - Likely AES (or similar) with the shared secret key
- In one line: $mac = encrypt(hash(m \cdot k),k)$

## Encryption & Hashing Goals
- *Integrity:* the message can't be changed, maliciously or accidentally
  - Formally, a change is always detectable
- *Confidentiality:* the message can't be read by somebody else
  - In other words, it's encrypted
- *Authenticity:* the message came from who you think it came from
  - Formally, from an entity that has the appropriate key

<h2 class="r-fit-text">Message Authentication Code (MAC)</h2>

- Like a digital signature, but with a symmetric key
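- Provides *integrity* and *authenticity*, but not *confidentiality*
- $MAC(k,m)$ will produce a short bit string that can only be created with the key $k$ and message $m$
  - Hash-based (HMAC) example:
    - Simplified: $HMAC(k,m) = SHA256(k \cdot m)$
      - The real HMAC construction hashes twice with padded keys; a plain $SHA256(k \cdot m)$ is vulnerable to length-extension attacks
    - Named for the hash algorithm used: HMAC-MD5, HMAC-SHA1, etc.
  - Cipher-based (CMAC) example:
    - Simplified: $CMAC(k,m)$ is the last block of a CBC-style AES encryption of $m$ under $k$, optionally truncated
      - At most 128 bits, since that is the AES block size
    - Also named for the cipher used: CMAC-AES, etc.
  - GMAC uses the Galois field authentication (GHASH) from AES-GCM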
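
In practice you would call a library rather than build a MAC yourself. A sketch using Python's standard `hmac` and `hashlib` modules; the wrapper names `make_tag` and `verify` are made up for this slide.

```python
import hashlib
import hmac

def make_tag(key, message):
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared secret key"
msg = b"transfer 100 to alice"
tag = make_tag(key, msg)

print(verify(key, msg, tag))                          # genuine message
print(verify(key, b"transfer 900 to mallory", tag))   # altered message
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading bytes of a guessed tag were correct.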

## Why MACs?
- Provides *integrity*
  - An adversary can change the ciphertext; this change would be detected
- Provides *authenticity*
  - A chosen-message attack cannot succeed (the adversary can't create a valid MAC without the key)
- Earlier implementations, with just hashes...
  - Were difficult to get right
  - Very error prone and subject to attacks

## HMAC
- A MAC is a general term for a message authentication code
- An HMAC uses a hash function as its algorithm
- Not all MACs use hashes
  - A CMAC uses a cipher (AES, DES, etc.) to create the code
    - Likely truncating it to the first $x$ bits
- The example algorithm a few slides back was an HMAC
- Named by their hash algorithm:
  - HMAC-MD5, HMAC-SHA1, etc.
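
For reference, a sketch of the standardized HMAC construction (RFC 2104) written out by hand and checked against Python's `hmac` module. The function name `hmac_sha256` is illustrative; in real code, use the library.

```python
import hashlib
import hmac

def hmac_sha256(key, message):
    """HMAC(k, m) = H((k XOR opad) || H((k XOR ipad) || m)) with H = SHA-256."""
    block_size = 64                            # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()     # long keys are hashed first
    key = key.ljust(block_size, b"\x00")       # then padded with zero bytes
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()

key, msg = b"secret", b"hello world"
print(hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())
```

The two nested hashes are what protect against the length-extension weakness of hashing the key and message in one pass.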

## MAC-then-Encrypt (MtE)
![MtE](https://upload.wikimedia.org/wikipedia/commons/a/ac/Authenticated_Encryption_MtE.png)
- No integrity check of the ciphertext, only of the plaintext
 - We have to decrypt before we can check the integrity of the *plaintext*
- The MAC cannot be used to infer anything about the plaintext (it is encrypted along with it)
- This is what TLS did through version 1.2 (with the CBC cipher suites); TLS 1.3 allows only AEAD modes
- Sources: [ref1](https://link.springer.com/content/pdf/10.1007/3-540-44448-3_41.pdf) and especially [ref2](https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac)

## Encrypt-and-MAC (E&M)
![E&M](https://upload.wikimedia.org/wikipedia/commons/a/a5/Authenticated_Encryption_EaM.png)
- Also no integrity check of the ciphertext, only of the plaintext
- The MAC can be used to infer information about the plaintext
 - It is computed over the plaintext and sent unencrypted, so identical messages can produce identical MACs
- This is what SSH traditionally does
- Vulnerable to a few types of attacks:
 - Chosen-plaintext attacks
 - Ciphertext malleability attacks
- Sources: [ref1](https://link.springer.com/content/pdf/10.1007/3-540-44448-3_41.pdf) and especially [ref2](https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac)
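
A sketch of the information leak, assuming a deterministic MAC (HMAC-SHA256 here) computed directly over the plaintext with nothing else mixed in. The function name `eam_tag` is made up; encryption is left out because the leak comes from the tag alone.

```python
import hashlib
import hmac

def eam_tag(mac_key, plaintext):
    """In Encrypt-and-MAC, the tag covers the plaintext and is sent in the clear."""
    return hmac.new(mac_key, plaintext, hashlib.sha256).hexdigest()

mac_key = b"m" * 32
tag_monday = eam_tag(mac_key, b"attack at dawn")
tag_tuesday = eam_tag(mac_key, b"attack at dawn")
tag_wednesday = eam_tag(mac_key, b"retreat at dusk")

# An eavesdropper who sees only the tags can tell which messages repeat
print(tag_monday == tag_tuesday, tag_monday == tag_wednesday)
```

Even if the ciphertexts differ every time (because of a fresh IV), the matching tags reveal that the same message was sent twice.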

## Encrypt-then-MAC (EtM)
![EtM](https://upload.wikimedia.org/wikipedia/commons/b/b9/Authenticated_Encryption_EtM.png)
- Provides integrity of the ciphertext (and thus of the plaintext)
 - If the ciphertext is modified, the message is rejected before it is decrypted, which prevents potential attacks on the decryption implementation
- The MAC cannot be used to infer anything about the plaintext
- Ideal situation:
 - Ciphertext integrity is provided
 - Not vulnerable to the attacks that affect E&M
- Sources: [ref1](https://link.springer.com/content/pdf/10.1007/3-540-44448-3_41.pdf) and especially [ref2](https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac)
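
A sketch of Encrypt-then-MAC with separate encryption and MAC keys. Python's standard library has no AES, so the cipher here is a toy SHA-256 keystream that stands in for a real mode such as AES-CTR; it is NOT secure and is only an assumption for illustration, as are the names `etm_encrypt` and `etm_decrypt` and the packet layout.

```python
import hashlib
import hmac
import os

def _keystream(key, nonce, length):
    """Toy keystream (stand-in for AES-CTR; do not use for real data)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def etm_encrypt(enc_key, mac_key, plaintext):
    """Encrypt first, then MAC the nonce and ciphertext."""
    nonce = os.urandom(16)
    ciphertext = _xor(plaintext, _keystream(enc_key, nonce, len(plaintext)))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def etm_decrypt(enc_key, mac_key, packet):
    """Verify the MAC over the ciphertext before decrypting anything."""
    nonce, ciphertext, tag = packet[:16], packet[16:-32], packet[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None   # modified ciphertext is rejected without being decrypted
    return _xor(ciphertext, _keystream(enc_key, nonce, len(ciphertext)))

enc_key, mac_key = os.urandom(32), os.urandom(32)
packet = etm_encrypt(enc_key, mac_key, b"hello world")
print(etm_decrypt(enc_key, mac_key, packet))
```

The decryption code never touches a ciphertext that failed the MAC check, which is the property the slide describes.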