CS 3710

Introduction to Cybersecurity

Aaron Bloomfield (aaron@virginia.edu)
@github | ↑ |

Web Security

Credit Where Credit is Due

Much of the material in this slide set, including some of the images, was obtained from EECS 388 from the CS department at the University of Michigan. That course material was released under a CC BY-SA license. Other images are from Wikipedia, and each of those images is linked to its source on Wikipedia.

Key Exchange

If you want to communicate with another party, how do you create a private key?
- One solution: Key stores, but only if you both have generated public/private key sets
  - Not always practical for all security protocols
One solution: Diffie–Hellman key exchange
- This version uses discrete logs
- A variant called ECDH uses elliptic curves

DH key exchange

Alice and Bob agree on public values g (generator) and p (prime or modulus)
Alice picks a private a and sends to Bob \(A=g^a \text{ mod } p\)
Bob picks a private b and sends to Alice \(B=g^b \text{ mod } p\)
Because \(g^{ab} \text{ mod } p = g^{ba} \text{ mod } p\), both can compute the secret key
- Alice received \(g^b \text{ mod } p\) and computes \((g^b \text{ mod } p)^a \text{ mod } p = g^{ab} \text{ mod } p\)
- Similarly for Bob

DH key exchange example

From Wikipedia; blue are public, red are secret

Alice and Bob agree to use a modulus p = 23 and base g = 5 (which is a primitive root modulo 23).
Alice chooses a secret integer a = 4, then sends Bob A = g^a mod p
- A = 5⁴ mod 23 = 4
Bob chooses a secret integer b = 3, then sends Alice B = g^b mod p
- B = 5³ mod 23 = 10
Alice computes s = B^a mod p
- s = 10⁴ mod 23 = 18
Bob computes s = A^b mod p
- s = 4³ mod 23 = 18
Alice and Bob now share a secret (the number 18)

DH key
exchange
example
as colors

Colors in video (at 2:43)

If that doesn’t work, try the direct Youtube link, but start at 2:43

DH key exchange in practical use

Used in many cryptographic protocols
- Including ones we will see shortly
Uses very large numbers created with a very good pseudo-random number generator

MITM versus DH key exchange?

Possible if Eve actively controls the communication medium
If s/he is only a passive listener, then it is not possible
How to avoid an active compromiser:
- Post on a public forum
- Use communication methods s/he can’t compromise, such as shortwave radio

Certificates

Definition

A certificate is:
- A public key
- Information about the key (algorithm used, version number, etc.)
- An identity (hostname, individual, etc.)
- Other meta-data (expiration date, etc.)
- A signature from a certificate authority
Most common format is X.509

Diagram of an X.509 cert

Certificate usage

All websites that support https provide such a certificate to any client
Browsers can display certificate info, such as shown here

Certificate validity

All certificates are signed with the (private) key of a certificate authority (CA)
- Their public keys are well known, and are hard-coded into browsers
- Or determined by key exchange protocols discussed in this slide set
One can verify such a certificate by the (known) public key of the CA
One can “self sign” a certificate, but modern browsers typically won’t allow viewing without manual exceptions

Certificate Chains

An entity (such as virginia.edu) can hold an intermediate certificate, which allows it to sign for entities in its division (such as news.virginia.edu)

Trusting Trust

A server’s certificate is signed by a Certificate Authority
- Perhaps through an intermediary certificate
So we trust the server’s certificate because the CA vouches for it
And we trust the CA because… everybody else does?
At some point, we have to trust that the system is well designed and that it works

Certificate Authorities (CAs)

Any CA can “vouch” (i.e., sign) for any certificate
- For any website or any (likely new) CA
Thus, the security is based on the weakest CA
What happens if a CA is compromised or goes rogue?

CA Revocation

This has happened! In February 2018.
- Google to kill Symantec certs in Chrome 66, due in early 2018
- Firefox also: Distrust of Symantec TLS Certificates
The reason: a site called Trustico resold Symantec-signed certificates
- The 23,000 certificates Trustico sold were “compromised”
  - They still had the private keys (a big no-no)
- They were sold under the umbrella of Symantec, so Symantec was party at fault as well

Trusted root cert count

For Apple devices: 159 as of Feb 21, 2024
For Mozilla / Firefox: 170 as of Feb 21, 2024
For Microsoft: 487 as of Feb 21, 2024
Other platforms (Chrome, Android, etc.) only seem to list how to find it on the browser (Chrome) or device (Android)

UVA’s Certificate Authority

If you ran UVA’s Network Setup Tool, it will have installed it’s certificate provider as a trusted certificate authority
- Note that the image shows how to make it “fully” trusted; it’s still (mostly) trusted without the slider switch on
This allows your device to automatically accept UVA certificates without dealing with separate signatures

SSL & TLS

They are cryptographic protocols used to encrypt network communication
SSL: Secure Sockets Layer
- v1.0 never released due to flaws
- v2.0 released in 1995; broken and not used
- v3.0 released in 1996; broken and not used
TLS: Transport Layer Security, the successor to SSL
- v1.0 released Jan 1999; currently deprecated
- v1.1 released Apr 2006; currently deprecated
- v1.2 released Aug 2008; in widespread use
- v1.3 released Aug 2018; currently being adopted

TLS Usage

TLS can be implemented on top of just about any protocol
Most common use: when implemented on top of HTTP, that yields HTTPS
Also used for encrypting mail (SMTP)
A programmer can use it for any other purpose

Forward Secrecy (FS)

Also known as perfect forward secrecy (PFS)
The concept that a compromise of keys…
- … including the server’s private keys…
Will not allow decryption of past sessions
Solution: generate a unique session key for each encrypted connection
- And discard once done
That session key can be communicated via public key cryptography

Public vs. Symmetric Cryptography

Public key cryptosystems (such as RSA) are:
- Very secure: one message will take all the computers in the world longer than the life of the universe to crack
- Very slow: not feasible for real-time data transmission (downloading a large file, streaming, etc.)
They are typically used for key exchange, and then symmetric key cryptography is used for the rest of the communication

TLS 1.2 Handshake, part 1

Client opens a network connection to server
Client sends hello, including a random number \(R_c\) and supported ciphers
Server responds with hello, including a random number \(R_s\), selection of which client-supported cipher to use, its certificate, and requests client certificate
Client checks server certificate, and sends its own cert, and a signature of all previous messages sent
- Signed with the client certificate’s private key
Server checks client certificate and signature

TLS 1.2 Handshake, part 2

Client generates \(PMS\) (Pre-Master Secret) – a session key – and sends it to server, encrypted with server’s public key
Both sides compute \(MS\) (Master Secret) from \(PMS\), \(R_s\), and \(R_c\)
Client send a message to switch over to using the \(MS\) for symmetric encryption
- Server confirms switching over to using the \(MS\) for encryption
Both sides terminate the SSL handshake, and use \(MS\) as the symmetric encryption key

Is TLS 1.2 forward-secret?

No, it is not (generally)
Depending on the ciphers used, if the keys are later made available, then it is NOT forward-secret
Let’s review the TLS 1.2 protocol
What would make it forward-secret?
- Adding in Diffe-Hellman Key Exchange (DHE)
- Some allowed TLS 1.2 protocols have DHE, but not all
  - And it’s not required

TLS 1.3

Main improvement:
- Only allow public enryption protocols that are forward-secret (meaning: they use DHE)
Other imrpovements:
- Send more of the information as a single message, so there is less back-and-forth to the handshake
- Can initiate the connection in 4 messages rather than 6

TLS Ciphers Supported

In version 1.3, for key exchange, only forward-secret public key systems are supported
- Mostly RSA variants with a secure DH key exchange
- Version 1.2 allows many other algorithms
Once the channel is configured, multiple algorithms are supported, the most common being AES

TLS key exchange protocols

Updated version here

TLS session protocols

Updated version here

Initialization Vectors (IVs)

Each successive encrypted block depends on the previous one, so that two encrypted blocks that are the same plaintext will have different ciphertext

Message authentication code (MAC)

It’s “a short piece of information used to authenticate a message” (Wikipedia)
Like a digital signature, but:
- uses a shared secret key, so it is faster
Like a hash, but:
- provides authentication, not just a message digest
- includes the secret key, so two different connections’ MAC of the same text will be different due to different secret keys

A possible MAC

Given a message \(m\) and a secret key \(k\):
- Create a new value \(x = m \cdot k\)
  - That’s string concatenation
- Hash that message: \(h = hash(x)\)
  - Use a good hash function!
- Encrypt that hash with \(k\): \(mac = encrypt(h,k)\)
  - Likely AES (or similar) with the shared secret key
In one line: \(mac = encrypt(hash(m \cdot k),k)\)

MAC properties

Does not encrypt the message!
- If the connection is not encrypted, then the message is eavesdroppable
Cannot be forged without the secret key
- Any eavesdropper must know the secret key in order to create a valid MAC
It is “unforgeable under chosen-message attacks”
- Even if Eve can get Bob to send chosen text attacks, she needs the (unknown to her) secret key to generate a valid MAC
Provides authentication (since only the other party has the secret key)

HMAC

A MAC is a general term for a message authentication code
A HMAC uses a hash as its algorithm
Not all MACs use hashes
- A CMAC uses a cipher (AES, DES, etc.) to create the code
  - Likely truncating it to the first \(x\) bits
The example algorithm a few slides back was an HMAC
Named by their hash algorithm:
- HMAC-MD5, HMAC-SHA1, etc.

HMAC with SHA1

i_pad and o_pad have known and non-secret values

Authenticated Encryption with Associated Data (AEAD)

MACs do not encrypt the data, and we want to do so
If we combine encryption and a MAC, we get an AEAD
- Assuming a few properties about the quality of the encryption and MAC function

TLS MAC protocols

TLS MAC protocols Note that even if the MAC doesn’t support encryption (in the first 3 rows), the cipher (likely AES) still does

Updated version here

Attacking HTTPS

Attacks on HTTPS

Most are attacks on TLS in some form
Like the attacks on RSA, they tend to focus on attacking the protocol, and not cracking the encryption
- Although there are some exceptions…

Heartbleed (2012)

A problem in one implementation of the TLS protocol
- Specifically the OpenSSL library implementation
Affected about one sixth of all websites

Heartbleed

RFC 6520 allows a client to send a Heartbeat Request message – like a ping – to the server
- It consists of a string and its size (as a 16-bit int)
The server sends that message back
In OpenSSL, the library did not check the size of the message
- So if you sent “bird” and 65,536, it would send 64Kb of data back
  - Which would be a 64Kb dump of the server’s memory

xkcd on Heartbleed (# 1354)

Mining Your Ps and Qs (2012)

In 2012 it was found that 5% of TLS servers and 9% of SSH servers shared common key material
- Also, one could compute keys for about 0.5% of TLS hosts (64,000) an 1% of SSH hosts (108,000)
- Most were headless or embedded devices
Reference: Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices, USENIX ’12
But why?

Mining Your Ps and Qs

These devices generated their TLS and/or SSH keys at boot-up
- Either the first time they were powered on or each time they powered up
Recall that for RSA, \(n = p*q\)
The pseudo-random number generator (PRG) on Linux is of very low quality at boot-up
- It takes about a minute after boot-up to create good quality keys

Boot-up time versus PRG quality

Source: Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices

Apple Goto Fail (Feb 2014)

A stray goto statement kept the Apple libraries from properly checking certificates for almost a year

hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
hashOut.length = SSL_SHA1_DIGEST_LEN;
if ((err = SSLFreeBuffer(&hashCtx)) != 0)
    goto fail;
if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;  /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;

err = sslRawVerify(...)

Null Prefix Attack (2009)

String types
- C strings end in a NUL byte (‘\0’)
- Pascal style strings have the length at the beginning
  - NUL characters are valid in these strings
X.509 certificates use Pascal-style strings
Consider obtaining a certificate for:

gmail.com\0.mydomain.com

The CA sees some sub-domain off of mydomain.com
Browsers see a certificate for gmail.com

More attacks

Wikipedia lists a dozen or so

Privacy

How are you tracked online?

Can you list the ways?
I listed 10 different ways…

1. “Standard” cookies

All web connections are, by definition, stateless
A cookie allows one to “save” state
- Remember who you are, for example
So when you return to amazon.com, it knows who you are
Most sites that log you in use cookies of some sort
- Some use server variables, such as UVa’s netbadge

2. Cross-domain content

Website A can include an image (or script, or CSS file) from website B
- Now both A and B can leave cookies
- This request includes the referring website! So B can see where you have been
Part of the headers of any Stack Overflow page:

<script
src="https://ajax.googleapis.com/.../jquery.min.js">
</script>

So Google now knows what sites you’ve visited

An example close to home…

A recent cavalier daily article loaded 347 (sic) different pieces of content from 52 different domains (33 unique TLDs)

adsafeprotected.com
adsrvr.org
bootstrapcdn.com
cavalierdaily.com
cloudfront.net
digitru.st
doubleclick.net
everesttech.net
facebook.com
facebook.net
google-analytics.com
googleapis.com
google.com
googlesyndication.com
googletagmanager.com
gstatic.com
iasds01.com
imgix.net
mathtag.com
moatads.com
moatpixel.com
openx.net
quantcount.com
quantserve.com
rfihub.com
serving-sys.com
sitescout.com
spotxcdn.com
spotxchange.com
spotx.tv
thesn.net
turn.com
yahoo.com

3. “Tracking” cookies

A cookie from a known domain that attempts to track you across multiple sites
- The difference is that it is known that this particular domain does lots of tracking
- It is ultimately no different than a regular cookie
Considered spyware, they are flagged (and removed) by modern anti-virus scanners

5. Javascript tracking code

Javascript is basically somebody else’s code running in your (sandboxed) browser
One can easily include code to track users, such as via Google analytics
Google Analytics is included from http://www.virginia.edu (along with a twitter widget)

6. Non-digital purchases

Brick-and-mortar retailers have gotten in on the tracking game also:
- Grocery store “loyalty cards”
- Coupons, such as Bed Bath & Beyond’s 20% off coupons
  - They connect it to you even if you pay cash!

7. Canvas elements

The canvas element allows graphic-based mark-up, rendering of fonts, etc.
- The element can be hidden
So create a hidden canvas element, and render some text
- Each browser / OS / set of fonts will render this slightly differently
This creates a partial signature to help identify you
- We’ll see one in a bit…
There are canvas blocker extensions available for every major browser

8. Browser fingerprint

When you visit a website, your browser sends information about your host system
- OS (including version), browser (including version), etc.
Combine that with a few other things, such as canvas-rendered text …
… and you get a possibly unique signature of you

9. Financial payments

If you use PayPal, then it knows a lot about what you buy
- Likewise amazon pay, venmo, etc.
These sites have differing levels of privacy statements

10. Geo-location

For a device without a GPS, this is easy, but rough
- Go to https://www.iplocation.net/; they were within 2.2 miles for me
For any mobile device with a GPS, this is trivial
- And very precise – which store you just visited, whose house you are at, etc.
- And, likely, who else is there with you
Google continues to track your mobile device even after you have signed out or opted out (reference)

Two-Factor Authentication

aka Multi-Factor Authentication

Motivation

If 2 means are needed to authenticate, then somebody knowing your password will not, by itself, allow them access
Variations:
- SMS-based codes
- Email-based codes
- Codes via a phone call
- 2FA apps with authenticator codes
  - What UVA uses this via Duo Mobile
- Biometrics (fingerprint, face scan, etc.)
  - Apple devices use these

SMS-based versions

A code is sent, via SMS, to your phone
Very insecure!
- SMS is not encrypted
- It uses a different spectrum, but sniffers are available
Phone jacking:
- Attacker asks phone company to transfer service to a new number, provides SSN (last 4), fake ID, etc.
  - Or asks for a new SIM to be issued
  - See here and here for tips how to prevent
- All 2FA codes via SMS are now visible
  - Biometric security (face scans, etc.) will not stop this, since that is phone-specific, and the attacker has a new phone

Email-based versions

Similar to the SMS version, but slightly more secure
- Unless somebody has access to your email
- Or if the email is not sent via encryption, and somebody can eavesdrop on the connection

Google Authenticator

This is what Duo Mobile uses; see here for more details
Google created the standard, they don’t manage the system
A one-time code (typically 6 digits) is needed for the 2FA
Two types:
- HOTP: HMAC-based one-time password
  - Used with the hardware tokens
- TOTP: Time-based one-time password
These are the cornerstones of OATH

HOTP: HMAC-based one-time password

Four values are set:
- \(H\): A hashing algorithm, typically SHA-1
- \(k\): A secret key (random byte string)
- \(c\): Counter starting from 0
- \(d\): the number of digits provided in the 2FA code
Algorithm:
- \(HOTPvalue = HOTP(k,c) \mod 10^d\)
- \(HOTP(k,c)=truncate(HMAC_H(k,c))\)
- \(truncate(MAC) = extract31(MAC, MAC[(19 × 8 + 4):(19 × 8 + 7)])\)
- \(extract31(MAC, i) = MAC[(i × 8 + 1):(i × 8 + 4 × 8 − 1)]\)
- increment \(c\) afterward

HOTP synchronization issues

Each side may increment independently
- The authenticator will try the next few counter values just in case
A resynchronization protocol is defined in RFC 4226
To prevent exploiting this via a brute-force attack:
- Throttle the attempts: block after too many failed attempts or linearly increase the delay after each failed attempt
Often used in hardware keys

TOTP: Time-based one-time password

Uses the UNIX timestamp (number of seconds since January 1, 1970); we are currently around 1.7 billion (see here)
Six values are set:
- The 4 HOTP values (\(H\), \(k\), \(c\), and \(d\))
- The “start time” \(T_0\) (typically 0)
- The time interval \(T_x\), tyipcally 30 seconds
Algorithm
- Uses the HOTP algorithm, but replaces the counter with \(C_T\), a non-decreasing value based on the current time
  - If \(T\) is the current UNIX timestamp, then \(C_T\) is:
    - \(C_T=T-(T \mod 30) = \lfloor \frac{T-T_0}{T_x} \rfloor\)
    - \(TOTPvalue(k) = HOTPvalue(k,C_T)\)

2FA QR Codes

The service sets the parameters, and provides a QR code
The client accepts those parameters (or not)
You can create your own here

Attacking 2FA

The wrench method
SIM jacking
Stealing the phone
Setting up 2FA before you do
Social engineering
Fatigue Attack
- Keep logging in with the user’s password
  - Each time it prompts a 2FA request
- Eventually the user becomes so fatigued that they click “accept” to shut the thing up
- This happened at UVA a few years ago (2023 or so)

Advice

Use Duo Mobile (or another authenticator) instead of SMS for your 2FA
- SIM jacking does not work, since codes are not sent via SMS
- It’s on your phonse, so requires biometrics to access
- The data is sent encrypted, unlike SMS or (possibly) email

2FA Warning

Somebody is going to set up 2FA on your accounts. It should probably be you.

If you wait to long, it will be somebody else.

CS 3710

Introduction to Cybersecurity

Web Security

Credit Where Credit is Due

Key Exchange

Key Exchange

DH key exchange

DH key exchange example

DH keyexchangeexampleas colors

Colors in video (at 2:43)

DH key exchange in practical use

MITM versus DH key exchange?

Certificates

Definition

Diagram of an X.509 cert

Certificate usage

Certificate validity

Certificate Chains

Trusting Trust

Certificate Authorities (CAs)

CA Revocation

Trusted root cert count

UVA’s Certificate Authority

SSL & TLS

SSL & TLS

TLS Usage

Forward Secrecy (FS)

Public vs. Symmetric Cryptography

TLS 1.2 Handshake, part 1

TLS 1.2 Handshake, part 2

Is TLS 1.2 forward-secret?

Is TLS 1.2 forward-secret?

TLS 1.3

TLS Ciphers Supported

TLS key exchange protocols

TLS session protocols

Initialization Vectors (IVs)

Message authentication code (MAC)

A possible MAC

MAC properties

HMAC

HMAC with SHA1

Authenticated Encryption with Associated Data (AEAD)

TLS MAC protocols

Attacking HTTPS

Attacks on HTTPS

Heartbleed (2012)

Heartbleed

xkcd on Heartbleed (# 1354)

xkcd on Heartbleed (# 1354)

xkcd on Heartbleed (# 1354)

Mining Your Ps and Qs (2012)

Mining Your Ps and Qs

Boot-up time versus PRG quality

Apple Goto Fail (Feb 2014)

Null Prefix Attack (2009)

More attacks

Privacy

How are you tracked online?

1. “Standard” cookies

Cookie exchange

What a cookie stores

2. Cross-domain content

An example close to home…

3. “Tracking” cookies

4. Social media comment boxes

5. Javascript tracking code

6. Non-digital purchases

7. Canvas elements

8. Browser fingerprint

More on browser fingerprints

9. Financial payments

10. Geo-location

Two-Factor Authentication

Motivation

SMS-based versions

Email-based versions

Google Authenticator

HOTP: HMAC-based one-time password

HOTP synchronization issues

DH key
exchange
example
as colors