CS 3710

Introduction to Cybersecurity

 

Aaron Bloomfield (aaron@virginia.edu)

 

Web Security

Credit Where Credit is Due

Much of the material in this slide set, including some of the images, was obtained from EECS 388 from the CS department at the University of Michigan. That course material was released under a CC BY-SA license. Other images are from Wikipedia, and each of those images is linked to its source on Wikipedia.

Key Exchange

Key Exchange

  • If you want to communicate with another party, how do you establish a shared secret key?
    • One solution: Key stores, but only if you both have generated public/private key sets
      • Not always practical for all security protocols
  • Another solution: Diffie–Hellman key exchange
    • This version uses discrete logs
    • A variant called ECDH uses elliptic curves

DH key exchange

  • Alice and Bob agree on public values g (generator) and p (prime or modulus)
  • Alice picks a private a and sends to Bob \(A=g^a \text{ mod } p\)
  • Bob picks a private b and sends to Alice \(B=g^b \text{ mod } p\)
  • Because \(g^{ab} \text{ mod } p = g^{ba} \text{ mod } p\), both can compute the secret key
    • Alice received \(g^b \text{ mod } p\) and computes \((g^b \text{ mod } p)^a \text{ mod } p = g^{ab} \text{ mod } p\)
    • Similarly for Bob

DH key exchange example

From Wikipedia; blue are public, red are secret
  1. Alice and Bob agree to use a modulus p = 23 and base g = 5 (which is a primitive root modulo 23).
  2. Alice chooses a secret integer a = 4, then sends Bob \(A = g^a \text{ mod } p\)
    • \(A = 5^4 \text{ mod } 23 = 4\)
  3. Bob chooses a secret integer b = 3, then sends Alice \(B = g^b \text{ mod } p\)
    • \(B = 5^3 \text{ mod } 23 = 10\)
  4. Alice computes \(s = B^a \text{ mod } p\)
    • \(s = 10^4 \text{ mod } 23 = 18\)
  5. Bob computes \(s = A^b \text{ mod } p\)
    • \(s = 4^3 \text{ mod } 23 = 18\)
  6. Alice and Bob now share a secret (the number 18)
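
The arithmetic in this example can be checked in a few lines. What follows is a minimal Python sketch using the toy values from the slide; the variable names are illustrative, and real exchanges use far larger numbers.

```python
# Toy Diffie-Hellman exchange using the small values from the example above.
# pow(base, exp, mod) is Python's built-in modular exponentiation.
p, g = 23, 5             # public modulus and base

a = 4                    # Alice's secret
b = 3                    # Bob's secret

A = pow(g, a, p)         # Alice sends this to Bob
B = pow(g, b, p)         # Bob sends this to Alice

s_alice = pow(B, a, p)   # Alice combines Bob's value with her secret
s_bob = pow(A, b, p)     # Bob combines Alice's value with his secret

print(A, B, s_alice, s_bob)
```

Only A and B ever cross the wire; a, b, and the shared secret never do.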

DH key exchange example as colors

Colors in video (at 2:43)

If that doesn’t work, try the direct YouTube link, but start at 2:43

DH key exchange in practical use

  • Used in many cryptographic protocols
    • Including ones we will see shortly
  • Uses very large numbers created with a very good pseudo-random number generator

MITM versus DH key exchange?

  • Possible if Eve actively controls the communication medium
  • If s/he is only a passive listener, then it is not possible
  • How to avoid an active compromiser:
    • Post on a public forum
    • Use communication methods s/he can’t compromise, such as shortwave radio

Certificates

Definition

  • A certificate is:
    • A public key
    • Information about the key (algorithm used, version number, etc.)
    • An identity (hostname, individual, etc.)
    • Other meta-data (expiration date, etc.)
    • A signature from a certificate authority
  • Most common format is X.509

Diagram of an X.509 cert

Certificate usage

  • All websites that support https provide such a certificate to any client
  • Browsers can display certificate info, such as shown here

Certificate validity

  • All certificates are signed with the (private) key of a certificate authority (CA)
    • Their public keys are well known, and are hard-coded into browsers
    • Or determined by key exchange protocols discussed in this slide set
  • One can verify such a certificate by the (known) public key of the CA
  • One can “self sign” a certificate, but modern browsers typically won’t allow viewing without manual exceptions
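
As a rough illustration of how a client relies on the built-in CA list, the sketch below uses Python's standard `ssl` module. The helper name `fetch_certificate` is hypothetical, and it is defined but not called here since it needs network access.

```python
import socket
import ssl

# The default context loads the platform's trusted root CAs and
# enables both chain verification and hostname checking.
context = ssl.create_default_context()

def fetch_certificate(hostname, port=443):
    """Connect over TLS and return the server's (already verified) certificate.

    Raises ssl.SSLCertVerificationError if the chain does not lead to a
    trusted CA, for example with a self-signed certificate.
    """
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()

print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `fetch_certificate` on a host with a self-signed certificate should fail with a verification error, which is the programmatic equivalent of the browser warning.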

Certificate Chains

An entity (such as virginia.edu) can hold an intermediate certificate, which allows it to sign for entities in its division (such as news.virginia.edu)

Trusting Trust

  • A server’s certificate is signed by a Certificate Authority
    • Perhaps through an intermediary certificate
  • So we trust the server’s certificate because the CA vouches for it
  • And we trust the CA because… everybody else does?
  • At some point, we have to trust that the system is well designed and that it works

Certificate Authorities (CAs)

  • Any CA can “vouch” (i.e., sign) for any certificate
    • For any website or any (likely new) CA
  • Thus, the security is based on the weakest CA
  • What happens if a CA is compromised or goes rogue?

CA Revocation

Trusted root cert count

  • For Apple devices: 159 as of Feb 21, 2024
  • For Mozilla / Firefox: 170 as of Feb 21, 2024
  • For Microsoft: 487 as of Feb 21, 2024
  • Other platforms (Chrome, Android, etc.) only seem to list how to find it on the browser (Chrome) or device (Android)

UVA’s Certificate Authority

  • If you ran UVA’s Network Setup Tool, it will have installed its certificate provider as a trusted certificate authority
    • Note that the image shows how to make it “fully” trusted; it’s still (mostly) trusted without the slider switch on
  • This allows your device to automatically accept UVA certificates without dealing with separate signatures

SSL & TLS

SSL & TLS

  • They are cryptographic protocols used to encrypt network communication
  • SSL: Secure Sockets Layer
    • v1.0 never released due to flaws
    • v2.0 released in 1995; broken and not used
    • v3.0 released in 1996; broken and not used
  • TLS: Transport Layer Security, the successor to SSL
    • v1.0 released Jan 1999; currently deprecated
    • v1.1 released Apr 2006; currently deprecated
    • v1.2 released Aug 2008; in widespread use
    • v1.3 released Aug 2018; currently being adopted

TLS Usage

  • Just about any application protocol can be layered on top of TLS
  • Most common use: HTTP layered on top of TLS, which yields HTTPS
  • Also used for encrypting mail (SMTP)
  • A programmer can use it for any other purpose

Forward Secrecy (FS)

  • Also known as perfect forward secrecy (PFS)
  • The concept that a compromise of keys…
    • … including the server’s private keys…
  • Will not allow decryption of past sessions
  • Solution: generate a unique session key for each encrypted connection
    • And discard once done
  • That session key can be communicated via public key cryptography

Public vs. Symmetric Cryptography

  • Public key cryptosystems (such as RSA) are:
    • Very secure: one message will take all the computers in the world longer than the life of the universe to crack
    • Very slow: not feasible for real-time data transmission (downloading a large file, streaming, etc.)
  • They are typically used for key exchange, and then symmetric key cryptography is used for the rest of the communication

TLS 1.2 Handshake, part 1

  • Client opens a network connection to server
  • Client sends hello, including a random number \(R_c\) and supported ciphers
  • Server responds with hello, including a random number \(R_s\), selection of which client-supported cipher to use, its certificate, and requests client certificate
  • Client checks server certificate, and sends its own cert, and a signature of all previous messages sent
    • Signed with the client certificate’s private key
  • Server checks client certificate and signature

TLS 1.2 Handshake, part 2

  • Client generates \(PMS\) (Pre-Master Secret) – a session key – and sends it to server, encrypted with server’s public key
  • Both sides compute \(MS\) (Master Secret) from \(PMS\), \(R_s\), and \(R_c\)
  • Client sends a message to switch over to using the \(MS\) for symmetric encryption
    • Server confirms switching over to using the \(MS\) for encryption
  • Both sides complete the TLS handshake, and use \(MS\) as the symmetric encryption key
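
The step where both sides compute \(MS\) from \(PMS\), \(R_s\), and \(R_c\) can be sketched as follows. This assumes the SHA-256 based pseudo-random function that TLS 1.2 specifies; the function names and the random stand-in values are illustrative. In the real protocol the \(MS\) is expanded further into separate encryption and MAC keys, which is omitted here.

```python
import hashlib
import hmac
import os

def p_sha256(secret, seed, length):
    """P_hash expansion function from the TLS 1.2 PRF, using HMAC-SHA256."""
    output = b""
    a = seed                                               # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()   # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]

def master_secret(pms, client_random, server_random):
    """Derive the 48-byte MS from PMS, R_c, and R_s."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(pms, seed, 48)

# Made-up values standing in for what the handshake would exchange.
r_c = os.urandom(32)
r_s = os.urandom(32)
pms = os.urandom(48)

ms_client = master_secret(pms, r_c, r_s)   # computed by the client
ms_server = master_secret(pms, r_c, r_s)   # computed by the server
print(ms_client == ms_server, len(ms_client))
```

Because both random values feed into the derivation, replaying an old \(PMS\) in a new connection yields a different \(MS\).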

Is TLS 1.2 forward-secret?

Is TLS 1.2 forward-secret?

  • No, it is not (generally)
  • Depending on the ciphers used, if the keys are later made available, then it is NOT forward-secret
  • Let’s review the TLS 1.2 protocol
  • What would make it forward-secret?
    • Adding in ephemeral Diffie–Hellman key exchange (DHE)
    • Some allowed TLS 1.2 protocols have DHE, but not all
      • And it’s not required
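
A minimal sketch of the idea behind DHE: every connection runs a fresh exchange with one-time secrets that are thrown away afterwards, so a later compromise of the server's long-term key does not reveal old session keys. The prime and generator below are toy assumptions, far too small for real use.

```python
import hashlib
import secrets

# Toy public parameters: 2**127 - 1 is a (Mersenne) prime, but it is far too
# small for real use; actual deployments use standardized, much larger groups.
P = 2**127 - 1
G = 3

def ephemeral_session_key():
    """Run one DH exchange with fresh secrets and return both sides' keys."""
    a = secrets.randbelow(P - 2) + 1    # client's one-time secret
    b = secrets.randbelow(P - 2) + 1    # server's one-time secret
    A, B = pow(G, a, P), pow(G, b, P)   # the only values sent on the wire
    client_key = hashlib.sha256(str(pow(B, a, P)).encode()).digest()
    server_key = hashlib.sha256(str(pow(A, b, P)).encode()).digest()
    # a and b go out of scope here: nothing long-lived can recreate the key.
    return client_key, server_key

k1_client, k1_server = ephemeral_session_key()
k2_client, k2_server = ephemeral_session_key()
print(k1_client == k1_server, k1_client != k2_client)
```

In a real handshake the server additionally signs its DH value with its certificate's private key, so the long-term key is used only for authentication and never for protecting the session key itself.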

TLS 1.3

  • Main improvement:
    • Only allow public encryption protocols that are forward-secret (meaning: they use DHE)
  • Other improvements:
    • Send more of the information as a single message, so there is less back-and-forth to the handshake
    • Can initiate the connection in 4 messages rather than 6

TLS Ciphers Supported

  • In version 1.3, for key exchange, only forward-secret public key systems are supported
    • Mostly RSA variants with a secure DH key exchange
    • Version 1.2 allows many other algorithms
  • Once the channel is configured, multiple algorithms are supported, the most common being AES

TLS key exchange protocols

Updated version here

TLS session protocols

Updated version here

Initialization Vectors (IVs)

Each successive encrypted block depends on the previous one, so that two encrypted blocks that are the same plaintext will have different ciphertext

Message authentication code (MAC)

  • It’s “a short piece of information used to authenticate a message” (Wikipedia)
  • Like a digital signature, but:
    • uses a shared secret key, so it is faster
  • Like a hash, but:
    • provides authentication, not just a message digest
    • includes the secret key, so two different connections’ MAC of the same text will be different due to different secret keys

A possible MAC

  • Given a message \(m\) and a secret key \(k\):
    • Create a new value \(x = m \cdot k\)
      • That’s string concatenation
    • Hash that message: \(h = hash(x)\)
      • Use a good hash function!
    • Encrypt that hash with \(k\): \(mac = encrypt(h,k)\)
      • Likely AES (or similar) with the shared secret key
  • In one line: \(mac = encrypt(hash(m \cdot k),k)\)

MAC properties

  • Does not encrypt the message!
    • If the connection is not encrypted, then the message is eavesdroppable
  • Cannot be forged without the secret key
    • Any eavesdropper must know the secret key in order to create a valid MAC
  • It is “unforgeable under chosen-message attacks”
    • Even if Eve can get Bob to send MACs for messages of her choosing, she still needs the (unknown to her) secret key to generate a valid MAC for any new message
  • Provides authentication (since only the other party has the secret key)
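
These properties can be tried out with Python's standard `hmac` module. Note that this sketch uses HMAC-SHA256 rather than the hash-then-encrypt construction from the previous slide; the key and messages are made up.

```python
import hashlib
import hmac

def make_tag(key, message):
    """Compute a MAC over the message with the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared secret between Alice and Bob"   # made-up key
message = b"pay Bob $10"
tag = make_tag(key, message)

print(verify(key, message, tag))              # genuine message
print(verify(key, b"pay Eve $1000", tag))     # tampered message
print(verify(b"Eve's guess", message, tag))   # wrong key
```

The message itself travels in the clear next to the tag; the MAC only lets the receiver detect tampering.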

HMAC

  • A MAC is a general term for a message authentication code
  • An HMAC uses a hash as its algorithm
  • Not all MACs use hashes
    • A CMAC uses a cipher (AES, DES, etc.) to create the code
      • Likely truncating it to the first \(x\) bits
  • The example algorithm a few slides back was an HMAC
  • Named by their hash algorithm:
    • HMAC-MD5, HMAC-SHA1, etc.

HMAC with SHA1

i_pad and o_pad have known and non-secret values
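
The construction can be written out by hand and compared against the standard library. This is a sketch for illustration (in practice one would just call `hmac.new`); i_pad is the byte 0x36 repeated and o_pad is the byte 0x5C repeated, and the key and message are made up.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-1 processes input in 64-byte blocks

def hmac_sha1(key, message):
    """HMAC-SHA1 built by hand from the i_pad / o_pad construction."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()    # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # then zero-padded to one block

    i_key = bytes(k ^ 0x36 for k in key)    # key XOR i_pad
    o_key = bytes(k ^ 0x5C for k in key)    # key XOR o_pad

    inner = hashlib.sha1(i_key + message).digest()
    return hashlib.sha1(o_key + inner).digest()

key = b"made-up secret key"
message = b"hello, world"
print(hmac_sha1(key, message) == hmac.new(key, message, hashlib.sha1).digest())
```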

Authenticated Encryption with Associated Data (AEAD)

  • MACs do not encrypt the data, and we want to do so
  • If we combine encryption and a MAC, we get an AEAD
    • Assuming a few properties about the quality of the encryption and MAC function

TLS MAC protocols

Note that even if the MAC doesn’t support encryption (in the first 3 rows), the cipher (likely AES) still does

Updated version here

Attacking HTTPS

Attacks on HTTPS

  • Most are attacks on TLS in some form
  • Like the attacks on RSA, they tend to focus on attacking the protocol, and not cracking the encryption
    • Although there are some exceptions…

Heartbleed (introduced 2012, disclosed 2014)

  • A problem in one implementation of the TLS protocol
    • Specifically the OpenSSL library implementation
  • Affected about one sixth of all websites

Heartbleed

  • RFC 6520 allows a client to send a Heartbeat Request message – like a ping – to the server
    • It consists of a string and its size (as a 16-bit int)
  • The server sends that message back
  • In OpenSSL, the library did not check the size of the message
    • So if you sent “bird” and 65,535 (the largest 16-bit value), it would send 64 KB of data back
      • Which would be a 64 KB dump of the server’s memory
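
The missing check can be simulated in a few lines. This is only a model of the bug, not OpenSSL's code: the "memory" is a byte string with invented contents, and the function names are hypothetical.

```python
# Simulated server memory: the heartbeat payload sits right next to secrets.
# All contents here are invented for illustration.
payload = b"bird"
memory = payload + b"|user=alice|password=hunter2|private_key=..."

def heartbeat_buggy(claimed_length):
    """Echo back claimed_length bytes without checking the real payload size."""
    return memory[:claimed_length]

def heartbeat_fixed(claimed_length):
    """Discard requests whose claimed length exceeds the actual payload."""
    if claimed_length > len(payload):
        return None
    return memory[:claimed_length]

print(heartbeat_buggy(4))
print(heartbeat_buggy(40))
print(heartbeat_fixed(40))
```

The fix amounts to a single bounds check: compare the claimed length against what was actually received before copying anything.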

xkcd on Heartbleed (# 1354)


Mining Your Ps and Qs (2012)

Mining Your Ps and Qs

  • Many network devices generated their TLS and/or SSH keys at boot-up
    • Either the first time they were powered on or each time they powered up
  • Recall that for RSA, \(n = p \cdot q\)
  • The pseudo-random number generator (PRG) on Linux is of very low quality at boot-up
    • It takes about a minute after boot-up to create good quality keys
  • If two devices end up picking the same prime \(p\), then \(\gcd(n_1, n_2) = p\), which factors both keys
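
Why poor randomness at key-generation time is so damaging for RSA: if two devices happen to choose the same prime, a single gcd computation on their public moduli recovers it. A minimal sketch with small, made-up primes (real RSA primes are hundreds of digits long):

```python
from math import gcd

# Small well-known primes standing in for real RSA primes.
p_shared = 104729     # both devices picked this prime (bad randomness)
q1 = 7919             # device 1's second prime
q2 = 541              # device 2's second prime

n1 = p_shared * q1    # public modulus of device 1
n2 = p_shared * q2    # public modulus of device 2

common = gcd(n1, n2)  # fast, even for real-sized moduli
print(common, n1 // common, n2 // common)
```

Neither modulus had to be factored directly; the attacker only needs a large collection of public keys to compare against each other.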

Boot-up time versus PRG quality

Source: Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices

Apple Goto Fail (Feb 2014)

  • A stray goto statement kept the Apple libraries from properly checking certificates for almost a year
hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
hashOut.length = SSL_SHA1_DIGEST_LEN;
if ((err = SSLFreeBuffer(&hashCtx)) != 0)
    goto fail;
if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;  /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;

err = sslRawVerify(...)

Null Prefix Attack (2009)

  • String types
    • C strings end in a NUL byte (‘\0’)
    • Pascal style strings have the length at the beginning
      • NUL characters are valid in these strings
  • X.509 certificates use Pascal-style strings
  • Consider obtaining a certificate for:
gmail.com\0.mydomain.com
  • The CA sees some sub-domain off of mydomain.com
  • Browsers see a certificate for gmail.com
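
The mismatch between the two string conventions can be shown directly. In this sketch the two functions are hypothetical stand-ins for what the CA and a C-based client each effectively did with the name.

```python
# Certificate name as stored in the certificate: a length-prefixed
# (Pascal-style) string, so the embedded NUL byte is ordinary data.
cert_name = b"gmail.com\x00.mydomain.com"

def ca_view(name):
    """A CA that checks which domain the full name falls under."""
    return name.endswith(b".mydomain.com")

def c_string_view(name):
    """A client that treats the name as a C string stops at the first NUL."""
    return name.split(b"\x00", 1)[0]

print(ca_view(cert_name))
print(c_string_view(cert_name))
```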

More attacks

Privacy

How are you tracked online?

  • Can you list the ways?
  • I listed 10 different ways…

1. “Standard” cookies

  • All web connections are, by definition, stateless
  • A cookie allows one to “save” state
    • Remember who you are, for example
  • So when you return to amazon.com, it knows who you are
  • Most sites that log you in use cookies of some sort
    • Some use server variables, such as UVA’s NetBadge

2. Cross-domain content

  • Website A can include an image (or script, or CSS file) from website B
    • Now both A and B can leave cookies
    • This request includes the referring website! So B can see where you have been
  • Part of the headers of any Stack Overflow page:
<script
src="https://ajax.googleapis.com/.../jquery.min.js">
</script>
  • So Google now knows what sites you’ve visited

An example close to home…

  • A recent Cavalier Daily article loaded 347 (sic) different pieces of content from 52 different hosts (33 unique registered domains)
adsafeprotected.com
adsrvr.org
bootstrapcdn.com
cavalierdaily.com
cloudfront.net
digitru.st
doubleclick.net
everesttech.net
facebook.com
facebook.net
google-analytics.com
googleapis.com
google.com
googlesyndication.com
googletagmanager.com
gstatic.com
iasds01.com
imgix.net
mathtag.com
moatads.com
moatpixel.com
openx.net
quantcount.com
quantserve.com
rfihub.com
serving-sys.com
sitescout.com
spotxcdn.com
spotxchange.com
spotx.tv
thesn.net
turn.com
yahoo.com

3. “Tracking” cookies

  • A cookie from a known domain that attempts to track you across multiple sites
    • The difference is that it is known that this particular domain does lots of tracking
    • It is ultimately no different than a regular cookie
  • Considered spyware, they are flagged (and removed) by modern anti-virus scanners

4. Social media comment boxes

  • Social media comment boxes, and those tiny little icons that encourage you to share the page on social media
  • It’s actually an iframe with FB content, and obviously includes the referrer
  • Now Facebook knows what articles you read
  • Many social media widgets are similar

5. JavaScript tracking code

  • JavaScript is basically somebody else’s code running in your (sandboxed) browser
  • One can easily include code to track users, such as via Google Analytics
  • Google Analytics is included from http://www.virginia.edu (along with a Twitter widget)

6. Non-digital purchases

  • Brick-and-mortar retailers have gotten in on the tracking game also:
    • Grocery store “loyalty cards”
    • Coupons, such as Bed Bath & Beyond’s 20% off coupons
      • They connect it to you even if you pay cash!

7. Canvas elements

  • The canvas element allows graphic-based mark-up, rendering of fonts, etc.
    • The element can be hidden
  • So create a hidden canvas element, and render some text
    • Each browser / OS / set of fonts will render this slightly differently
  • This creates a partial signature to help identify you
    • We’ll see one in a bit…
  • There are canvas blocker extensions available for every major browser

8. Browser fingerprint

  • When you visit a website, your browser sends information about your host system
    • OS (including version), browser (including version), etc.
  • Combine that with a few other things, such as canvas-rendered text …
  • … and you get a possibly unique signature of you

More on browser fingerprints

  • These can be used to track you even after you have cleared your cookies
    • And they can link you to previous visits
  • Is yours unique? Check out https://amiunique.org
    • It shows a canvas-based text rendering

9. Financial payments

  • If you use PayPal, then it knows a lot about what you buy
    • Likewise Amazon Pay, Venmo, etc.
  • These sites have differing levels of privacy statements

10. Geo-location

  • For a device without a GPS, this is easy (e.g., via the IP address), but rough
  • For any mobile device with a GPS, this is trivial
    • And very precise – which store you just visited, whose house you are at, etc.
    • And, likely, who else is there with you
  • Google continues to track your mobile device even after you have signed out or opted out (reference)

Two-Factor Authentication

aka Multi-Factor Authentication

Motivation

  • If 2 means are needed to authenticate, then somebody knowing your password will not, by itself, allow them access
  • Variations:
    • SMS-based codes
    • Email-based codes
    • Codes via a phone call
    • 2FA apps with authenticator codes
      • UVA uses this via Duo Mobile
    • Biometrics (fingerprint, face scan, etc.)
      • Apple devices use these

SMS-based versions

  • A code is sent, via SMS, to your phone
  • Very insecure!
    • SMS is not encrypted
    • It uses a different spectrum, but sniffers are available
  • Phone jacking:
    • Attacker asks phone company to transfer service to a new number, provides SSN (last 4), fake ID, etc.
      • Or asks for a new SIM to be issued
      • See here and here for tips on how to prevent this
    • All 2FA codes via SMS are now visible
      • Biometric security (face scans, etc.) will not stop this, since that is phone-specific, and the attacker has a new phone

Email-based versions

  • Similar to the SMS version, but slightly more secure
    • Unless somebody has access to your email
    • Or if the email is not sent via encryption, and somebody can eavesdrop on the connection

Google Authenticator

  • This is what Duo Mobile uses; see here for more details
  • Google created the standard, but they don’t manage the system
  • A one-time code (typically 6 digits) is needed for the 2FA
  • Two types:
    • HOTP: HMAC-based one-time password
      • Used with the hardware tokens
    • TOTP: Time-based one-time password
  • These are the cornerstones of OATH

HOTP: HMAC-based one-time password

  • Four values are set:
    • \(H\): A hashing algorithm, typically SHA-1
    • \(k\): A secret key (random byte string)
    • \(c\): Counter starting from 0
    • \(d\): the number of digits provided in the 2FA code
  • Algorithm:
    • \(HOTPvalue = HOTP(k,c) \bmod 10^d\)
    • \(HOTP(k,c)=truncate(HMAC_H(k,c))\)
    • \(truncate(MAC) = extract31(MAC, MAC[(19 \times 8 + 4):(19 \times 8 + 7)])\)
    • \(extract31(MAC, i) = MAC[(i \times 8 + 1):(i \times 8 + 4 \times 8 - 1)]\)
    • increment \(c\) afterward
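
The algorithm can be sketched in Python as follows, assuming SHA-1 as \(H\) and an 8-byte big-endian counter as specified in RFC 4226. The bit-level notation above (the low 4 bits of byte 19 select an offset, then 31 bits are extracted from there) becomes a byte slice and a mask.

```python
import hashlib
import hmac
import struct

def hotp(key, counter, digits=6):
    """HOTP value for a secret key and counter, using HMAC-SHA1."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[19] & 0x0F                    # low 4 bits of the last byte
    chunk = struct.unpack(">I", mac[offset:offset + 4])[0]
    value = chunk & 0x7FFFFFFF                 # keep 31 bits (drop the top bit)
    return str(value % 10**digits).zfill(digits)

key = b"12345678901234567890"   # test key from RFC 4226, Appendix D
print(hotp(key, 0), hotp(key, 1))
```

The code is returned as a zero-padded string, since a value such as 012345 is a valid 6-digit code.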

HOTP synchronization issues

  • Each side may increment independently
    • The authenticator will try the next few counter values just in case
  • A resynchronization protocol is defined in RFC 4226
  • To prevent exploiting this via a brute-force attack:
    • Throttle the attempts: block after too many failed attempts or linearly increase the delay after each failed attempt
  • Often used in hardware keys
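
The "try the next few counter values" idea can be sketched as a look-ahead window on the verifying side. The function name `verify_hotp` and the window size of 3 are assumptions for illustration; throttling of failed attempts is left out.

```python
import hashlib
import hmac
import struct

def hotp(key, counter, digits=6):
    """HOTP value for a secret key and counter, using HMAC-SHA1."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[19] & 0x0F
    chunk = struct.unpack(">I", mac[offset:offset + 4])[0]
    return str((chunk & 0x7FFFFFFF) % 10**digits).zfill(digits)

def verify_hotp(key, server_counter, code, look_ahead=3):
    """Try the next few counter values; return the new counter or None."""
    for c in range(server_counter, server_counter + look_ahead + 1):
        if hmac.compare_digest(hotp(key, c), code):
            return c + 1   # resynchronize past the accepted counter
    return None

key = b"12345678901234567890"
# The token was pressed twice without logging in, so it is at counter 2.
code_from_token = hotp(key, 2)
print(verify_hotp(key, 0, code_from_token))
print(verify_hotp(key, 0, hotp(key, 9)))
```

A larger window is more forgiving of accidental button presses, but gives an attacker more valid codes to hit per guess, which is why it is paired with throttling.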

TOTP: Time-based one-time password

  • Uses the UNIX timestamp (number of seconds since January 1, 1970); we are currently around 1.7 billion (see here)
  • Six values are set:
    • The 4 HOTP values (\(H\), \(k\), \(c\), and \(d\))
    • The “start time” \(T_0\) (typically 0)
    • The time interval \(T_x\), typically 30 seconds
  • Algorithm
    • Uses the HOTP algorithm, but replaces the counter with \(C_T\), a non-decreasing value based on the current time
      • If \(T\) is the current UNIX timestamp, then \(C_T\) is:
        • \(C_T = \lfloor \frac{T-T_0}{T_x} \rfloor\) (with \(T_0 = 0\) and \(T_x = 30\), that is \(\frac{T-(T \bmod 30)}{30}\))
        • \(TOTPvalue(k) = HOTPvalue(k,C_T)\)
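
Put together, TOTP is a thin wrapper around HOTP. The sketch below assumes SHA-1, \(T_0 = 0\), and a 30-second step by default; the test key is the one used in the RFC test vectors, and the `now` parameter exists only so the result can be reproduced.

```python
import hashlib
import hmac
import struct
import time

def hotp(key, counter, digits=6):
    """HOTP value for a secret key and counter, using HMAC-SHA1."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[19] & 0x0F
    chunk = struct.unpack(">I", mac[offset:offset + 4])[0]
    return str((chunk & 0x7FFFFFFF) % 10**digits).zfill(digits)

def totp(key, now=None, t0=0, step=30, digits=6):
    """TOTP value: HOTP with the counter replaced by the time-step number."""
    if now is None:
        now = int(time.time())
    counter = (now - t0) // step     # C_T = floor((T - T_0) / T_x)
    return hotp(key, counter, digits)

key = b"12345678901234567890"
print(totp(key))             # changes every 30 seconds
print(totp(key, now=59))     # fixed timestamp, reproducible
```

Both sides only need the shared key and roughly synchronized clocks; no counter has to be kept in step.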

2FA QR Codes

  • The service sets the parameters, and provides a QR code
  • The client accepts those parameters (or not)
  • You can create your own here
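
Such QR codes commonly encode a URI in the `otpauth://` format documented for Google Authenticator; whether a particular service uses exactly this format is an assumption. A sketch of building one, with a made-up account and issuer (rendering the URI as a QR image is left to a QR library):

```python
import base64
import os
from urllib.parse import quote, urlencode

def provisioning_uri(secret, account, issuer, digits=6, period=30):
    """Build an otpauth:// URI carrying the TOTP parameters."""
    label = quote(f"{issuer}:{account}")
    params = urlencode({
        "secret": base64.b32encode(secret).decode().rstrip("="),
        "issuer": issuer,
        "algorithm": "SHA1",
        "digits": digits,
        "period": period,
    })
    return f"otpauth://totp/{label}?{params}"

secret = os.urandom(20)   # the shared key k, generated by the service
uri = provisioning_uri(secret, "mst3k@example.com", "ExampleService")
print(uri)
```

Anyone who can photograph the QR code learns the secret key, so it should be treated like a password.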

Attacking 2FA

  • The wrench method
  • SIM jacking
  • Stealing the phone
  • Setting up 2FA before you do
  • Social engineering
  • Fatigue Attack
    • Keep logging in with the user’s password
      • Each time it prompts a 2FA request
    • Eventually the user becomes so fatigued that they click “accept” to shut the thing up
    • This happened at UVA a few years ago (2023 or so)

Advice

  • Use Duo Mobile (or another authenticator) instead of SMS for your 2FA
    • SIM jacking does not work, since codes are not sent via SMS
    • It’s on your phone, so it requires biometrics to access
    • The data is sent encrypted, unlike SMS or (possibly) email

2FA Warning

Somebody is going to set up 2FA on your accounts. It should probably be you.

If you wait too long, it will be somebody else.