Encryption

Basic Encryption

General Function

  • Input string s
  • Encryption key E
  • Decryption key D
  • E(s) = ciphertext
  • D(E(s)) = s
  • Keys are private. Function is public

Substitution Ciphers & DES

  • A basic method of encryption
  • Maps plaintext of length n to ciphertext with key in table
  • Substitution rule uses s XOR k to encrypt string
  • Finding right key size is important, depends how much security you need
  • Frequency analysis attack is important consideration for sub ciphers

Data Encryption Standard

  • 56 bit key, perform 16 rounds of XOR and shifts on 64 bit chunks of input
  • Not smart enough now, so Triple DES used, where you have 168 bit key, broken into 3 parts. Each key applied to input so DES performed 3 times on input
  • Fast because bit logic (XOR and shifts)

Doesn't work on the web. How do we get a secure key exchange between parties?

Public Key Encryption

  • Public Key distributed freely, private key never shared. You can't derive one from the other
  • Two use cases: 1) You encrypt with public key, they decrypt with private key. This is for confidentiality 2) You encrypt with private key, they decrypt with public key. This is for authenticity, and signatures

  • Relies on trapdoor functions, which are easy to compute but difficult to invert. Uses Prime Factorization and these theorems:

  • Fermat's Little Theorem For any prime p, and any int a, a^p = is congruent to a (mod p) or $$a^p (mod p) = a$$

    $$a^p-1 = 1 (modp)$$

  • Chinese Remainder Theorem Given $$x = a (mod p)$$, where a = 1, for any pair of equations $$x = 1 mod (z)$$ and $$x = 1 mod (y)$$ where z and y are relatively prime, we have $$x = 1 (mod zy)$$

Crypto Steps Using Public/Private Key Method

  1. We choose $$n = pq$$, where p and q are large primes. Extremely difficult to find p and q, because we are only given n and they are primes
  2. $$lamba = (p-1)(q-1)$$
  3. Choose e s.t. $$e < lambda$$
  4. Choose d s.t $$de = 1 mod (lambda)$$
  5. n,e are public key. n, d are private key
  6. encrypt(m) = $$m^e (modn) = c$$
  7. decrypt(c) = $$c^d (modn) = m$$ Which is derived using above theorems

Security

  • Goals for security include confidentiality/privacy, data integrity, service integrity (availability), authenticity, non-repudiation
  • You can write fake IP into outgoing packets

Masquerading Modifies request from client and then message back to client from server

Replay attack Records information (e.g. uname/password) and replays it in future

Defenses

  • All parties establish identification to each other with authentication

Public Key Infrastructure

Server sends certificate to client that client authenticates with Certificate Authorities (big ones built into browsers)

  • Public/private key done initially, then the rest of the session between client/server is done with DES (less expensive)
  • TLS/SSL (transport layer security, secure-sockets layer)

Signatures

  • Hash your message. Sign that message with the private key. Send hash to recipient. If message wasn't tampered with, then the public decryption will equal hash
  • MD5 Hash Algorithm: Divide message into 512 bit blocks and hash each block
  • Add date or random string (nonce) to hash before signing to prevent reused signatures
  • For third party sign on: authenticate user (OpenID) or authorize request (OpenAuth). Uses URL encoding to communicate back and forth

Web Specific Attacks

Scripting Attacks

  • Cross Site Scripting Inject code into a page on a site to steal data, cookies, and wreak havoc on the browsers of the people who are using the compromised site
  • Sybil Compromising trusted peer-to-peer networks. CAPTCHA usually can stop this

Privacy Attacks

  • Authorization (granting privileges to authenticated parties)

Database Privacy: How do you accomplish decent database privacy and anonymization?

  • K-Anonymity: Couple together ranges in the data so that it is not a clear one-to-one relation
  • Differential Privacy: Use statistics to change reported values randomly so the answers obtained are the same whether or not that single answer is in the database. Should be used when the questions are continuous. Repeated queries pose a threat.