CPS110: General security Landon Cox. Intro: general security Hardware reality Processor, memory,...

CPS110: General security

Landon Cox

Intro: general security

Hardware reality Processor, memory, disks, NIC Components can be used by

anyone OS abstraction

Controlled access to hardware

Security abstractions

What HW primitives exist for access control? Processor mode bit (kernel/user mode) Protected instructions (I/O instructions, halt) Protected data (page tables, interrupt vector)

Want to build two abstractions on these Identity (authentication)

Who are you?

Security policy (authorization) What are you allowed to do?

Building up from the hardware

One of the themes of this class Threads

Atomic test&set, interrupt enable/disable Transactions

Single disk sector write Reliable communication

Atomic packet send

Building up from the hardware

Already built on top of HW security primitives

Secure sharing of physical memory Use protection between address spaces (fault on unauthorized access)

Secure sharing of kernel services Use system calls to safely transition to

privileged mode (trap after syscall instruction)

Authentication: who are you?

Prove your identity to the OS Many ways to authenticate

1. Passwords2. Physical tokens3. Biometrics

Password authentication

Password Shared secret between you and the OS

Several weaknesses How should we store passwords?

Cryptographic hashes (MD5, SHA) Hashes are one-way functions Check that hash(input) =

hash(password)

Cryptographic hashes

Infeasible to reverse (even for the system!) I.e. cannot compute plaintext from digest

Collision-resistant hash function Probability (collision) < probability (HW error)

Extremely useful tool in lots of domains Can be used as a shorthand for naming objects (e.g. files)

SHA1 hashSHA1 hash

160 bitsArbitrarily large

“Hash digest”

Password authentication

What’s the weakest link in password systems? People!

People choose short passwords (brute force attacks take less time)

People choose easy-to-remember passwords (narrows the search space: dictionary attack)

People choose the same password for everything (break the weakest web site, gain access to the strongest) How could you use cryptographic hashes to solve this? User remembers one password, systems sends Hash(pass + site

name)

Weak passwords Consider an 8-character password 256^8 possible passwords

2^64 ~ 16 * billion * billion

If you only choose lower-case letters 26^8 ~ 200 billion

If you choose a word from the dictionary 320,000 words in Webster’s unabridged 1000/second find a password in 5 minutes

Researchers ran an attack on Michigan passwords Recovered thousands; some of the most popular: “beer”,

“hockey” Similar result at Berkeley CS in the 80s: “gandolph”

Physical token authentication

Require something physical E.g. a ticket to a dance performance

What if your token is stolen/forged? Require a physical token + password E.g. your ATM card + PIN

Biometric authentication

Essentially a physical token One that should be hard to steal/forge

Examples Thumbprint, retina scan, signature

Most have accuracy problems False positives (accept invalid person) False negatives (reject valid person)

Real-world authentication

How to authenticate to your credit card company? Give them your social security number They ask over the phone and check

How is this vulnerable? Company has my SSN, so they can pretend to be

me (or someone who breaks into them can pretend

to be me) Phone line snoop can pretend to be me

Authorization: what can you do?

Fundamental structure Access control matrix Rows = authentication domains Columns = objects

Two approaches to storing data: Access control lists and capabilities

File 1 File 2 File 3

User 1 RW RW RW

User 2 -- R RW

Access control lists

Standard for file systems At each object

Store who can access the object Store how the object can be accessed

On each access Check that the user has proper permissions

Use groups to make things more convenient E.g. forbes, chase, and lpcox are all in group

“prof”

Access control lists

How can you defeat ACL authorization? Villain convinces the OS it is someone else

Example Sendmail runs as root (full privileges) Attacker compromises sendmail process Can now run arbitrary code under root ID This is what you will be doing in Project 3

What if you get the ACL wrong?

Frighteningly common

“Usability and privacy: a study of Kazaa…” Good and Krekelberg, SigCHI 2003 In 12 hours, found 150 inboxes on Kazaa Observed people downloading them

Many other examples: web, NFS, etc People don’t understand software

interactions Affects even diligent users …

Data breach

Alice Bob

Despite her best efforts, Alice’s data is only as

secure as her least competent confidant.

Snoop

What to do?

Two goals1. Identify sensitive files2. Infer who is allowed to view them

Many more constraints1. Don’t impact performance2. Don’t bother the user3. Remain backwards-compatible

Identifying sensitive files

Approach: identify the common case I download sensitive docs from many sources

mail.cs.duke.edu

What do the examples have in common?Encryption!

Identifying sensitive files Approach: identify the common case

I download sensitive docs from many sources

Insight: data is often encrypted for a reason In our examples, servers allow clients to cache sensitive file To protect those files in the network, communication is encrypted

How can we tell if network communication is encrypted? Use the port (e.g., 443 is used for HTTPS) Ciphertext exhibits high “entropy” or randomness Entropy = information density, measured in bits/byte

Can inexpensively measure the entropy of every socket If data stream exhibits high entropy, see if “resulting data” hits FS If so, tag the file as sensitive

Download entropy scores

How can we distinguish compression from encryption? if compressed format, use data-similarity if actual compression, measure relative data volumes

Capabilities

At each user Store a list of objects they may access Store how those objects may be accessed

Example At user2, store ‹file2, r: file3, rw›

On each access User presents capability Capability proves they are authorized

Capabilities

Possession of capability provides authorization Like car keys If you have the door key, you can get in the car If you have the ignition key, you can drive it

How can you defeat a capability system? Forge an authorized user’s capabilities Like making a copy of someone’s car keys

Capabilities

Solution to forged capabilities? “Self-authenticating” capabilities E.g. include {user, filename}system-key

Mastercard number, expiration date Have these, you can charge to account

What makes these hard to forge? Random numbers in a sparse space

Revocation

How to revoke permissions with an ACL? Just delete/change the ACL entry

How to revoke permissions with capabilities? Not nearly as easy Could change the car door’s lock This revokes access for everyone with the

key

Dealing with security

Principle of least privilege Users get minimum privilege to do their work

Defense in depth Create multiple levels Lower-levels monitor higher ones

Minimize trusted computing base Every system has a trusted part (e.g. the thing that checks the ACL) This part has special privileges It cannot be subject to the security mechanism

Reducing the trusted base

Why reduce the size of the trusted base? Small and simple = easier to reason about Hopefully fewer bugs and security loopholes

Most systems Kernel is trusted Special user (root, Administrator) is trusted

Emerging systems Virtual machines: trust the hypervisor Trusted Platform Modules: hardware attests to

software

Common attacks

Hidden channels Buffer overflow Trojan horse

Hidden channels

Get system to communicate in unintended ways Example: tenex (supposedly secure OS)

Created a team to break in Team had all passwords within 48 hours … oops.

Goal: require 256^8 tries to see if password is right

Password checker for (i=0; i<8; i++) { if (input[i] != password[i]) { break; } }

Hidden channels: tenex

How to break? (hint: vm faults are visible) Specially arrange the input’s layout in memory Force a page fault if second character is read If you get a fault, the first character was right Do again for third, fourth, … eighth character

Can check the password in 256*8 tries

Password checker for (i=0; i<8; i++) { if (input[i] != password[i]) { break; } }

Buffer overflow

Suppose you have an on-stack buffer You read input into the buffer You don’t check the length of the input

A villain’s long input can corrupt the stack If they’re smart, they can corrupt the return value Allows villain to jump to their own code

First Internet worm did this Creator was a Cornell grad student, Robert Morris Son of a famous computer security guru He is now a professor at MIT

Trojan horse

Basic idea Give somebody something useful Make it do something evil

Examples Replace login with something that emails

passwords Send a Word document with a malicious macro Download Kazaa and install spyware Install a Sony CD and have it install a buggy DRM

root-kit

Why security is so hard

Ken Thompson’s self-replicating code

Goal: put a backdoor into login Allow “ken” to login as root w/o

password

1.Make it possible (easy)2.Hide it (tricky)

Login backdoor Step 1: modify login.c

(code A) if (name == “ken”) login as root This is obvious so how do we hide it?

Step 2: modify C compiler (code B) if (compiling login.c) compile A into binary Removes code A from login.c, keeps backdoor This is now obvious in the compiler, how do we hide it?

Step 3: distribute a buggy C compiler binary (code C) if (compiling C compiler) compile code B into binary No trace of attack in any source code (only in C compiler’s binary)

Upshot: everything is bootstrapped, everything trusts something

Conficker

Conficker and DNS

Problem: how to update your malware? Why not just host at a web site, have malware

check? Easy to catch Monitor malware, see which web sites they connect to Follow the paper trail from that site

Instead, each malware client computes random DNS names

The name of each generated domain is 4 to 10 characters, to which a randomly selected TLD is appended from the following list of 116 suffix (mapping to 110 TLDs):

[ "ac" , "ae" , "ag" , "am" , "as" , "at" , "be" , "bo" , "bz" , "ca" , "cd" , "ch" , "cl" , "cn" , "co.cr" , "co.id" , "co.il" , "co.ke" , "co.kr" , "co.nz" , "co.ug" , "co.uk" , "co.vi" , "co.za" , "com.ag" , "com.ai" , "com.ar" , "com.bo" , "com.br" , "com.bs" , "com.co" , "com.do" , "com.fj" , "com.gh" , "com.gl" , "com.gt" , "com.hn" , "com.jm" , "com.ki" , "com.lc" , "com.mt" , "com.mx" , "com.ng" , "com.ni" , "com.pa" , "com.pe" , "com.pr" , "com.pt" , "com.py" , "com.sv" , "com.tr" , "com.tt" , "com.tw" , "com.ua" , "com.uy" , "com.ve" , "cx" , "cz" , "dj" , "dk" , "dm" , "ec" , "es" , "fm" , "fr" , "gd" , "gr" , "gs" , "gy" , "hk" , "hn" , "ht" , "hu" , "ie" , "im" , "in" , "ir" , "is" , "kn" , "kz" , "la" , "lc" , "li" , "lu" , "lv" , "ly" , "md" , "me" , "mn" , "ms" , "mu" , "mw" , "my" , "nf" , "nl" , "no" , "pe" , "pk" , "pl" , "ps" , "ro" , "ru" , "sc" , "sg" , "sh" , "sk" , "su" , "tc" , "tj" , "tl" , "tn" , "to" , "tw" , "us" , "vc" , "vn" ]

Conficker and DNS

Checks to see if they can download from site until it hits Why is it doing this? What aspects of DNS does this take advantage of?

Coming up

More security (project 3) Storage and files Google

CPS110: General security Landon Cox. Intro: general security Hardware reality Processor, memory,...

Documents

Transcript of CPS110: General security Landon Cox. Intro: general security Hardware reality Processor, memory,...