CPS110: General security
Landon Cox
Intro: general security
Hardware reality
  Processor, memory, disks, NIC
  Components can be used by anyone
OS abstraction
  Controlled access to hardware
Security abstractions
What HW primitives exist for access control?
  Processor mode bit (kernel/user mode)
  Protected instructions (I/O instructions, halt)
  Protected data (page tables, interrupt vector)
Want to build two abstractions on these
  Identity (authentication): who are you?
  Security policy (authorization): what are you allowed to do?
Building up from the hardware
One of the themes of this class
Threads
  Atomic test&set, interrupt enable/disable
Transactions
  Single disk sector write
Reliable communication
  Atomic packet send
Building up from the hardware
Already built on top of HW security primitives
Secure sharing of physical memory
  Use protection between address spaces (fault on unauthorized access)
Secure sharing of kernel services
  Use system calls to safely transition to privileged mode (trap after syscall instruction)
Authentication: who are you?
Prove your identity to the OS
Many ways to authenticate
  1. Passwords
  2. Physical tokens
  3. Biometrics
Password authentication
Password
  Shared secret between you and the OS
  Several weaknesses
How should we store passwords?
  Cryptographic hashes (MD5, SHA)
  Hashes are one-way functions
  Check that hash(input) = hash(password)
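The storage scheme above can be sketched as follows. This is a minimal illustration, assuming Python's standard hashlib; the per-user salt, the choice of SHA-256, and the function names make_record and check_password are assumptions that do not appear on the slide.

```python
import hashlib
import hmac
import os

def make_record(password):
    """Return (salt, digest). Only these are stored, never the password."""
    salt = os.urandom(16)  # assumption: per-user random salt, not on the slide
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def check_password(attempt, salt, stored_digest):
    """Hash the input and compare it with the stored hash."""
    candidate = hashlib.sha256(salt + attempt.encode("utf-8")).digest()
    # compare_digest takes the same time wherever the first mismatch is
    return hmac.compare_digest(candidate, stored_digest)
```

Because the hash is one-way, even the system cannot recover the password from its own records. Deployed systems usually prefer a deliberately slow function such as hashlib.pbkdf2_hmac over a single hash call.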
Cryptographic hashes
Infeasible to reverse (even for the system!)
  I.e. cannot compute plaintext from digest
Collision-resistant hash function
  Probability (collision) < probability (HW error)
Extremely useful tool in lots of domains
  Can be used as a shorthand for naming objects (e.g. files)
[Figure: an arbitrarily large input goes through a SHA1 hash and yields a 160-bit “hash digest”]
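A short sketch of the two properties above: the digest has a fixed size whatever the input size, and it can serve as a name for the object. It assumes Python's hashlib; the helper name object_name is hypothetical.

```python
import hashlib

def object_name(data):
    """Name an object (e.g. a file's contents) by its SHA-1 digest in hex."""
    return hashlib.sha1(data).hexdigest()

small = object_name(b"a")
large = object_name(b"a" * 1_000_000)

# Each hex character carries 4 bits, so 40 characters is 160 bits.
print(len(small) * 4, len(large) * 4)
```

Two objects with the same name are, with overwhelming probability, the same object, which is why the digest works as a shorthand.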
Password authentication
What’s the weakest link in password systems? People!
People choose short passwords
  (brute force attacks take less time)
People choose easy-to-remember passwords
  (narrows the search space: dictionary attack)
People choose the same password for everything
  (break the weakest web site, gain access to the strongest)
How could you use cryptographic hashes to solve this?
  User remembers one password, system sends Hash(pass + site name)
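The Hash(pass + site name) idea can be sketched in a few lines. SHA-256 and the function name site_password are assumptions; the slide does not say which hash to use.

```python
import hashlib

def site_password(master, site):
    """Derive a per-site secret as Hash(pass + site name)."""
    material = (master + site).encode("utf-8")
    return hashlib.sha256(material).hexdigest()

bank = site_password("correct horse", "bank.example")
forum = site_password("correct horse", "forum.example")
```

Breaking into forum.example reveals only forum, and since the hash is one-way, neither the master password nor bank can be computed from it. A real design would likely add a separator between the two inputs so that different (password, site) pairs cannot concatenate to the same string.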
Weak passwords
Consider an 8-character password
  256^8 possible passwords
  2^64 ~ 16 * billion * billion
If you only choose lower-case letters
  26^8 ~ 200 billion
If you choose a word from the dictionary
  320,000 words in Webster’s unabridged
  At 1000 guesses/second, find a password in about 5 minutes
Researchers ran an attack on Michigan passwords
  Recovered thousands; some of the most popular: “beer”, “hockey”
  Similar result at Berkeley CS in the 80s: “gandolph”
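The search-space arithmetic on this slide can be checked directly. The guess rate of 1000 per second is the slide's figure; the variable and function names are illustrative.

```python
GUESSES_PER_SECOND = 1000  # rate quoted on the slide

def seconds_to_exhaust(space_size, rate=GUESSES_PER_SECOND):
    """Worst-case time to try every candidate in the space."""
    return space_size / rate

all_bytes = 256 ** 8    # any 8 bytes
lowercase = 26 ** 8     # lower-case letters only
dictionary = 320_000    # dictionary size quoted on the slide

print(all_bytes == 2 ** 64)                  # True
print(lowercase)                             # 208827064576, about 200 billion
print(seconds_to_exhaust(dictionary) / 60)   # a little over 5 minutes
```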
Physical token authentication
Require something physical
  E.g. a ticket to a dance performance
What if your token is stolen/forged?
  Require a physical token + password
  E.g. your ATM card + PIN
Biometric authentication
Essentially a physical token
  One that should be hard to steal/forge
Examples
  Thumbprint, retina scan, signature
Most have accuracy problems
  False positives (accept invalid person)
  False negatives (reject valid person)
Real-world authentication
How to authenticate to your credit card company?
  Give them your social security number
  They ask over the phone and check
How is this vulnerable?
  Company has my SSN, so they can pretend to be me
  (or someone who breaks into them can pretend to be me)
  Phone line snoop can pretend to be me
Authorization: what can you do?
Fundamental structure
  Access control matrix
  Rows = authentication domains
  Columns = objects
Two approaches to storing data:
  Access control lists and capabilities
        File 1  File 2  File 3
User 1  RW      RW      RW
User 2  --      R       RW
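The matrix above, and the two ways of slicing it, can be sketched as a nested dictionary. The function names acl_for and capabilities_for are hypothetical.

```python
# Rows are users (domains), columns are files (objects), as in the table.
matrix = {
    "user1": {"file1": "rw", "file2": "rw", "file3": "rw"},
    "user2": {"file1": "", "file2": "r", "file3": "rw"},
}

def acl_for(obj):
    """Column view: who may access this object, and how (an ACL)."""
    return {user: row[obj] for user, row in matrix.items() if row.get(obj)}

def capabilities_for(user):
    """Row view: which objects this user may access, and how."""
    return {obj: rights for obj, rights in matrix[user].items() if rights}

print(acl_for("file1"))
print(capabilities_for("user2"))
```

An ACL system stores each column at its object; a capability system stores each row with its user.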
Access control lists
Standard for file systems
At each object
  Store who can access the object
  Store how the object can be accessed
On each access
  Check that the user has proper permissions
Use groups to make things more convenient
  E.g. forbes, chase, and lpcox are all in group “prof”
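A minimal sketch of the per-access check with groups. The group membership comes from the slide; the ACL contents, the user "student1", and the function name is_allowed are made up for illustration.

```python
GROUPS = {"prof": {"forbes", "chase", "lpcox"}}  # group from the slide

# Hypothetical ACL stored at one object: principal -> allowed operations.
ACL = {"prof": {"read", "write"}, "student1": {"read"}}

def is_allowed(user, operation, acl=ACL, groups=GROUPS):
    """Check the user's own entry first, then any group the user belongs to."""
    if operation in acl.get(user, set()):
        return True
    return any(
        user in members and operation in acl.get(group, set())
        for group, members in groups.items()
    )
```

With groups, adding a new professor means editing one membership set rather than the ACL of every object.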
Access control lists
How can you defeat ACL authorization?
  Villain convinces the OS it is someone else
Example
  Sendmail runs as root (full privileges)
  Attacker compromises sendmail process
  Can now run arbitrary code under root ID
  This is what you will be doing in Project 3
What if you get the ACL wrong?
Frighteningly common
“Usability and privacy: a study of Kazaa…”
  Good and Krekelberg, SigCHI 2003
  In 12 hours, found 150 inboxes on Kazaa
  Observed people downloading them
Many other examples: web, NFS, etc
People don’t understand software interactions
  Affects even diligent users …
Data breach
[Figure: Alice shares data with Bob; Snoop gets it from Bob]
Despite her best efforts, Alice’s data is only as secure as her least competent confidant.
What to do?
Two goals
  1. Identify sensitive files
  2. Infer who is allowed to view them
Many more constraints
  1. Don’t impact performance
  2. Don’t bother the user
  3. Remain backwards-compatible
Identifying sensitive files
Approach: identify the common case I download sensitive docs from many sources
mail.cs.duke.edu
What do the examples have in common? Encryption!
Insight: data is often encrypted for a reason
  In our examples, servers allow clients to cache sensitive files
  To protect those files in the network, communication is encrypted
How can we tell if network communication is encrypted?
  Use the port (e.g., 443 is used for HTTPS)
  Ciphertext exhibits high “entropy” or randomness
  Entropy = information density, measured in bits/byte
Can inexpensively measure the entropy of every socket
  If data stream exhibits high entropy, see if “resulting data” hits FS
  If so, tag the file as sensitive
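The entropy measurement described above can be sketched with the Shannon entropy of the byte histogram. The 7.5 bits/byte threshold and the function names are assumptions chosen for illustration, not values from the slides.

```python
import math
from collections import Counter

def entropy_bits_per_byte(data):
    """Shannon entropy of the byte distribution, between 0 and 8 bits/byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_encrypted(data, threshold=7.5):
    """Assumed rule of thumb: near-maximal entropy suggests ciphertext."""
    return entropy_bits_per_byte(data) >= threshold
```

English text typically scores well below the threshold, while ciphertext sits close to 8 bits/byte. Compressed data also scores high, which is the confusion the next slide addresses.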
Download entropy scores
How can we distinguish compression from encryption?
  If compressed format, use data-similarity
  If actual compression, measure relative data volumes
Capabilities
At each user
  Store a list of objects they may access
  Store how those objects may be accessed
Example
  At user2, store ‹file2, r: file3, rw›
On each access
  User presents capability
  Capability proves they are authorized
Capabilities
Possession of capability provides authorization
  Like car keys
  If you have the door key, you can get in the car
  If you have the ignition key, you can drive it
How can you defeat a capability system?
  Forge an authorized user’s capabilities
  Like making a copy of someone’s car keys
Capabilities
Solution to forged capabilities?
  “Self-authenticating” capabilities
  E.g. include {user, filename} protected under the system’s key
Real-world example: Mastercard number, expiration date
  Have these, you can charge to account
What makes these hard to forge?
  Random numbers in a sparse space
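One way to make a capability self-authenticating is to attach a keyed tag that only the system can compute. The sketch below uses HMAC-SHA256 for this, which is a substitution: the slide says only that {user, filename} is protected under the system key. The rights field and the function names are also assumptions.

```python
import hashlib
import hmac
import os

SYSTEM_KEY = os.urandom(32)  # known only to the system

def _tag(user, filename, rights):
    message = f"{user}|{filename}|{rights}".encode("utf-8")
    return hmac.new(SYSTEM_KEY, message, hashlib.sha256).hexdigest()

def issue_capability(user, filename, rights):
    """The system hands out (user, filename, rights, tag)."""
    return (user, filename, rights, _tag(user, filename, rights))

def verify_capability(capability):
    """Recompute the tag; without SYSTEM_KEY a forger cannot produce it."""
    user, filename, rights, tag = capability
    return hmac.compare_digest(_tag(user, filename, rights), tag)
```

A user who edits the rights field, or invents a capability for another file, fails verification because the tag no longer matches.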
Revocation
How to revoke permissions with an ACL?
  Just delete/change the ACL entry
How to revoke permissions with capabilities?
  Not nearly as easy
  Could change the car door’s lock
  This revokes access for everyone with the key
Dealing with security
Principle of least privilege
  Users get minimum privilege to do their work
Defense in depth
  Create multiple levels
  Lower-levels monitor higher ones
Minimize trusted computing base
  Every system has a trusted part (e.g. the thing that checks the ACL)
  This part has special privileges
  It cannot be subject to the security mechanism
Reducing the trusted base
Why reduce the size of the trusted base?
  Small and simple = easier to reason about
  Hopefully fewer bugs and security loopholes
Most systems
  Kernel is trusted
  Special user (root, Administrator) is trusted
Emerging systems
  Virtual machines: trust the hypervisor
  Trusted Platform Modules: hardware attests to software
Common attacks
Hidden channels
Buffer overflow
Trojan horse
Hidden channels
Get system to communicate in unintended ways
Example: Tenex (supposedly secure OS)
  Created a team to break in
  Team had all passwords within 48 hours … oops.
Goal: require 256^8 tries to see if password is right
Password checker
  for (i=0; i<8; i++) {
    if (input[i] != password[i]) {
      break;
    }
  }
Hidden channels: tenex
How to break? (hint: vm faults are visible)
  Specially arrange the input’s layout in memory
  Force a page fault if second character is read
  If you get a fault, the first character was right
  Do again for third, fourth, … eighth character
Can check the password in 256*8 tries
Password checker
  for (i=0; i<8; i++) {
    if (input[i] != password[i]) {
      break;
    }
  }
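The attack can be simulated without real page faults. In this sketch the checker reports how many characters it read, which stands in for the attacker observing a fault on the next page. The reduced lower-case alphabet (instead of all 256 byte values) and the function names are assumptions made to keep the demo small.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: 26 symbols instead of 256

def check(attempt, password):
    """Early-exit checker from the slide; also reports how many chars it read."""
    reads = 0
    for a, p in zip(attempt, password):
        reads += 1
        if a != p:
            return False, reads
    return True, reads

def recover(password):
    """Recover the password one position at a time via the read count."""
    known, tries = "", 0
    for position in range(len(password)):
        for guess in ALPHABET:
            attempt = (known + guess).ljust(len(password), "?")
            tries += 1
            ok, reads = check(attempt, password)
            # Reading past `position` models the page fault: the guess matched.
            if ok or reads > position + 1:
                known += guess
                break
    return known, tries
```

Each position costs at most one pass over the alphabet, so the total is bounded by alphabet size times length (256*8 on the slide) rather than alphabet size to the power of the length.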
Buffer overflow
Suppose you have an on-stack buffer
  You read input into the buffer
  You don’t check the length of the input
A villain’s long input can corrupt the stack
  If they’re smart, they can corrupt the return address
  Allows villain to jump to their own code
First Internet worm did this
  Creator was a Cornell grad student, Robert Morris
  Son of a famous computer security guru
  He is now a professor at MIT
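Python cannot overflow a real stack, so the sketch below only models one: a toy frame with an 8-byte buffer followed by an 8-byte return address. The layout, sizes, and function names are all assumptions for illustration; real frames depend on the compiler and architecture.

```python
BUFFER_SIZE = 8

def make_frame(return_address):
    """Toy frame: 8-byte buffer, then an 8-byte little-endian return address."""
    return bytearray(BUFFER_SIZE) + bytearray(return_address.to_bytes(8, "little"))

def unchecked_copy(frame, data):
    """Models an unchecked read: writes len(data) bytes, ignoring BUFFER_SIZE."""
    frame[0:len(data)] = data

def checked_copy(frame, data):
    """The fix: refuse input that does not fit in the buffer."""
    if len(data) > BUFFER_SIZE:
        raise ValueError("input longer than buffer")
    frame[0:len(data)] = data

def saved_return_address(frame):
    return int.from_bytes(frame[BUFFER_SIZE:BUFFER_SIZE + 8], "little")

frame = make_frame(0x401000)
payload = b"A" * BUFFER_SIZE + (0xDEADBEEF).to_bytes(8, "little")
unchecked_copy(frame, payload)
print(hex(saved_return_address(frame)))  # now the attacker's chosen address
```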
Trojan horse
Basic idea
  Give somebody something useful
  Make it do something evil
Examples
  Replace login with something that emails passwords
  Send a Word document with a malicious macro
  Download Kazaa and install spyware
  Install a Sony CD and have it install a buggy DRM root-kit
Why security is so hard
Ken Thompson’s self-replicating code
Goal: put a backdoor into login
  Allow “ken” to login as root w/o password
1. Make it possible (easy)
2. Hide it (tricky)
Login backdoor
Step 1: modify login.c
  (code A) if (name == “ken”) login as root
  This is obvious so how do we hide it?
Step 2: modify C compiler
  (code B) if (compiling login.c) compile A into binary
  Removes code A from login.c, keeps backdoor
  This is now obvious in the compiler, how do we hide it?
Step 3: distribute a buggy C compiler binary
  (code C) if (compiling C compiler) compile code B into binary
  No trace of attack in any source code (only in C compiler’s binary)
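The three steps can be modeled with a toy "compiler" that maps source strings to "binaries". Everything here is a stand-in: sources are plain strings, a compiled compiler is represented by a Python function, and all names are hypothetical. It is a sketch of the idea, not of Thompson's actual code.

```python
LOGIN_SRC = "login: ask for name and password, then check them"
COMPILER_SRC = "compiler: translate any source faithfully"
BACKDOOR = " [also: let 'ken' in as root]"

def honest_compiler(source):
    """Toy compiler: the 'binary' is the source wrapped in a tag."""
    if source == COMPILER_SRC:
        return honest_compiler  # compiling the compiler yields a compiler
    return "binary<" + source + ">"

def trojaned_compiler(source):
    """The distributed binary from step 3; no source file contains this logic."""
    if source == LOGIN_SRC:
        # code B: compile the backdoor (code A) into login
        return "binary<" + source + BACKDOOR + ">"
    if source == COMPILER_SRC:
        # code C: clean compiler source still yields a trojaned compiler
        return trojaned_compiler
    return "binary<" + source + ">"

rebuilt = trojaned_compiler(COMPILER_SRC)  # rebuild compiler from clean source
login_binary = rebuilt(LOGIN_SRC)          # backdoor survives the rebuild
```

Both source strings are clean, yet the login binary carries the backdoor, because the only place the attack lives is the compiler binary that everyone uses to rebuild the compiler.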
Upshot: everything is bootstrapped, everything trusts something
Conficker
Conficker and DNS
Problem: how to update your malware?
Why not just host at a web site, have malware check?
  Easy to catch
  Monitor malware, see which web sites they connect to
  Follow the paper trail from that site
Instead, each malware client computes random DNS names
  The name of each generated domain is 4 to 10 characters long, to which a randomly selected TLD is appended from the following list of 116 suffixes (mapping to 110 TLDs):
[ "ac" , "ae" , "ag" , "am" , "as" , "at" , "be" , "bo" , "bz" , "ca" , "cd" , "ch" , "cl" , "cn" , "co.cr" , "co.id" , "co.il" , "co.ke" , "co.kr" , "co.nz" , "co.ug" , "co.uk" , "co.vi" , "co.za" , "com.ag" , "com.ai" , "com.ar" , "com.bo" , "com.br" , "com.bs" , "com.co" , "com.do" , "com.fj" , "com.gh" , "com.gl" , "com.gt" , "com.hn" , "com.jm" , "com.ki" , "com.lc" , "com.mt" , "com.mx" , "com.ng" , "com.ni" , "com.pa" , "com.pe" , "com.pr" , "com.pt" , "com.py" , "com.sv" , "com.tr" , "com.tt" , "com.tw" , "com.ua" , "com.uy" , "com.ve" , "cx" , "cz" , "dj" , "dk" , "dm" , "ec" , "es" , "fm" , "fr" , "gd" , "gr" , "gs" , "gy" , "hk" , "hn" , "ht" , "hu" , "ie" , "im" , "in" , "ir" , "is" , "kn" , "kz" , "la" , "lc" , "li" , "lu" , "lv" , "ly" , "md" , "me" , "mn" , "ms" , "mu" , "mw" , "my" , "nf" , "nl" , "no" , "pe" , "pk" , "pl" , "ps" , "ro" , "ru" , "sc" , "sg" , "sh" , "sk" , "su" , "tc" , "tj" , "tl" , "tn" , "to" , "tw" , "us" , "vc" , "vn" ]
Conficker and DNS
Checks each generated site to see if it can download from it, until one hits
Why is it doing this?
What aspects of DNS does this take advantage of?
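The name-generation scheme described above (a 4 to 10 character label plus a suffix from the list) can be sketched as follows. This is not Conficker's actual algorithm: the date-derived seed, the use of Python's random module, the shortened suffix list, and the function name are all assumptions made for illustration.

```python
import datetime
import random
import string

# Short excerpt of the suffix list from the slide.
TLDS = ["ac", "ae", "ag", "com.ar", "co.uk", "us", "vn"]

def candidate_domains(day, count=5):
    """Generate the same pseudo-random names on every client for a given day."""
    rng = random.Random(day.toordinal())  # assumption: seed derived from the date
    names = []
    for _ in range(count):
        length = rng.randint(4, 10)  # 4 to 10 characters, inclusive
        label = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        names.append(label + "." + rng.choice(TLDS))
    return names

print(candidate_domains(datetime.date(2009, 4, 1)))
```

Because every client derives the same list from a shared seed, the malware author only needs to register one of the names in advance, while defenders would have to watch or block all of them.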
Coming up
More security (project 3)
Storage and files
Google