Exploiting Unicode-enabled Software

CanSecWest, March 2009

Chris Weber, Casaba Security
www.lookout.net
chris@casabasecurity.com
www.casabasecurity.com

March 2009 © 2009 Chris Weber

PETA Certified Presentation

• People for the Ethical Treatment of ASCII
  - "No ASCII characters were harmed in the making of this presentation"

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
  - Find Unicode issues in Web-testing
  - Visual Spoofing Detection API

Unicode Crash Course

• 1991: Unicode
• 1990: ISO 10646 (UCS)
• 1985: ISO-8859-1
  - More code pages galore
• 1981: MBCS
  - GB2312
• 1981: CP437
• 1964: EBCDIC
• 1963: ASCII 7-bit
  - 8th bit free-for-all to follow

Unicode Crash Course: Code pages and charsets

• Shift_JIS
• GB2312
• ISCII
• Windows-1252
• ISO-8859-1
• EBCDIC 037

Unicode Crash Course: Ad Infinitum

• Unicode can represent them all
• ASCII range is preserved
  - U+0000 to U+007F are mapped to ASCII

[Figure. Source: Wikipedia]

Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Glyph:       A
Encoding:    UTF-8, UTF-16, UTF-32
Properties:  Hex, Uppercase, etc.
Code point:  U+0041
Block:       Basic Latin
Script:      Latin
Plane:       Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
  - U+0000 to U+10FFFF

Unicode Crash Course: Code Points

A = U+0041

Every character has a unique number, represented by a hex value

  - A is U+0041
  - ſ is U+017F

Unicode Crash Course: Code points

• The full 21-bit range is not actually available
  - U+0000 to U+D7FF and
  - U+E000 to U+10FFFF
  - What's up with U+D800..U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
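The range U+D800..U+DFFF is reserved so that UTF-16 can address code points above U+FFFF with two 16-bit units. The following is a minimal sketch of that arithmetic for the U+101D1 example; the function name `to_surrogates` is illustrative and not taken from the slides.

```python
def to_surrogates(code_point):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000           # 20 significant bits remain
    high = 0xD800 + (offset >> 10)          # top 10 bits  -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)         # low 10 bits  -> low surrogate
    return high, low


high, low = to_surrogates(0x101D1)
print(f"U+101D1 -> {high:04X} {low:04X}")
```

The result can be cross-checked against the interpreter's own encoder with `chr(0x101D1).encode("utf-16-be")`.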

Unicode Crash Course: Encodings

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
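To make the width differences concrete, here is a small sketch that measures how many bytes one character occupies in each encoding form. The helper name `widths` is an assumption for illustration; the big-endian codec variants are used so that no byte order mark is added to the count.

```python
def widths(ch):
    """Return the encoded size in bytes of one character per encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


for sample in ("A", "\u017f", chr(0x101D1)):
    print(f"U+{ord(sample):04X}", widths(sample))

# Endianness only changes byte order, not the code point:
print("A".encode("utf-16-be"), "A".encode("utf-16-le"))
```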


Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A U+10001 U+10300

Root Causes: IDN - Internationalized Domain Names

http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN - what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
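The relationship between the displayed name and its Punycode form can be explored with Python's built-in `idna` codec, which implements IDNA 2003 (Nameprep plus Punycode). This is only a sketch of the conversion step; the helper names `to_ascii` and `to_unicode` are illustrative, and no browser display policy is modeled here.

```python
def to_ascii(hostname):
    """Unicode host name -> ASCII-compatible (xn--) form, per IDNA 2003."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """ASCII-compatible (xn--) form -> Unicode host name."""
    return hostname.encode("ascii").decode("idna")


ace = "xn--hxajbheg2az3al.xn--jxalpdlp"   # the Punycode form shown on the slide
display = to_unicode(ace)                 # what a browser may choose to display
print(display)
print(to_ascii(display))                  # what DNS actually sees
```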

Root Causes: IDN - so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

[Screenshot: Safari]

[Screenshot: Opera]

www.google.com is not www.gooɡle.com
  - Latin 'g' is U+0067
  - Latin 'ɡ' is U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  - Inclusion-based: deny-all default seems to be the right concept
  - Script limitations: a script can cross many blocks; even with limited script choices there's plenty to choose from
  - Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
  - Latin 'm' is U+006D
  - Latin 'rn' is U+0072 U+006E

Are you using mono-width fonts?
  - 0 and O
  - 1 and l
  - 5 and S

Classic long URLs
  - httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
  - User thinks 'a'
  - Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
  - User thinks 'k'
  - Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
  - User thinks 'o'
  - Really it's Thai digit zero '๐'
www.faϲebook.com
  - User thinks 'c'
  - Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
  - Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
  - User thinks 'abc'
  - Really it's Cyrillic script
www.ігѕ.gov
  - User thinks 'irs'
  - Really it's Cyrillic script
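A rough mixed-script check can be sketched with the standard library alone. Python's `unicodedata` module does not expose the Script property, so this sketch approximates it with the first word of each character's Unicode name; that shortcut, and the names `rough_scripts` and `is_mixed_script`, are assumptions for illustration only. As the last two checks show, single-script and whole-script confusables pass such a test, which is why a confusables table is still needed.

```python
import unicodedata


def rough_scripts(label):
    """Approximate the set of scripts in a label (name prefix as a stand-in)."""
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # skip ASCII digits, hyphens and dots
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label):
    return len(rough_scripts(label)) > 1


print(is_mixed_script("fa\u03f2ebook"))        # Greek lunate sigma among Latin
print(is_mixed_script("g\u0e50\u0e50gle"))     # Thai digit zero among Latin
print(is_mixed_script("\u0430\u042c\u0441"))   # whole-script Cyrillic: not mixed
print(is_mixed_script("goo\u0261le"))          # Latin script g: not mixed
```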

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but…

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  - í looks like i

www.mozilla.org is not www.mozílla.org
  - Latin 'i' is U+0069
  - Latin 'í' is U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org
  - FULLWIDTH SOLIDUS U+FF0F
  - (This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
  - SOLIDUS U+002F
  - (Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with lookalikes

  - ╱ U+2571 Box Drawings
  - 〳 U+3033 Kana Repeat Mark
  - Ꜹ U+A738 LATIN CAPITAL AV
  - ꜹ U+A739 LATIN SMALL AV
  - ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www･google･com
  - Katakana Dot U+FF65
  - (However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comﾉpathﾉfile.nottrusted.org
  - Katakana No U+FF89
  - (However, punctuation not required…)

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

  - U+200B (ZERO WIDTH SPACE)
  - U+180E (MONGOLIAN VOWEL SEPARATOR)
  - U+FEFF (ZERO WIDTH NO-BREAK SPACE)
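Detecting the three invisible characters named above is straightforward once they are listed explicitly. The sketch below uses a fixed table rather than Unicode general categories, because the category of U+180E has changed between Unicode versions; the function name `find_invisibles` is illustrative, and the table is deliberately limited to the slide's three examples.

```python
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text):
    """Return (index, 'U+XXXX') for every listed invisible character."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLES]


print(find_invisibles("goo\u200bgle"))   # the two labels look identical on screen
print(find_invisibles("google"))
```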

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
  - http://www.google.com phreedom.org

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  - ō looks like <U+006F U+0304>
  - but it is really <U+006F U+0304 U+0304>
• Order of combining marks
  - ȏ and ö under NFKC
  - <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  - <U+006F U+0311 U+0308> becomes <U+020F U+0308>
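Both combining-mark tricks can be reproduced with `unicodedata`. The sketch below normalizes the two mark orderings and also counts the longest run of combining marks, which is one possible heuristic for spotting stacked duplicates; the helper name `max_mark_run` and the idea of thresholding on it are assumptions, not something the slides prescribe.

```python
import unicodedata


def code_points(text):
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def max_mark_run(text):
    """Length of the longest run of consecutive combining marks."""
    longest = run = 0
    for ch in text:
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest


# Same marks, different order: normalization does not unify them.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print(code_points(first), "|", code_points(second))

# A doubled macron may render like a single one.
print(max_mark_run("o\u0304"), max_mark_run("o\u0304\u0304"))
```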

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible, because of security concerns"
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
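The two override characters can be recognized by their bidirectional class, which `unicodedata.bidirectional` reports as `LRO` and `RLO`. This is a minimal sketch; the function name and the sample file name are hypothetical, and other bidi controls (embeddings, isolates) are intentionally left out.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}   # U+202D and U+202E


def has_bidi_override(text):
    """True if the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


# Hypothetical file name: the tail after U+202E is displayed reversed.
suspicious = "photo\u202egnp.exe"
print(has_bidi_override(suspicious))
print(has_bidi_override("photo.png"))
```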

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
  - U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  - U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 LEFT-POINTING ANGLE BRACKET maps to U+003C
• U+3008 LEFT ANGLE BRACKET maps to U+003C

How can we best-fit the s character?
• U+015B LATIN SMALL LETTER S WITH ACUTE maps to U+0073
• U+015D LATIN SMALL LETTER S WITH CIRCUMFLEX maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
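The guidance above names .NET and Win32 settings. As a language-neutral illustration of the same principle, the sketch below converts to a legacy code page in strict mode so that unmappable characters raise an error instead of silently turning into lookalikes. It uses Python's `cp1252` codec, which performs no best-fit substitution; the function name `to_legacy` is an assumption for the example.

```python
def to_legacy(text, codec="cp1252"):
    """Encode to a legacy charset, failing loudly on unmappable characters."""
    return text.encode(codec, errors="strict")


print(to_legacy("caf\u00e9"))            # representable, so it converts

for risky in ("\u03c3", "\u2329", "\u2032"):
    try:
        to_legacy(risky)
    except UnicodeEncodeError as err:
        # No silent mapping to 's', '<' or an apostrophe happens here.
        print(f"U+{ord(risky):04X} rejected: {err.reason}")
```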

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

-moz-binding() was not allowed, but…

-[U+ff4d]oz-binding() would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

  - NFD: Decompose (canonical)
  - NFC: Decompose (canonical), Recompose
  - NFKD: Decompose (compatibility)
  - NFKC: Decompose (compatibility), Recompose

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)
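The four normalization forms can be compared side by side with `unicodedata.normalize`. The sketch below prints the code points each form produces for U+0130 and for U+FE64 (SMALL LESS-THAN SIGN), a character that only the compatibility forms rewrite; the helper name `forms` is illustrative.

```python
import unicodedata


def forms(text):
    """Code points of `text` under each of the four normalization forms."""
    return {
        form: [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


print(forms("\u0130"))   # canonical decomposition: only NFD/NFKD split it
print(forms("\ufe64"))   # compatibility decomposition: only NFKC/NFKD map it
```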

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
  - U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC first defense against Visual spoofing
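The ordering problem can be shown with a deliberately simplistic filter. In this sketch, `is_safe` is a toy stand-in for real validation (it only rejects ASCII angle brackets), and both pipeline function names are hypothetical. The payload uses the small-form brackets U+FE64 and U+FE65, which NFKC maps to `<` and `>`.

```python
import unicodedata


def is_safe(text):
    """Toy validator: rejects ASCII angle brackets only."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text):
    """Unsafe order: the filter runs before the text takes its final form."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text):
    """Safer order: the filter sees exactly what downstream code will see."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"
print(validate_then_normalize(payload))   # the tag slips through the filter
try:
    normalize_then_validate(payload)
except ValueError:
    print("rejected after normalization")
```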

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

  - Application gets C0 A7
  - OS/Framework sees 27
  - Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
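The C0 A7 case can be reproduced by hand. The sketch below contrasts a naive two-byte decoder that only masks and shifts bits (a deliberately flawed illustration, not any real library's code) with Python's strict UTF-8 decoder, which rejects the overlong sequence as the guidance above recommends.

```python
def naive_decode_2byte(data):
    """Flawed on purpose: applies the 2-byte bit layout without range checks."""
    lead, trail = data[0], data[1]
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = b"\xc0\xa7"
print(repr(naive_decode_2byte(overlong)))   # an apostrophe appears

try:
    overlong.decode("utf-8")                # strict by default
except UnicodeDecodeError as err:
    print("strict decoder refused:", err.reason)
```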

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized under NFKC or NFKD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized under NFKC or NFKD: httpsiterootE280A5 E280A5 system
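Which of these lookalikes turn into path syntax depends on the transformation applied. The sketch below checks only the NFKC route using `unicodedata`; the function name `gains_path_syntax` and the sample path are assumptions for illustration. Note that DIVISION SLASH has no compatibility decomposition, so NFKC leaves it alone and it becomes dangerous only through a best-fit conversion, as the slide indicates.

```python
import unicodedata


def gains_path_syntax(text):
    """True if NFKC introduces '/' or '..' that the raw text did not contain."""
    normalized = unicodedata.normalize("NFKC", text)
    return any(tok in normalized and tok not in text for tok in ("/", ".."))


samples = {
    "FULLWIDTH SOLIDUS": "root\uff0fsystem",
    "TWO DOT LEADER": "root\u2025system",
    "DIVISION SLASH": "root\u2215system",
}
for name, path in samples.items():
    print(name, gains_path_syntax(path))
```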

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src="[0xC2]"> onerror=alert(1)<br />

becomes

<img src="> onerror=alert(1)<br />
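The <41 C2 3E 41> example can be reproduced with a small decoder that has the over-consumption flaw built in. The function below is a deliberately broken sketch written for this illustration (it handles only ASCII and two-byte lead bytes); it is compared with Python's own decoder, which replaces just the offending lead byte and keeps the byte that follows it.

```python
def over_consuming_decode(data):
    """Flawed on purpose: a 2-byte lead always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF:
            chunk = data[i:i + 2]
            try:
                out.append(chunk.decode("utf-8"))
            except UnicodeDecodeError:
                pass              # BUG: the innocent trail byte is dropped too
            i += 2
        else:
            i += 1                # other lead bytes are out of scope here
    return "".join(out)


raw = bytes([0x41, 0xC2, 0x3E, 0x41])
print(repr(over_consuming_decode(raw)))        # the '>' has vanished
print(repr(raw.decode("utf-8", "replace")))    # '>' survives, C2 is replaced
```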

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '?', which can be safe
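The difference between deleting and replacing an unexpected character fits in a few lines. Both function names below are illustrative; the point of the sketch is that deletion can join two harmless fragments into a forbidden token, while substituting U+FFFD keeps them apart.

```python
def delete_bom(text):
    """Unsafe: silently removes U+FEFF, which can splice fragments together."""
    return text.replace("\ufeff", "")


def replace_bom(text):
    """Safer: keeps a visible placeholder where the character used to be."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"
print(delete_bom(payload))                  # now reads as a script tag
print("script" in replace_bom(payload))     # the token never forms
```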

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
  - BOM is U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))
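Case mapping results differ between platforms, so the exact output of `toLower` above depends on the framework and locale. As one data point, the sketch below shows what Python's locale-independent full case mapping does: lowercasing U+0130 yields two code points, and uppercasing U+0390 yields three, so neither length nor content can be assumed stable across a casing call. The helper name `show` is illustrative.

```python
def show(label, text):
    print(label, len(text), [f"U+{ord(ch):04X}" for ch in text])


dotted_i = "\u0130"                  # LATIN CAPITAL LETTER I WITH DOT ABOVE
show("before lower:", dotted_i)
show("after lower: ", dotted_i.lower())

iota = "\u0390"                      # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
show("before upper:", iota)
show("after upper: ", iota.upper())

# A filter that compares lengths or raw strings before casing can be misled:
print(len(dotted_i) != len(dotted_i.lower()))
```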

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
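Several of the expansion factors quoted from Unicode Technical Report 36 can be checked directly. The sketch below measures growth in both code points and UTF-8 bytes for a few of the sample characters; the helper name `growth` is an assumption for illustration, and the figures are whatever the interpreter's Unicode tables produce.

```python
import unicodedata


def growth(text, transform):
    """Return (code point factor, UTF-8 byte factor) for a transformation."""
    result = transform(text)
    return (
        len(result) / len(text),
        len(result.encode("utf-8")) / len(text.encode("utf-8")),
    )


def nfkc(text):
    return unicodedata.normalize("NFKC", text)


def nfd(text):
    return unicodedata.normalize("NFD", text)


print("lower U+023A:", growth("\u023a", str.lower))
print("upper U+0390:", growth("\u0390", str.upper))
print("NFD   U+1F82:", growth("\u1f82", nfd))
print("NFKC  U+FDFA:", growth("\ufdfa", nfkc))
```

A buffer sized from the input length alone is too small in each of these cases, which is the point of the bytes-versus-chars guidance.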

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
  - MVS is U+180E

DEMO: Opera White Space Characters

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
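A mismatch between the declared and the assumed charset changes what the same bytes mean. The sketch below decodes one two-byte sequence both ways: under ISO-8859-1 the second byte is a backslash, while under Shift_JIS it is absorbed as the trail byte of a Katakana character. The byte pair is chosen for illustration and is not taken from the slides; it shows how an escaping backslash inserted by a filter in one charset can disappear in another.

```python
data = bytes([0x83, 0x5C])

as_latin1 = data.decode("iso-8859-1")   # what a filter assuming Latin-1 sees
as_sjis = data.decode("shift_jis")      # what a Shift_JIS user agent sees

print([f"U+{ord(ch):04X}" for ch in as_latin1])
print([f"U+{ord(ch):04X}" for ch in as_sjis])
print("backslash visible to filter:", "\\" in as_latin1)
print("backslash visible to browser:", "\\" in as_sjis)
```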


Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security

Page 2: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

CanSecWestMarch 2009

Chris Weberwwwlookoutnet

chriscasabasecuritycomCasaba Security

Exploiting Unicode-enabled Software

March 2009 copy 2009 Chris Weber

bull People for the Ethical Treatment of ASCII

ndash ldquoNo ASCII characters were harmed in the making of this presentationrdquo

wwwcasabasecuritycom

PETA Certified Presentation

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

ndash Find Unicode issues in Web-testing

ndash Visual Spoofing Detection API

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

Unicode Crash Course

199119901985

1981

198119641963

bull Unicode

bull ISO 10646 (UCS)

bull ISO-8859-1

bull More code pages galore

bull MBCSbull GB2312

bull CP437

bull EBCDIC

bull ASCII 7-bitbull 8th bit free-for-all to follow

wwwcasabasecuritycom

March 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

March 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Source Wikipedia

March 2009 copy 2009 Chris Weber

bull End users

bull Applications

bull Databases

bull Programming languages

bull Operating Systems

wwwcasabasecuritycom

Unicode Crash CourseThe Unicode Attack Surface

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

March 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Unicode Crash Course

Glyph

Encoding

Properties

Code point

Block Script

Plane

A

UTF-8 UTF-16 UTF-32

Hex Uppercase etc

U+0041

Basic Latin Latin

Basic Multilingual Plane(BMP)

wwwcasabasecuritycom

March 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number represented by a hex value

wwwcasabasecuritycom

Unicode Crash CourseCode Points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

March 2009 copy 2009 Chris Weber

bull The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFF

whatrsquos up with U+D800U+DFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

Root Causes: IDN – so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

• Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Attack Vectors: IDN homograph attacks

• Safari
• Opera

www.google.com is not www.gooɡle.com
  – Latin U+0067 (g) versus Latin U+0261 (ɡ)

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

• ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
• Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
• Registrars sell the domain names

Root Causes: The state of International Domain Names

• ICANN guidelines v2.0
  – Inclusion-based: deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks. Even with limited script choices there's plenty to choose from
  – Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can't control
  – Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

'rn' can look like 'm' in certain fonts

www.mullets.com is not www.rnullets.com
  – Latin U+006D (m) versus Latin U+0072 U+006E (rn)

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?
  – 0 and O
  – 1 and l
  – 5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
  – User thinks 'a'
  – Really it's Latin small letter alpha 'ɑ'

www.looĸout.net
  – User thinks 'k'
  – Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
  – User thinks 'o'
  – Really it's Thai digit zero '๐'

www.faϲebook.com
  – User thinks 'c'
  – Really it's Greek lunate sigma symbol 'ϲ'

www.Ꮐoogle.com
  – Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
  – User thinks 'abc'
  – Really it's Cyrillic script

www.ігѕ.gov
  – User thinks 'irs'
  – Really it's Cyrillic script
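A minimal sketch of mixed-script detection follows. Python's standard library does not expose the Unicode Script property, so this assumes the first word of each character name ("LATIN", "GREEK", "CYRILLIC") is a good enough stand-in; `rough_scripts` is a hypothetical helper, not part of any released API.

```python
import unicodedata


def rough_scripts(label):
    """Approximate the set of scripts in a label from the first word of each letter's name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


samples = (
    "google",                 # plain Latin
    "fa\u03f2ebook",          # Greek lunate sigma among Latin letters
    "\u0430\u042c\u0441",     # whole-script Cyrillic lookalike of "abc"
    "\u0251pple",             # single-script: Latin small letter alpha
)
for label in samples:
    scripts = rough_scripts(label)
    verdict = "mixed-script" if len(scripts) > 1 else "single-script"
    print(ascii(label), sorted(scripts), verdict)
```

Only the mixed-script case is flagged. Single-script and whole-script confusables pass this check, which is why a confusables mapping (for example the Unicode Consortium's confusables data) is still needed on top of it.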

Attack Vectors: IDN homograph attacks

• Browsers whitelist .ORG
• Others don't necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org
  – Latin U+0069 (i) versus Latin U+00ED (í)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org
  – FULLWIDTH SOLIDUS, U+FF0F
  – (This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
  – SOLIDUS, U+002F
  – (Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

  ╱ U+2571 Box Drawings
  〳 U+3033 Kana Repeat Mark
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

(However, punctuation not required…)

http://www.google.com
  – Katakana Dot, U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org
  – Katakana No, U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

• Browser sees and displays a valid IDN
• DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

• U+200B (ZERO WIDTH SPACE)
• U+180E (MONGOLIAN VOWEL SEPARATOR)
• U+FEFF (ZERO WIDTH NO-BREAK SPACE)
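One way to surface the invisibles is to scan for the three code points listed above plus anything in the Unicode "format" category (Cf). The sketch below assumes that policy; `find_invisibles` is an illustrative name, and a real detector would likely use a longer, curated list.

```python
import unicodedata

# The three code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for index, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, "U+%04X" % ord(ch)))
    return hits


print(find_invisibles("www.goo\u200bgle.com"))
print(find_invisibles("www.google.com"))
```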

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial, Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  – ō looks like U+006F U+0304
  – ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
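Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The sketch shows that the order of two marks survives normalization (so two near-identical renderings stay distinct strings) and counts runs of stacked marks; `max_mark_run` is a hypothetical helper and any threshold applied to it would be a policy choice.

```python
import unicodedata


def code_points(text):
    return " ".join("U+%04X" % ord(ch) for ch in text)


def max_mark_run(text):
    """Length of the longest run of consecutive combining marks."""
    longest = run = 0
    for ch in text:
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest


diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")
print(code_points(diaeresis_first))
print(code_points(breve_first))
print(diaeresis_first == breve_first)

print(max_mark_run("o\u0304"), max_mark_run("o\u0304\u0304"))
```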

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible because of security concerns"
  – Forbidden in IDNA
• U+202D (LEFT-TO-RIGHT OVERRIDE)
• U+202E (RIGHT-TO-LEFT OVERRIDE)
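A small sketch of screening for the two explicit overrides, using the Bidi_Class values that `unicodedata.bidirectional` reports. The function name and the file-name sample are illustrative only; whether to reject or merely flag such input is left to the application.

```python
import unicodedata

# Bidi_Class values for U+202D and U+202E.
OVERRIDE_CLASSES = {"LRO", "RLO"}


def has_bidi_override(text):
    """True if the text contains an explicit directional override character."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


print(has_bidi_override("invoice\u202etxt.exe"))  # display order differs from storage order
print(has_bidi_override("invoice.txt"))
```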

Root Causes: Best-fit mappings

• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enable code execution
• When σ becomes s
  – U+03C3 GREEK SMALL LETTER SIGMA
• When ′ becomes '
  – U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
  – U+2329 maps to U+003C (Left-Pointing Angle Bracket)
  – U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
  – U+015B maps to U+0073 (Latin Small Letter S With Acute)
  – U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
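The same "fail rather than substitute" idea can be sketched in Python, whose built-in codecs raise an error instead of best-fitting when asked to encode strictly. This is an analogy to the .NET and Win32 settings above, not a demonstration of them; `to_legacy` and the choice of `cp1252` are assumptions for illustration.

```python
def to_legacy(text, charset="cp1252"):
    """Encode for a legacy code page, failing loudly instead of substituting lookalikes."""
    return text.encode(charset, errors="strict")


for sample in ("-moz-binding", "-\uff4doz-binding", "\u03c3"):
    try:
        print(ascii(sample), "->", to_legacy(sample))
    except UnicodeEncodeError as exc:
        print(ascii(sample), "rejected:", exc.reason)
```

Rejecting the fullwidth m (U+FF4D) at the transcoding boundary means a filter that already approved the string never sees it silently turn into a different one.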

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

-moz-binding() was not allowed, but…

-[U+FF4D]oz-binding() would best-fit map

Root Causes: Normalization

• Normalizing strings after validation is dangerous
• Impact: filter evasion, enable code execution

• NFD – Decompose (canonical)
• NFC – Decompose (canonical), Recompose
• NFKD – Decompose (compatibility)
• NFKC – Decompose (compatibility), Recompose

İ becomes I + combining dot above
  – U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
  – U+FE64 becomes U+003C
  – toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC: first defense against Visual spoofing
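The ordering problem is easy to reproduce. In this sketch `is_safe_naive` stands in for any blacklist-style filter (a deliberately simplistic, hypothetical check), and the only difference between the two functions is whether NFKC runs before the check.

```python
import unicodedata


def is_safe_naive(text):
    """Hypothetical filter: validates the raw string as received."""
    return "<script" not in text.lower()


def is_safe(text):
    """Same check, applied after NFKC normalization."""
    return is_safe_naive(unicodedata.normalize("NFKC", text))


payload = "\ufe64script>"          # SMALL LESS-THAN SIGN followed by "script>"

print(is_safe_naive(payload))      # filter sees nothing suspicious
print(unicodedata.normalize("NFKC", payload))
print(is_safe(payload))            # normalizing first exposes the tag
```

If a later layer normalizes the string after `is_safe_naive` has approved it, the approved value and the value actually used are no longer the same string.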

Root Causes: Non-shortest form UTF-8

• Non-shortest or overlong UTF-8
• Impact: filter evasion, enable code execution
• Application gets C0 A7
• OS/Framework sees 27
• Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
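To make the C0 A7 case concrete, the sketch below contrasts a deliberately sloppy two-byte decoder (hypothetical, written only to show the bug class) with Python's strict UTF-8 decoder, which rejects the overlong form.

```python
def lenient_decode(data):
    """Hypothetical sloppy decoder: accepts any two-byte lead without a shortest-form check."""
    out, i = [], 0
    while i < len(data):
        byte = data[i]
        if 0xC0 <= byte <= 0xDF and i + 1 < len(data):
            out.append(chr(((byte & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)


overlong_apostrophe = b"\xc0\xa7"

print(repr(lenient_decode(overlong_apostrophe)))   # the apostrophe a filter never saw

try:
    overlong_apostrophe.decode("utf-8")            # strict: throw on error
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```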

Attack Vectors: Directory traversal

How many ways can you say ../ ?

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized (KC or KD): httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized (KC or KD): httpsiterootE280A5 E280A5 system
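A sketch of checking path input in a safer order: decode strictly (so overlong bytes fail), normalize with NFKC (so fullwidth solidus and two dot leader collapse to their ASCII forms), and only then look for `..` segments. The function name and sample paths are illustrative. Best-fit cases such as DIVISION SLASH (U+2215) are not changed by NFKC and have to be handled where the charset conversion happens.

```python
import unicodedata


def looks_like_traversal(raw):
    """Decode strictly, normalize, then look for a '..' path segment."""
    text = raw.decode("utf-8")                    # raises on overlong or ill-formed bytes
    text = unicodedata.normalize("NFKC", text)    # U+FF0F -> "/", U+2025 -> ".."
    return ".." in text.split("/")


print(looks_like_traversal("root/../system".encode("utf-8")))
print(looks_like_traversal("root\uff0f..\uff0fsystem".encode("utf-8")))
print(looks_like_traversal("root/\u2025/system".encode("utf-8")))
print(looks_like_traversal(b"root/system"))

try:
    looks_like_traversal(b"root/\xc0\xae\xc0\xae/system")
except UnicodeDecodeError:
    print("overlong bytes rejected before any path check")
```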

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enable code execution
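The three classes above can be told apart with a few lines of Python. The `classify` helper is hypothetical, and the "unassigned" answer depends on the Unicode version bundled with the interpreter, so treat the output as a sketch rather than a stable verdict.

```python
import unicodedata


def classify(ch):
    """Label a character as one of the 'unexpected' classes, or 'ok'."""
    code_point = ord(ch)
    if 0xD800 <= code_point <= 0xDFFF:
        return "lone surrogate"
    if code_point == 0xFEFF:
        return "BOM"
    if unicodedata.category(ch) == "Cn":   # unassigned in this Unicode version
        return "unassigned"
    return "ok"


for sample in ("\u2073", "\ud800", "\ufeff", "A"):
    print("U+%04X" % ord(sample), classify(sample))

# A strict encoder refuses half a surrogate pair outright.
try:
    "\ud800".encode("utf-8")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```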

Root Causes: Handling the Unexpected, Over-consumption

• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >
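The byte sequence from the slide can be run through two decoders to see the difference. `over_consuming_decode` is a hypothetical buggy decoder written to mimic the behavior described; the second call uses Python's own UTF-8 decoder, which replaces only the ill-formed lead byte.

```python
def over_consuming_decode(data):
    """Hypothetical buggy decoder: a two-byte lead always swallows the following byte."""
    out, i = [], 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:
            if i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            # Bug: on an invalid trail byte, both bytes are dropped.
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)


raw = b"\x41\xc2\x3e\x41"                              # 'A', stray lead byte, '>', 'A'

print(repr(over_consuming_decode(raw)))                # the '>' disappears
print(ascii(raw.decode("utf-8", errors="replace")))    # the '>' survives
```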

Root Causes: Handling the Unexpected, Character-substitution

• Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe
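A short sketch of why deletion is the risky choice and substitution with U+FFFD is the safer one. `blocked` is a stand-in for any tag filter, and both cleanup helpers are hypothetical names.

```python
def blocked(text):
    """Stand-in filter that rejects script tags."""
    return "<script" in text.lower()


def delete_bom(text):
    """Risky: silently removes U+FEFF, joining the pieces around it."""
    return text.replace("\ufeff", "")


def substitute_bom(text):
    """Safer: keeps a visible placeholder so the pieces stay apart."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"

print(blocked(payload))                   # filter passes the raw input
print(blocked(delete_bom(payload)))       # deletion rebuilds the tag afterwards
print(blocked(substitute_bom(payload)))   # substitution does not
```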

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

• Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM, U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
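Casing results differ between frameworks, so the sketch below only shows what Python's locale-independent `str.lower` and `str.upper` do. There, lowercasing İ yields i plus a combining dot rather than a bare i, but the length change the slides warn about is still visible. `casing_growth` is an illustrative helper.

```python
def casing_growth(text, operation):
    """Return (code points before, after, UTF-8 bytes before, after) for a casing call."""
    converted = operation(text)
    return (len(text), len(converted),
            len(text.encode("utf-8")), len(converted.encode("utf-8")))


samples = (
    ("\u0130", str.lower),   # LATIN CAPITAL LETTER I WITH DOT ABOVE
    ("\u023a", str.lower),   # LATIN CAPITAL LETTER A WITH STROKE
    ("\u0390", str.upper),   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
)
for sample, operation in samples:
    print("U+%04X" % ord(sample), operation.__name__, casing_growth(sample, operation))
```

Whatever the framework, the safe order is the one on the guidance slide: apply the casing operation first, then validate and size buffers against the converted string.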

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Root Causes: Buffer Overflows

Casing – maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization – maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
             16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
             16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
             16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
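The normalization factors in the table can be recomputed with `unicodedata`. This is a sketch for checking the sample characters, with `expansion` as a made-up helper; it measures encoded size before and after normalization in a given encoding form.

```python
import unicodedata


def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


cases = (
    ("\U0001d160", "NFC", "utf-8"),
    ("\ufb2c", "NFC", "utf-16-le"),
    ("\u0390", "NFD", "utf-8"),
    ("\u1f82", "NFD", "utf-16-le"),
    ("\ufdfa", "NFKC", "utf-8"),
    ("\ufdfa", "NFKC", "utf-16-le"),
)
for ch, form, encoding in cases:
    print("U+%04X" % ord(ch), form, encoding, expansion(ch, form, encoding))
```

A buffer sized for the input and reused for the normalized output can therefore be too small by an order of magnitude in the worst case.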

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
  – MVS, U+180E

DEMO: Opera White Space Characters

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
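One defensive reading of "be careful" is to allow only the white space a spec explicitly defines and flag everything else that a parser might treat as a separator. The sketch assumes HTML 4.01's four white space characters (space, tab, form feed, zero width space) plus CR and LF; the category list and the `unexpected_separators` name are illustrative choices.

```python
import unicodedata

# White space characters defined by HTML 4.01, plus line breaks.
ALLOWED = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}

# Separator, format, and control categories a lenient parser might treat as white space.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(markup):
    """List code points that could act as white space but are not on the allowlist."""
    flagged = []
    for ch in markup:
        if ch not in ALLOWED and unicodedata.category(ch) in SUSPECT_CATEGORIES:
            flagged.append("U+%04X" % ord(ch))
    return flagged


print(unexpected_separators("<a href=\u180eonclick=alert()>"))
print(unexpected_separators('<a href="x" title="y">'))
```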

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – "When designing a markup language or data protocol the use of U+FEFF can be restricted to that of Byte Order Mark. In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
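"Error if uncertain" can be sketched as a check that the HTTP header and the meta tag agree before any decoding happens. The helper names are hypothetical, and Python's codec registry is used only as a convenient way to canonicalize labels; browsers resolve charset labels with their own tables.

```python
import codecs


def canonical(charset):
    """Map a charset label to Python's codec name; raises LookupError for unknown labels."""
    return codecs.lookup(charset.strip()).name


def agreed_charset(header_charset, meta_charset):
    """Return the shared charset, or raise if the two declarations disagree."""
    header, meta = canonical(header_charset), canonical(meta_charset)
    if header != meta:
        raise ValueError("charset mismatch: %s vs %s" % (header, meta))
    return header


print(agreed_charset("UTF-8", "utf8"))

try:
    agreed_charset("ISO-8859-1", "shift_jis")
except ValueError as exc:
    print("refusing to guess:", exc)
```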


Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 3: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull People for the Ethical Treatment of ASCII

ndash ldquoNo ASCII characters were harmed in the making of this presentationrdquo

wwwcasabasecuritycom

PETA Certified Presentation

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

ndash Find Unicode issues in Web-testing

ndash Visual Spoofing Detection API

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

Unicode Crash Course

199119901985

1981

198119641963

bull Unicode

bull ISO 10646 (UCS)

bull ISO-8859-1

bull More code pages galore

bull MBCSbull GB2312

bull CP437

bull EBCDIC

bull ASCII 7-bitbull 8th bit free-for-all to follow

wwwcasabasecuritycom

March 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

March 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Source Wikipedia

March 2009 copy 2009 Chris Weber

bull End users

bull Applications

bull Databases

bull Programming languages

bull Operating Systems

wwwcasabasecuritycom

Unicode Crash CourseThe Unicode Attack Surface

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

March 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Unicode Crash Course

Glyph

Encoding

Properties

Code point

Block Script

Plane

A

UTF-8 UTF-16 UTF-32

Hex Uppercase etc

U+0041

Basic Latin Latin

Basic Multilingual Plane(BMP)

wwwcasabasecuritycom

March 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number represented by a hex value

wwwcasabasecuritycom

Unicode Crash CourseCode Points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

March 2009 copy 2009 Chris Weber

bull The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFF

whatrsquos up with U+D800U+DFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
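A minimal sketch of how such invisible characters might be flagged in a string. The function name `find_invisibles` and the starting set are assumptions made for illustration; this is not the Visual Spoofing Detection API described in the talk.

```python
import unicodedata

# Assumed starting set: the three code points named on the slide.
KNOWN_INVISIBLES = {0x200B, 0x180E, 0xFEFF}

def find_invisibles(text):
    """Return (index, 'U+XXXX') pairs for characters that usually render as nothing."""
    hits = []
    for i, ch in enumerate(text):
        # 'Cf' is the Unicode "format" general category (zero-width controls etc.).
        if ord(ch) in KNOWN_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

print(find_invisibles("goo\u200bgle"))
```

A real detector would need a broader, maintained list; the category check is only a heuristic.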

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial, Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304
ō is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  - ȏ and ö under NFKC

<U+006F, U+0308, U+0311> becomes <U+00F6, U+0311>
<U+006F, U+0311, U+0308> becomes <U+020F, U+0308>
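The ordering effect can be reproduced with Python's standard `unicodedata` module. This is a small sketch; the helper `code_points` is a name introduced here only for display.

```python
import unicodedata

def code_points(s):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

a = "o\u0308\u0311"  # o, combining diaeresis, combining inverted breve
b = "o\u0311\u0308"  # same marks, opposite order

nfkc_a = unicodedata.normalize("NFKC", a)
nfkc_b = unicodedata.normalize("NFKC", b)

print(code_points(nfkc_a))
print(code_points(nfkc_b))
```

Both marks share the same combining class, so normalization does not reorder them, and the two inputs stay distinct strings even though they may render alike.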

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible because of security concerns"
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
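As an illustration, explicit directional controls can be detected by their bidirectional class. The function name `has_bidi_override` is hypothetical, and including the embedding controls (LRE, RLE, PDF) alongside the two overrides is an assumption beyond what the slide lists.

```python
import unicodedata

# Bidi classes of the explicit embedding/override controls.
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF"}

def has_bidi_override(text):
    """True if the text contains an explicit directional embedding or override."""
    return any(unicodedata.bidirectional(ch) in EXPLICIT_BIDI for ch in text)

print(has_bidi_override("invoice\u202Efdp.exe"))
```

Ordinary right-to-left letters are not flagged, since their class is R or AL rather than one of the explicit controls.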


Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit p/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a p/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C, Left-Pointing Angle Bracket
• U+3008 maps to U+003C, Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073, Latin Small Letter S With Acute
• U+015D maps to U+0073, Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
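The same "fail instead of substituting" idea can be sketched in Python, whose built-in codecs raise on unmappable characters rather than best-fitting them. This is an analogy to the Windows and .NET options above, not a call into them; `to_legacy_bytes` and `is_representable` are hypothetical helper names and `cp1252` is an arbitrary example target.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode strictly: raise instead of silently substituting a lookalike."""
    return text.encode(charset, errors="strict")

def is_representable(text, charset="cp1252"):
    """True only if every character maps exactly into the target charset."""
    try:
        to_legacy_bytes(text, charset)
        return True
    except UnicodeEncodeError:
        return False

print(is_representable("-moz-binding"))        # plain ASCII input
print(is_representable("-\uff4doz-binding"))   # fullwidth m, U+FF4D
```

Rejecting the second input outright avoids the situation where a filter sees U+FF4D but a later layer sees a plain m.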

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: Best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map to -moz-binding()

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
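A minimal sketch of the "normalize, then validate" ordering using Python's `unicodedata`. The validator `normalize_then_validate` and its rule (reject angle brackets) are assumptions chosen only to make the ordering visible.

```python
import unicodedata

def normalize_then_validate(untrusted):
    """Normalize to NFKC first, then apply a (hypothetical) markup check."""
    normalized = unicodedata.normalize("NFKC", untrusted)
    if "<" in normalized or ">" in normalized:
        raise ValueError("markup characters present after normalization")
    return normalized

# U+FE64 SMALL LESS-THAN SIGN only becomes '<' once NFKC is applied.
print(unicodedata.normalize("NFKC", "\ufe64script>"))
```

Had the check run on the raw input and normalization happened afterwards, the U+FE64 character would have passed the filter and then turned into a real `<`.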

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
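For illustration, Python's built-in UTF-8 decoder already behaves the way this guidance asks: it rejects the overlong sequence C0 A7 instead of interpreting it as 0x27. The wrapper name `decode_utf8_strict` is introduced here only for the example.

```python
def decode_utf8_strict(raw):
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

print(decode_utf8_strict(b"\x27"))  # the shortest form of U+0027

try:
    decode_utf8_strict(b"\xc0\xa7")  # overlong form of the same code point
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```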

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8 (C0 AE is the overlong form of U+002E; U+002F would be C0 AF): httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
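A short sketch of why the normalization-based test cases matter: under NFKC, U+2025 becomes two full stops and U+FF0F becomes a solidus, so a traversal check has to run on the normalized form. The helper names `canonical_path` and `looks_like_traversal` are hypothetical.

```python
import unicodedata

def canonical_path(path):
    """Apply NFKC so compatibility lookalikes collapse to their ASCII forms."""
    return unicodedata.normalize("NFKC", path)

def looks_like_traversal(path):
    """Check for a '..' segment after normalization, not before."""
    return ".." in canonical_path(path).split("/")

print(canonical_path("\u2025\uff0f"))
print(looks_like_traversal("root/\u2025\uff0fsystem"))
```

U+2215 DIVISION SLASH has no compatibility decomposition, so NFKC leaves it alone; that case depends on a best-fit mapping elsewhere in the stack.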

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '?', which can be safe
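The difference between deleting, replacing and failing can be seen with Python's decoder error handlers. This is a sketch using the byte sequence from the over-consumption slide; the function names are made up for the example.

```python
def decode_with_replacement(raw):
    """Replace each ill-formed sequence with U+FFFD and keep the following byte."""
    return raw.decode("utf-8", errors="replace")

def decode_with_deletion(raw):
    """Silently drop ill-formed bytes: the risky behavior described above."""
    return raw.decode("utf-8", errors="ignore")

sample = bytes([0x41, 0xC2, 0x3E, 0x41])     # A, stray lead byte, '>', A
print(decode_with_replacement(sample))       # the '>' survives
print(decode_with_deletion(b"<scr\xc2ipt>")) # deletion reassembles the tag
```

With replacement the stray 0xC2 becomes U+FFFD and the `>` is not consumed; with deletion a filtered string can collapse back into `<script>`.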

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM, U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
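A small sketch of the length change under case mapping, using Python's built-in `str.lower()` and `str.upper()`. Note that Python lowercases U+0130 to "i" followed by U+0307 rather than to a bare "i"; other runtimes and locales may behave as the slide shows. The helper `length_change` is a name introduced here.

```python
def length_change(text):
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())

print(length_change("\u0130"))  # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(length_change("\u00df"))  # LATIN SMALL LETTER SHARP S
print(length_change("\u0390"))  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
```

Because the result can be longer than the input, buffers sized from the original string and filters applied before the case operation can both be wrong.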

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X

Source: Unicode Technical Report 36
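Several of these factors can be checked directly. The sketch below measures the size ratio in encoded bytes; `expansion` is a helper name introduced for the example.

```python
import unicodedata

def expansion(before, after, encoding):
    """Ratio of encoded sizes after/before for the given encoding."""
    return len(after.encode(encoding)) / len(before.encode(encoding))

sample = "\ufdfa"  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
nfkc = unicodedata.normalize("NFKC", sample)

print(expansion(sample, nfkc, "utf-8"))      # NFKC, UTF-8
print(expansion(sample, nfkc, "utf-16-le"))  # NFKC, UTF-16
print(expansion("\u0390", "\u0390".upper(), "utf-8"))  # Upper, UTF-8
print(expansion("\u023a", "\u023a".lower(), "utf-8"))  # Lower, UTF-8
```

The practical point is that an output buffer sized to the input length, in bytes or in code units, can be too small by an order of magnitude.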

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera

MVS, U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2

2) Designs
  - Specs are carefully designed, but not always perfect

• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
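To see why a mismatch between the declared charsets matters, the sketch below decodes the same two bytes under both labels from the example. The byte pair is chosen here for illustration and is not taken from the talk.

```python
raw = b"\x83\x5c"

as_sjis = raw.decode("shift_jis")     # one character: KATAKANA LETTER SO
as_latin1 = raw.decode("iso-8859-1")  # two characters, the second is a backslash

print(len(as_sjis), len(as_latin1))
print("\\" in as_sjis, "\\" in as_latin1)
```

A filter that reads the bytes as ISO-8859-1 sees a backslash it may escape or rely on, while a consumer that reads them as Shift_JIS sees no backslash at all, so the two layers disagree about where syntax characters are.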


Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Glyph:        A
Encoding:     UTF-8, UTF-16, UTF-32
Properties:   Hex, Uppercase, etc.
Code point:   U+0041
Block:        Basic Latin
Script:       Latin
Plane:        Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Unicode Crash Course: Code Points

A = U+0041

Every character has a unique number, represented by a hex value

Unicode Crash Course

A, U+0041

Unicode Crash Course

ſ, U+017F

Unicode Crash Course: Code points

• The full 21-bit range is not actually available

U+0000 to U+D7FF and
U+E000 to U+10FFFF

What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1

Unicode Crash Course: Encodings

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs

UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
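As a worked example, here is U+101D1 (the supplementary-plane code point shown on the surrogate-pair slide) in each encoding form, using Python's built-in codecs. The helper `surrogate_pair` is a name introduced here to show the arithmetic.

```python
def surrogate_pair(code_point):
    """Split a supplementary code point into its UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

ch = "\U000101D1"
utf8 = ch.encode("utf-8").hex()       # four bytes
utf16 = ch.encode("utf-16-be").hex()  # two 16-bit code units
utf32 = ch.encode("utf-32-be").hex()  # one 32-bit code unit

print(utf8, utf16, utf32)
print([hex(u) for u in surrogate_pair(0x101D1)])
```

The range U+D800 to U+DFFF is reserved for exactly these code units, which is why it is excluded from the set of valid scalar values.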


Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN - Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN - what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
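The Nameprep plus Punycode step can be tried with Python's built-in `idna` codec, which implements IDNA 2003. This is a sketch; `to_ace` is a hypothetical helper name, and the input is the Greek example.test name from the previous slide written with escapes.

```python
def to_ace(hostname):
    """Convert a Unicode hostname to its ASCII-compatible (Punycode) form."""
    return hostname.encode("idna").decode("ascii")

greek_example = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"  # first label
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"                          # second label
)
print(to_ace(greek_example))
```

This is the form DNS sees; what the browser displays depends on the TLD whitelist and script rules listed above.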

Root Causes: IDN - so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

g is Latin U+0067; ɡ is Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there's plenty to choose from.
Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

Root Causes: The state of International Domain Names

• Registrars still allow
  - Confusables
  - Combining marks
  - Single-, Whole-, and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

m is Latin U+006D; rn is Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks 'a'. Really it's Latin small letter alpha 'ɑ'.

www.looĸout.net
User thinks 'k'. Really it's Latin small letter kra 'ĸ'.

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks 'o'. Really it's Thai digit zero '๐'.

www.faϲebook.com
User thinks 'c'. Really it's Greek lunate sigma symbol 'ϲ'.

www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'.

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks 'abc'. Really it's Cyrillic script.

www.ігѕ.gov
User thinks 'irs'. Really it's Cyrillic script.
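A rough sketch of mixed-script detection for a single domain label. Python's standard library has no script property, so this approximates a letter's script by the first word of its Unicode character name; that heuristic, and the names `scripts_used` and `is_mixed_script`, are assumptions made for illustration.

```python
import unicodedata

def scripts_used(label):
    """Approximate the scripts of the letters in a label from their character names."""
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_mixed_script(label):
    """True if letters from more than one (approximate) script appear."""
    return len(scripts_used(label)) > 1

print(scripts_used("fa\u03f2ebook"))         # Latin letters plus U+03F2
print(is_mixed_script("\u0430\u042c\u0441")) # all Cyrillic: whole-script case
```

The whole-script case is not caught by a mixed-script check, and digits such as the Thai zero are skipped by `isalpha()`, so this only covers one of the three confusable classes.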

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Root Causes: Handling the Unexpected – Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected – Character substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>
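A Python sketch of how deletion after validation defeats a filter. All three functions are hypothetical stand-ins: `naive_filter` for a blocklist check, `strip_bom` for a later component that silently drops U+FEFF, and `strict_filter` for one possible safer policy.

```python
BOM = "\uFEFF"

def naive_filter(text: str) -> bool:
    """Hypothetical blocklist check: True when the input looks harmless."""
    return "<script" not in text.lower()

def strip_bom(text: str) -> str:
    """Stand-in for a later component that silently deletes U+FEFF."""
    return text.replace(BOM, "")

def strict_filter(text: str) -> bool:
    """Safer variant: reject U+FEFF anywhere except the very first position."""
    if BOM in text[1:]:
        return False
    return naive_filter(text)

payload = "<scr\uFEFFipt>"
print(naive_filter(payload))   # the filter sees nothing wrong
print(strip_bom(payload))      # what the later component ends up with
print(strict_filter(payload))
```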

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
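Both options can be sketched with Python's built-in UTF-8 codec, here applied to the ill-formed sequence 41 C2 3E 41. With the `replace` handler only the stray lead byte is substituted, so the following 3E survives rather than being over-consumed.

```python
ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', stray lead byte, '>', 'A'

# Option 1: fail.
try:
    ill_formed.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True

# Option 2: replace only the ill-formed byte with U+FFFD.
replaced = ill_formed.decode("utf-8", errors="replace")

print(failed)
print([f"U+{ord(c):04X}" for c in replaced])
```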

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
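A Python sketch of casing before validation. Note that results differ by implementation and locale: Python's `str.lower()` maps U+0130 to "i" followed by U+0307 rather than to a bare "i", which illustrates the length change. `validate_identifier` is a hypothetical check invented for this example.

```python
dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

lowered = dotted_capital_i.lower()
print([f"U+{ord(c):04X}" for c in lowered])
print(len(dotted_capital_i), len(lowered))

def validate_identifier(raw: str) -> bool:
    """Hypothetical check: fold case first, then validate the folded string."""
    folded = raw.casefold()
    return folded.isascii() and "script" not in folded

print(validate_identifier("scr\u0130pt"))
```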

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160 (MUSICAL SYMBOL EIGHTH NOTE)
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
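The expansion factors can be reproduced with Python's `unicodedata` module. This is a sketch that measures encoded byte lengths before and after the operation; the helper name `expansion` is an assumption.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded sizes after and before normalizing a single character."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

print(expansion("\uFDFA", "NFKC", "utf-8"))
print(expansion("\uFDFA", "NFKC", "utf-16-le"))
print(expansion("\u0390", "NFD", "utf-8"))
print(expansion("\u1F82", "NFD", "utf-16-le"))

# Casing grows strings too: one code point in, three out.
print(len("\u0390".upper()))
```

A buffer sized from the input length alone is therefore not enough; size it from the transformed string, or let a string library manage the allocation.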

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
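One way to be careful is to flag any separator or format character outside a small allowed set before input reaches a parser. The Python sketch below assumes a hypothetical filter with its own `ALLOWED_SPACE` policy. It checks Unicode general categories, so it catches U+180E whether the local Unicode data classifies it as a space separator or as a format character.

```python
import unicodedata

# Assumption: these ASCII characters are the only white space this
# hypothetical filter is willing to accept.
ALLOWED_SPACE = {" ", "\t", "\f", "\r", "\n"}
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_separators(text: str) -> list:
    """List code points that may act as white space or vanish in some parsers."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ch not in ALLOWED_SPACE
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(suspicious_separators("<a href=\u180Eonclick=alert()>"))
```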

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
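A Python sketch of why the mismatch matters: the same two bytes decode to different text under the two declared charsets. The byte pair 83 5C is an example chosen here, not one from the slides, and `decode_or_fail` is a hypothetical helper for "force UTF-8, error if uncertain".

```python
raw = b"\x83\x5C"  # two bytes whose meaning depends on the declared charset

as_latin1 = raw.decode("iso-8859-1")  # two characters, the second a backslash
as_sjis = raw.decode("shift_jis")     # a single katakana character

print([f"U+{ord(c):04X}" for c in as_latin1])
print([f"U+{ord(c):04X}" for c in as_sjis])

def decode_or_fail(data: bytes) -> str:
    """Force UTF-8 and raise rather than guess when the bytes are not valid."""
    return data.decode("utf-8", errors="strict")

try:
    decode_or_fail(raw)
    guessed = True
except UnicodeDecodeError:
    guessed = False

print(guessed)
```

A filter that decodes as ISO-8859-1 sees a backslash that a Shift_JIS consumer never sees, so the two components disagree about where escapes and delimiters fall.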

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net

Casaba Security, www.casabasecurity.com


Unicode Crash Course

1991  Unicode
1990  ISO 10646 (UCS)
1985  ISO-8859-1, more code pages galore
1981  MBCS: GB2312
1981  CP437
1964  EBCDIC
1963  ASCII 7-bit, 8th bit free-for-all to follow

Unicode Crash Course: Code pages and charsets

Shift_JIS
GB2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Ad Infinitum

• Unicode can represent them all
• ASCII range is preserved
  – U+0000 to U+007F are mapped to ASCII

Source: Wikipedia

Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Glyph:       A
Encoding:    UTF-8, UTF-16, UTF-32
Properties:  Hex, Uppercase, etc.
Code point:  U+0041
Block:       Basic Latin
Script:      Latin
Plane:       Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Every character has a unique number, represented by a hex value

A = U+0041

ſ = U+017F

• The full 21-bit range is not actually available

U+0000 to U+D7FF and U+E000 to U+10FFFF

What’s up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
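U+D800 to U+DFFF are reserved so that UTF-16 can represent code points above U+FFFF as a pair of 16-bit units. A Python sketch of the arithmetic for U+101D1 follows; the function name `to_surrogates` is an assumption.

```python
def to_surrogates(code_point: int) -> tuple:
    """Split a supplementary code point (above U+FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

high, low = to_surrogates(0x101D1)
print(f"{high:04X} {low:04X}")

# Cross-check against the built-in UTF-16 encoder.
print(chr(0x101D1).encode("utf-16-be").hex())
```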

Unicode Crash Course: Encodings

UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs

UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
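The widths can be observed directly in Python. The sketch below uses the big-endian codec variants so that no byte order mark is prepended; `byte_widths` is a hypothetical helper.

```python
def byte_widths(ch: str) -> dict:
    """Encoded size in bytes of one character under each UTF."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}

for ch in ("\u0041", "\u017F", "\U000101D1"):
    print(f"U+{ord(ch):04X}", byte_widths(ch))
```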

Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 (U+10001) 𐌀 (U+10300)

Root Causes: IDN – Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN – what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN, to name a few
• Language settings and TLD
• Character blacklisting
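The Nameprep plus Punycode step can be sketched with Python's built-in `idna` codec, which implements IDNA 2003. This only illustrates the conversion and does not describe what any particular browser does.

```python
label = "παράδειγμα.δοκιμή"

# ToASCII: Nameprep each label, then Punycode-encode it with the xn-- prefix.
ace = label.encode("idna")
print(ace)

# ToUnicode: back to the form a browser might display.
print(ace.decode("idna"))
```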

Root Causes: IDN – so what’s the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Safari

Opera

www.google.com is not www.gooɡle.com

Latin U+0067 vs. Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  – Inclusion-based: deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks; even with limited script choices there’s plenty to choose from
  – Character limitations: great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  – Confusables
  – Combining marks
  – Single-, whole-, and mixed-script
• Registrars can’t control
  – Syntax spoofing in subdomain labels

Attack Vectors: Visual spoofing vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating combining marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D vs. Latin U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’

www.looĸout.net
User thinks ‘k’
Really it’s Latin small letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’

www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
Really it’s Cherokee letter nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script

www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
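A crude Python sketch of mixed-script detection. It guesses the script from the first word of each character's Unicode name, which is an assumption made for brevity; a real implementation would use the Unicode Script property and the confusables data. Whole-script spoofs pass this check by design, which is why they are the harder case.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Crude script guess: the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def is_mixed_script(label: str) -> bool:
    return len(scripts_in(label)) > 1

print(scripts_in("fa\u03F2ebook"))           # Latin letters plus a Greek lunate sigma
print(scripts_in("\u0430\u042C\u0441"))      # looks like "abc", entirely Cyrillic
print(is_mixed_script("\u0430\u042C\u0441"))
```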

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org

Latin U+0069 vs. Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

Other lookalikes:
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www･google･com
Katakana Dot U+FF65

(However, punctuation is not required…)

http://www.google.comﾉpathﾉfile.nottrusted.org
Katakana No U+FF89

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing – Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial, Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like <U+006F U+0304>

ō is <U+006F U+0304 U+0304>

• Order of combining marks
  – ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
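The ordering effect can be reproduced with Python's `unicodedata` module. Both marks have the same combining class, so normalization keeps their order and composes only the first mark with the base letter. `code_points` is a hypothetical display helper.

```python
import unicodedata

def code_points(text: str) -> list:
    """Render a string as a list of U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in text]

diaeresis_first = "\u006F\u0308\u0311"
breve_first = "\u006F\u0311\u0308"

print(code_points(unicodedata.normalize("NFKC", diaeresis_first)))
print(code_points(unicodedata.normalize("NFKC", breve_first)))
```

The two inputs normalize to different strings, so a comparison after NFKC still treats them as distinct even though they may render alike.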

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns.”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit: P/Invoke

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
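The same "refuse rather than substitute" idea can be sketched in Python, whose codecs do no best-fit mapping and raise under the `strict` error handler. This illustrates the principle only; it is not the Windows or .NET API named above, and `to_legacy_strict` and `can_encode` are hypothetical helpers.

```python
def to_legacy_strict(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising instead of substituting a lookalike."""
    return text.encode(charset, errors="strict")

def can_encode(text: str, charset: str = "cp1252") -> bool:
    try:
        to_legacy_strict(text, charset)
        return True
    except UnicodeEncodeError:
        return False

for ch in ("\u03C3", "\u2032", "\uFF4D"):
    print(f"U+{ord(ch):04X}", "encoded" if can_encode(ch) else "rejected")
```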

Case Study: Social Networking – Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)
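The four forms can be compared side by side in Python with the standard `unicodedata` module. In this sketch `all_forms` is a hypothetical helper; U+FE64 is included to show a character that only the compatibility forms change.

```python
import unicodedata

def all_forms(ch: str) -> dict:
    """Code points produced by each normalization form for one input string."""
    return {
        form: [f"U+{ord(c):04X}" for c in unicodedata.normalize(form, ch)]
        for form in ("NFD", "NFC", "NFKD", "NFKC")
    }

print(all_forms("\u0130"))  # canonical decomposition applies
print(all_forms("\uFE64"))  # only compatibility decomposition applies
```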


March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

ToolsVisual Spoofing Protection Demo

Thank you

Contact me with questions new test cases or ideas to share

Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API

Chris WeberwwwlookoutnetCasaba Security

wwwcasabasecuritycom


Unicode Crash Course
Timeline (years shown on the slide, newest first: 1991, 1990, 1985, 1981, 1981, 1964, 1963)
• Unicode
• ISO 10646 (UCS)
• ISO-8859-1
• More code pages galore
• MBCS
• GB2312
• CP437
• EBCDIC
• ASCII 7-bit
• 8th bit free-for-all to follow

Unicode Crash Course: Code pages and charsets
Shift_jis
Gb2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Ad Infinitum
• Unicode can represent them all
• ASCII range is preserved
  – U+0000 to U+007F are mapped to ASCII

Source: Wikipedia

Unicode Crash Course: The Unicode Attack Surface
• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course
• A large and complex standard: code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course
Glyph:          A
Encoding:       UTF-8, UTF-16, UTF-32
Properties:     Hex, Uppercase, etc.
Code point:     U+0041
Block, Script:  Basic Latin, Latin
Plane:          Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
U+0000 to U+10FFFF

Unicode Crash Course: Code Points
A = U+0041
Every character has a unique number, represented by a hex value
A    U+0041
ſ    U+017F

Unicode Crash Course: Code points
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
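The widths and the surrogate mechanism can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `encoded_sizes` and `surrogate_pair` are made up for illustration, and U+101D1 is the sample code point from the surrogate pair slide.

```python
def encoded_sizes(ch: str) -> dict:
    """Return how many bytes one code point occupies in each UTF."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-be" variants are used so no byte order mark is prepended.
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


def surrogate_pair(code_point: int) -> tuple:
    """Split a code point above U+FFFF into its UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogate pairs")
    offset = code_point - 0x10000           # 20 bits remain
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


print(encoded_sizes("A"))
print(encoded_sizes("\U000101D1"))
print([hex(unit) for unit in surrogate_pair(0x101D1)])
```

The range U+D800 to U+DFFF is reserved for exactly these surrogate code units, which is why it is carved out of the usable code point space.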

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A (plus code points 65537 and 66304, i.e. U+10001 and U+10300)

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
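The Nameprep and Punycode steps can be reproduced with Python's built-in "idna" codec, which implements IDNA 2003. This is a sketch for experimentation, not a description of what any particular browser does; `to_ascii` and `to_unicode` are illustrative helper names.

```python
# The Greek example.test name from the slide, spelled with escapes so the
# exact code points are unambiguous.
GREEK_EXAMPLE = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)


def to_ascii(hostname: str) -> str:
    """Apply Nameprep and Punycode to every label (IDNA 2003 ToASCII)."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Turn xn-- labels back into their Unicode form."""
    return hostname.encode("ascii").decode("idna")


print(to_ascii(GREEK_EXAMPLE))
# Nameprep includes NFKC, so a fullwidth "g" (U+FF47) folds to plain "g".
print(to_ascii("\uff47oogle.com"))
```

The second call shows why Nameprep matters for spoofing: some visually distinct inputs collapse to the same ASCII name before DNS ever sees them, while true confusables such as U+0261 survive as a different xn-- label.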

Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)
(Screenshots: Safari, Opera)
www.google.com is not www.gooɡle.com
Latin g, U+0067
Latin ɡ, U+0261

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there's plenty to choose from.
Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

Root Causes: The state of International Domain Names
• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can't control
  – Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin m, U+006D
Latin r n, U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
  User thinks ‘a’
  Really it's Latin small letter alpha ‘ɑ’
www.looĸout.net
  User thinks ‘k’
  Really it's Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
  User thinks ‘o’
  Really it's Thai digit zero ‘๐’
www.faϲebook.com
  User thinks ‘c’
  Really it's Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
  Really it's Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
  User thinks ‘abc’
  Really it's Cyrillic script
www.ігѕ.gov
  User thinks ‘irs’
  Really it's Cyrillic script
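Mixed-script labels are the easiest of the three classes to flag mechanically. The sketch below is a rough approximation only: Python's standard library does not expose the Unicode Script property, so the first word of each character name stands in for the script, which is an assumption that works for the examples on these slides but is not a general solution. The function names are hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the set of scripts used in one domain label."""
    scripts = set()
    for ch in label:
        if ch in "-0123456789":
            continue  # ASCII digits and hyphen are shared by all scripts
        # e.g. "GREEK LUNATE SIGMA SYMBOL" -> "GREEK"
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    return len(scripts_in_label(label)) > 1


print(scripts_in_label("fa\u03f2ebook"))      # Latin plus Greek lunate sigma
print(scripts_in_label("\u0430\u042c\u0441"))  # whole-script Cyrillic
print(scripts_in_label("\u0251pple"))          # single-script Latin alpha
```

Only the first label is caught by a mixed-script test. The whole-script and single-script cases each report one script, so they need a confusables table rather than a script check.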

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i
www.mozilla.org is not www.mozílla.org
Latin i, U+0069
Latin í, U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS, U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS, U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However punctuation not required…)
http://www･google･com
Katakana Dot, U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However punctuation not required…)
http://www.google.comﾉpathﾉfile.nottrusted.org
Katakana No, U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes there is one)
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
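A simple way to surface invisibles is to scan for format characters. The following is a minimal sketch, assuming that the general category Cf plus the three code points named on the slide is a good enough net; `find_invisibles` is an illustrative name, not part of any existing API.

```python
import unicodedata

# The three code points called out on the slide, listed explicitly in case
# an older Unicode database classifies one of them differently.
KNOWN_INVISIBLES = {0x200B, 0x180E, 0xFEFF}


def find_invisibles(text: str) -> list:
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for index, ch in enumerate(text):
        if ord(ch) in KNOWN_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(find_invisibles("goo\u200bgle.com"))
print(find_invisibles("google.com"))
```

Reporting the position instead of silently stripping the character keeps the decision (reject, escape, or display a warning) with the caller.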

Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō looks like U+006F U+0304
  ō is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
  <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  <U+006F U+0311 U+0308> becomes <U+020F U+0308>
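Both combining mark tricks can be reproduced with `unicodedata`. This is a small sketch; `code_points` and `combining_mark_count` are helper names invented for the example.

```python
import unicodedata


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def combining_mark_count(text: str) -> int:
    """Count characters with a non-zero canonical combining class."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


# Same base letter and the same two marks, in a different order.
diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")

print(code_points(diaeresis_first))
print(code_points(breve_first))

# A doubled macron usually renders like a single one.
print(combining_mark_count("o\u0304"), combining_mark_count("o\u0304\u0304"))
```

Because both marks have the same combining class, normalization does not reorder them, so the two strings stay distinct after NFKC even though they may render alike.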

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
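Explicit directional controls are easy to detect because the Unicode database tags them with their own bidi classes. A sketch follows, assuming the application wants to flag every explicit embedding, override, and isolate rather than only the two overrides listed above; `find_bidi_controls` is a hypothetical helper.

```python
import unicodedata

# Bidi classes of the explicit formatting characters defined in UAX #9.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(text: str) -> list:
    """Return (index, bidi class) for each explicit directional control."""
    return [
        (index, unicodedata.bidirectional(ch))
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES
    ]


# U+202E makes the following text display right to left.
print(find_bidi_controls("invoice\u202etxt.exe"))
```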

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit pInvoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C, Left-Pointing Angle Bracket
• U+3008 maps to U+003C, Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073, Latin Small Letter S With Acute
• U+015D maps to U+0073, Latin Small Letter S With Circumflex
To deal with this, specify a LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
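The same "fail instead of guessing" idea can be shown outside of .NET and Win32. The sketch below uses Python, whose built-in codecs do not perform best-fit mapping, so it illustrates the desired behavior and is not a reproduction of the Windows APIs named above; `encode_or_reject` is an illustrative name.

```python
from typing import Optional


def encode_or_reject(text: str, charset: str = "cp1252") -> Optional[bytes]:
    """Encode to a legacy charset, refusing anything it cannot represent."""
    try:
        # errors="strict" raises instead of substituting a lookalike.
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError:
        return None


print(encode_or_reject("s"))        # representable, passes through
print(encode_or_reject("\u03c3"))   # GREEK SMALL LETTER SIGMA
print(encode_or_reject("\u2032"))   # PRIME
```

Returning `None` forces the caller to handle the failure explicitly, which is the property a best-fit conversion silently removes.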

Case Study, Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study, Social Networking: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD  - Decompose (canonical)
NFC  - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + U+0307 (combining dot above)
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
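The ordering problem is easy to demonstrate. In this sketch the validator `is_safe` is a deliberately simplistic stand-in for a real filter, and both function names are hypothetical; the payload uses the small-form angle brackets U+FE64 and U+FE65.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy validator: reject anything containing ASCII angle brackets."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text: str) -> str:
    """Wrong order: the check runs before the dangerous characters exist."""
    if not is_safe(text):
        raise ValueError("markup characters found")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Right order: validate the form that will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("markup characters found after normalization")
    return normalized


payload = "\ufe64script\ufe65"
print(validate_then_normalize(payload))  # the filter is bypassed
```

Calling `normalize_then_validate(payload)` raises `ValueError`, because the compatibility forms have already been folded to `<` and `>` when the check runs.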

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
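To see why C0 A7 ends up as an apostrophe, compare a decoder that only does the bit arithmetic with one that validates. The naive function below is a hypothetical example of the flawed approach, written for illustration; the strict path uses Python's built-in UTF-8 codec, which rejects non-shortest forms.

```python
def naive_decode_2byte(lead: int, trail: int) -> str:
    """UNSAFE: combine payload bits without checking for the shortest form."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Validate while decoding; raises UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")


# C0 A7 carries the payload bits 0x27, the ASCII apostrophe.
print(repr(naive_decode_2byte(0xC0, 0xA7)))

try:
    strict_decode(b"\xc0\xa7")
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```

A filter that looks for the byte 27 never sees it in the raw input, but any downstream component using the naive logic will produce the quote anyway.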

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
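The byte sequence from the slide can be run through two decoders to show the difference. `overconsuming_decode` is a made-up, intentionally buggy decoder that only understands ASCII and 2-byte sequences; it exists purely to reproduce the <41 41> result. The safe path is Python's built-in codec with U+FFFD replacement.

```python
def overconsuming_decode(data: bytes) -> str:
    """Hypothetical buggy decoder: a lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:  # lead byte of a 2-byte sequence
            if i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            # BUG: skips the following byte even when it is not a trail byte.
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)


def safe_decode(data: bytes) -> str:
    """Replace only the ill-formed byte and keep everything after it."""
    return data.decode("utf-8", errors="replace")


sample = bytes([0x41, 0xC2, 0x3E, 0x41])
print(repr(overconsuming_decode(sample)))
print(repr(safe_decode(sample)))
```

In the buggy version the `>` (3E) disappears along with the stray lead byte, which is what lets an attacker delete a closing delimiter and pull later text into an attribute or tag.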

Root Causes: Handling the Unexpected, Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion
“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’ which can be safe
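The difference between deleting and replacing can be shown with the `<scr[U+FEFF]ipt>` example. This is a sketch under the assumption that a single leading U+FEFF is a legitimate byte order mark and every other occurrence is suspicious; both function names are illustrative.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def delete_bom(text: str) -> str:
    """UNSAFE: deletion lets the pieces on either side join up."""
    return text.replace(BOM, "")


def neutralize_bom(text: str) -> str:
    """Drop one leading BOM, then replace interior ones with U+FFFD."""
    if text.startswith(BOM):
        text = text[1:]
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\ufeffipt>"
print(delete_bom(payload))            # the filtered keyword reappears
print(ascii(neutralize_bom(payload)))  # the keyword stays broken
```

Failing outright is stricter still; replacement is the fallback when the input has to be accepted.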

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM, U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing
len(x) != len(toLower(x))
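The length change is straightforward to observe. Note that the exact result of lowercasing İ depends on the implementation: Python applies the full Unicode case mapping and returns "i" followed by U+0307, whereas the slide shows a library that returns a bare "i". The sketch below only demonstrates the size effect; `length_change` is a hypothetical helper.

```python
def length_change(text: str, operation) -> tuple:
    """Return (code points before, code points after) for a casing operation."""
    return len(text), len(operation(text))


# U+0130 lowercases to two code points in Python (i + combining dot above).
print(length_change("\u0130", str.lower))

# U+0390 uppercases to three code points.
print(length_change("\u0390", str.upper))

# U+023A stays one code point but grows from 2 to 3 bytes in UTF-8.
before = len("\u023a".encode("utf-8"))
after = len("\u023a".lower().encode("utf-8"))
print(before, after)
```

Any buffer sized from the input length, or any check that assumes one character in means one character out, is therefore suspect.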

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization: maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160
NFC         16, 32      3X       U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
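Several rows of the normalization table can be verified directly. A minimal sketch follows; `expansion_factor` is an illustrative helper that compares encoded sizes before and after normalization.

```python
import unicodedata


def expansion_factor(text: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before


# U+FDFA expands to 18 code points under NFKC.
print(len(unicodedata.normalize("NFKC", "\ufdfa")))
print(expansion_factor("\ufdfa", "NFKC", "utf-8"))
print(expansion_factor("\ufdfa", "NFKC", "utf-16-le"))
print(expansion_factor("\u0390", "NFD", "utf-8"))
print(expansion_factor("\u1f82", "NFD", "utf-16-le"))
```

When sizing an output buffer, multiply the input size in code units by the worst-case factor for the operation and encoding in use, or let a library that grows its own buffers do the work.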

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS, U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
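"Force UTF-8, error if uncertain" can be expressed as a small gatekeeper on the receiving side. This is a sketch under stated assumptions: the caller has already extracted the charset from the HTTP header and from any meta element, and the function name and parameters are hypothetical.

```python
from typing import Optional


def decode_declared_utf8(
    body: bytes,
    header_charset: Optional[str],
    meta_charset: Optional[str],
) -> str:
    """Accept content only when every declaration says UTF-8 and the bytes agree."""
    declared = {
        value.strip().lower()
        for value in (header_charset, meta_charset)
        if value
    }
    if declared != {"utf-8"}:
        # Missing, conflicting, or non-UTF-8 declarations: refuse to guess.
        raise ValueError(f"unacceptable charset declarations: {sorted(declared)}")
    # Strict decoding raises UnicodeDecodeError on ill-formed input.
    return body.decode("utf-8", errors="strict")


print(decode_declared_utf8("caf\u00e9".encode("utf-8"), "UTF-8", "utf-8"))
```

With the declarations from the mismatch slide (ISO-8859-1 in the header, shift_jis in the meta element) the function raises instead of letting a user-agent pick one.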

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 7: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

March 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Source Wikipedia

March 2009 copy 2009 Chris Weber

bull End users

bull Applications

bull Databases

bull Programming languages

bull Operating Systems

wwwcasabasecuritycom

Unicode Crash CourseThe Unicode Attack Surface

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

March 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Unicode Crash Course

Glyph

Encoding

Properties

Code point

Block Script

Plane

A

UTF-8 UTF-16 UTF-32

Hex Uppercase etc

U+0041

Basic Latin Latin

Basic Multilingual Plane(BMP)

wwwcasabasecuritycom

March 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number represented by a hex value

wwwcasabasecuritycom

Unicode Crash CourseCode Points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

March 2009 copy 2009 Chris Weber

bull The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFF

whatrsquos up with U+D800U+DFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

Root Causes: IDN – so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com
– Latin g, U+0067
– Latin ɡ, U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there's plenty to choose from.
Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.

Root Causes: The state of International Domain Names

• Registrars still allow
  – Confusables
  – Combining marks
  – Single-, whole-, and mixed-script
• Registrars can't control
  – Syntax spoofing in subdomain labels

Attack Vectors: Visual Spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
– Latin m, U+006D
– Latin r n, U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?
– 0 and O
– 1 and l
– 5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
– User thinks ‘a’
– Really it's Latin small letter alpha ‘ɑ’

www.looĸout.net
– User thinks ‘k’
– Really it's Latin small letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
– User thinks ‘o’
– Really it's Thai digit zero ‘๐’

www.faϲebook.com
– User thinks ‘c’
– Really it's Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
– Really it's Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
– User thinks ‘abc’
– Really it's Cyrillic script

www.ігѕ.gov
– User thinks ‘irs’
– Really it's Cyrillic script
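A very rough way to tell the three confusable classes apart is to look at which scripts a label draws from. The sketch below approximates the script from the first word of each character's Unicode name, which is an assumption made for illustration; a real detector would use the Unicode Scripts data and the confusables tables instead. The function names are hypothetical.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts used by the letters of a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Assumption: the first word of the character name names the script,
            # e.g. "LATIN SMALL LETTER A" or "CYRILLIC SMALL LETTER A".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """True when letters from more than one script appear in the label."""
    return len(scripts_in(label)) > 1

mixed = is_mixed_script("fa\u03f2ebook")     # Latin plus Greek lunate sigma
whole = scripts_in("\u0430\u042c\u0441")     # entirely Cyrillic lookalike of "abc"
single = scripts_in("\u0251pple")            # Latin alpha inside a Latin label
```

Note that only the mixed-script case is caught by this kind of check. Whole-script and single-script confusables stay inside one script, so they need a confusables mapping rather than a script count.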

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but…

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  – í looks like i

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org
– Latin i, U+0069
– Latin í, U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org
– FULLWIDTH SOLIDUS, U+FF0F
– (This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
– SOLIDUS, U+002F
– (Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with full stop lookalikes

httpwwwgooglecom
– Katakana Dot, U+FF65
– (However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.comノpathノfile.nottrusted.org
– Katakana No, U+FF89
– (However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
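Detecting the invisibles listed above can be sketched as a simple scan. The explicit set holds the three code points from the slide; additionally flagging every character in general category Cf (format) is an assumption added here to widen the net, and find_invisibles is a hypothetical helper name.

```python
import unicodedata

# The three code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, 'U+XXXX') for listed invisibles and other format characters."""
    hits = []
    for i, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, "U+%04X" % ord(ch)))
    return hits

clean = find_invisibles("paypal")
spoofed = find_invisibles("pay\u200bpal")
```

Reporting positions rather than silently stripping the characters leaves the decision (reject, escape, or display a warning) to the caller.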


Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  – What looks like ō <U+006F U+0304>
  – may really be <U+006F U+0304 U+0304>

• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
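The two sequences above can be checked with Python's unicodedata module. The sketch also includes a hypothetical helper, has_repeated_marks, that flags the same combining mark applied twice in a row; treating that as suspicious is an assumption for illustration, not a rule from the talk.

```python
import unicodedata

# Both marks have the same combining class, so normalization does not reorder
# them, and each sequence composes with whichever mark comes first.
diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")

def has_repeated_marks(text):
    """True if any combining mark is immediately followed by the same mark."""
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(text, text[1:])
    )
```

Because the two normalized strings differ while rendering almost identically, comparing normalized forms alone does not remove this spoofing vector.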

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
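One conservative way to act on that advice is to refuse input containing the two explicit overrides. The sketch below assumes rejection is acceptable for the field being validated (for example a file name or a display name); reject_bidi_overrides is a hypothetical name.

```python
BIDI_OVERRIDES = {
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
}

def reject_bidi_overrides(text):
    """Return text unchanged, or raise if it contains an explicit override."""
    for ch in text:
        if ch in BIDI_OVERRIDES:
            raise ValueError(
                "bidi override U+%04X (%s) not allowed" % (ord(ch), BIDI_OVERRIDES[ch])
            )
    return text
```

A string such as "report\u202etxt.exe" is the kind of input this is meant to catch: the override reverses the display order of the characters that follow it.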


Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution

When σ becomes s
– U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
– U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
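The guidance above names .NET and Win32 APIs. As a language-neutral analogue, the Python sketch below shows the same idea: when a conversion to a legacy code page is unavoidable, fail loudly instead of letting a character be silently replaced by a lookalike. Python's codecs do not perform best-fit mapping, so this only illustrates the "error, don't substitute" policy; to_legacy_strict is a hypothetical helper.

```python
def to_legacy_strict(text, codec="cp1252"):
    """Encode to a legacy code page, raising on anything it cannot represent."""
    return text.encode(codec, errors="strict")

def try_convert(text):
    """Return the encoded bytes, or None when the text has no exact mapping."""
    try:
        return to_legacy_strict(text)
    except UnicodeEncodeError:
        return None

plain = try_convert("abc")          # representable
sigma = try_convert("\u03c3")       # GREEK SMALL LETTER SIGMA: no exact mapping
prime = try_convert("\u2032")       # PRIME: no exact mapping
```

With a best-fit converter, the last two inputs would come back as "s" and an apostrophe, which is exactly what lets them slip past a filter that ran before the conversion.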

Case Study, Social Networking: Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study, Social Networking: Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD: Decompose (canonical)
NFC: Decompose (canonical), Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇
– U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
– U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC: first defense against visual spoofing
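The ordering rule can be shown with a toy filter. In the sketch below, is_safe is a deliberately simplistic validator invented for illustration (it only rejects angle brackets); the point is where normalization happens relative to it.

```python
import unicodedata

def is_safe(text):
    """Toy validator (an assumption for illustration): reject angle brackets."""
    return "<" not in text and ">" not in text

def validate(user_input):
    """Normalize first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", user_input)
    if not is_safe(normalized):
        raise ValueError("rejected after normalization")
    return normalized

payload = "\ufe64script\ufe65"       # SMALL LESS-THAN / SMALL GREATER-THAN signs
passes_raw_filter = is_safe(payload)  # the raw string contains no ASCII brackets
after_nfkc = unicodedata.normalize("NFKC", payload)
```

If the filter runs on the raw payload and normalization happens afterwards, the brackets appear only after the check has already passed. Running validate on the same payload raises instead.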

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution

• Application gets C0 A7
• OS/Framework sees 27
• Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
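To see why C0 A7 is dangerous, the sketch below contrasts a lenient two-byte decoder (written here only to show the arithmetic, with a hypothetical name) against Python's strict UTF-8 decoder, which refuses the overlong form.

```python
def naive_decode_2byte(seq):
    """What a lenient decoder computes for a 2-byte sequence, with no
    shortest-form check: 5 payload bits from the lead byte, 6 from the trail."""
    return chr(((seq[0] & 0x1F) << 6) | (seq[1] & 0x3F))

def strict_decode(raw):
    """Validate and decode UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8", errors="strict")

overlong = b"\xc0\xa7"
lenient_result = naive_decode_2byte(overlong)   # the apostrophe, U+0027

try:
    strict_decode(overlong)
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

A filter that looks for the byte 0x27 never sees it in the overlong form, yet a lenient component downstream reconstructs the apostrophe. Throwing on decode error closes that gap.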

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 (C0 AE is the overlong form of U+002E): httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
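The byte sequences in these test cases can be reproduced, and their behavior under NFKC checked, with a few lines of Python. The helper name describe is hypothetical; the sketch only covers the three well-formed characters, since the overlong form is not valid UTF-8 to begin with.

```python
import unicodedata

def describe(ch):
    """Return (UTF-8 bytes as hex, NFKC-normalized form) for one character."""
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)

fullwidth_solidus = describe("\uff0f")   # normalizes to "/"
division_slash = describe("\u2215")      # unchanged by NFKC; reached via best-fit
two_dot_leader = describe("\u2025")      # normalizes to ".."
```

This shows why the slide pairs each case with a different root cause: two of the characters turn into path syntax under compatibility normalization, while the division slash only does so through a best-fit conversion.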

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
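The <41 C2 3E 41> example can be reproduced with a deliberately buggy decoder. The function below is a hypothetical illustration of over-consumption, not any real library's behavior; it is contrasted with Python's built-in decoder, which replaces only the offending lead byte.

```python
def over_consuming_decode(raw):
    """Deliberately buggy: a 2-byte lead byte swallows the next byte unchecked."""
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if 0xC2 <= b <= 0xDF:   # looks like a 2-byte lead byte
            i += 2              # BUG: the "trail" byte is never validated
            continue
        out.append(chr(b))
        i += 1
    return "".join(out)

raw = bytes([0x41, 0xC2, 0x3E, 0x41])
buggy = over_consuming_decode(raw)               # the '>' (0x3E) disappears
safe = raw.decode("utf-8", errors="replace")     # only 0xC2 becomes U+FFFD
```

In markup, the swallowed byte is typically a quote or an angle bracket, which is how attacker-supplied text ends up outside the attribute it was supposed to stay in.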

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
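The difference between deleting and substituting can be sketched with a toy blocklist filter. BLOCKED and both function names are assumptions made for illustration; the unsafe variant mirrors the <scr[U+FEFF]ipt> case.

```python
BLOCKED = "<script>"

def delete_then_use(text):
    """Unsafe order: filter first, then silently delete U+FEFF."""
    if BLOCKED in text:
        raise ValueError("blocked")
    return text.replace("\ufeff", "")

def replace_with_fffd(text):
    """Safer: substitute U+FFFD so the token cannot re-form, then filter."""
    cleaned = text.replace("\ufeff", "\ufffd")
    if BLOCKED in cleaned:
        raise ValueError("blocked")
    return cleaned

payload = "<scr\ufeffipt>"
unsafe_result = delete_then_use(payload)     # the blocked token reappears
safe_result = replace_with_fffd(payload)     # the token stays broken
```

Failing outright on unexpected code points is stricter still, and is the first option on the slide.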

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM, U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
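Case mapping changing the length of a string is easy to observe. The sketch below uses Python's built-in full case mappings; exact results differ between runtimes and locales (some map İ straight to "i", as on the slide), so the values in the comments describe Python's behavior only. length_changes is a hypothetical helper.

```python
def length_changes(text):
    """Return how many code points lowercasing and uppercasing add."""
    return {
        "lower": len(text.lower()) - len(text),
        "upper": len(text.upper()) - len(text),
    }

dotted_i = "\u0130".lower()     # 'i' followed by COMBINING DOT ABOVE (U+0307)
iota = "\u0390".upper()         # three code points: U+0399 U+0308 U+0301
sharp_s = "\u00df".upper()      # 'SS'
growth = length_changes("\u0390")
```

Any buffer or length check computed before the case conversion can therefore be too small afterwards, which is why casing belongs before validation.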

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization: maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
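The expansion factors in the table can be reproduced by measuring the encoded size before and after normalization. The function name expansion is hypothetical; little-endian UTF-16 is chosen only so that no byte order mark is counted.

```python
import unicodedata

def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

nfkc_utf8 = expansion("\ufdfa", "NFKC", "utf-8")       # 3 bytes grow to 33
nfkc_utf16 = expansion("\ufdfa", "NFKC", "utf-16-le")  # 1 code unit grows to 18
nfd_utf16 = expansion("\u1f82", "NFD", "utf-16-le")    # 1 code unit grows to 4
nfd_utf8 = expansion("\u0390", "NFD", "utf-8")         # 2 bytes grow to 6
```

Sizing an output buffer from the input length, without allowing for these worst cases, is the overflow pattern the slide warns about.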

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera

MVS, U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  - A rendered ō looks like U+006F U+0304
  - but may really be U+006F U+0304 U+0304

• Order of combining marks
  - ȏ and ö under NFKC
  - <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  - <U+006F U+0311 U+0308> becomes <U+020F U+0308>
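The two combining-mark tricks can be reproduced with Python's standard `unicodedata` module. This is a sketch only; the helper `has_repeated_mark` is a hypothetical name, and checking for an immediately repeated mark is just one possible heuristic.

```python
import unicodedata


def has_repeated_mark(text):
    """True if the same combining mark appears twice in a row after NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )


# Two orderings of the same two marks survive NFC as two different strings.
first = unicodedata.normalize("NFC", "o\u0308\u0311")
second = unicodedata.normalize("NFC", "o\u0311\u0308")
print([hex(ord(c)) for c in first], [hex(ord(c)) for c in second])
print(has_repeated_mark("o\u0304\u0304"))
```

Both marks have the same combining class, so normalization does not reorder them, and the two visually similar strings never compare equal.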

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible, because of security concerns"
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
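A sketch of detecting the two override characters by their bidi class. The function name and the sample file name are hypothetical; only `unicodedata.bidirectional` is a real standard-library call.

```python
import unicodedata

# Bidi classes of the explicit override controls (LRO = U+202D, RLO = U+202E).
OVERRIDE_CLASSES = {"LRO", "RLO"}


def find_bidi_overrides(text):
    """Return the indexes of explicit directional override characters."""
    return [i for i, ch in enumerate(text)
            if unicodedata.bidirectional(ch) in OVERRIDE_CLASSES]


# Hypothetical name: stored as "cod" + RLO + "fdp.exe"; a bidi-aware UI may show "codexe.pdf".
name = "cod\u202efdp.exe"
print(find_bidi_overrides(name))
```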

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .Net runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
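The guidance amounts to failing instead of approximating when a character has no exact equivalent in the target charset. Python's built-in codecs do not perform best-fit mapping, so the sketch below only illustrates the "raise on unmappable input" posture; the helper names and the choice of `cp1252` are assumptions.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode for a legacy code page, raising instead of substituting lookalikes."""
    # errors="strict" is Python's default; spelled out to make the intent visible.
    return text.encode(charset, errors="strict")


def is_representable(text, charset="cp1252"):
    """True only if every character has an exact mapping in the target charset."""
    try:
        to_legacy_bytes(text, charset)
        return True
    except UnicodeEncodeError:
        return False


print(is_representable("-moz-binding()"), is_representable("-\uff4doz-binding()"))
```

A filter that rejects non-representable input up front never has to reason about what a downstream conversion might turn it into.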

Case Study, Social Networking: Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

-moz-binding() was not allowed, but…

-[U+FF4D]oz-binding() would best-fit map to it

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + combining dot above (U+0130 becomes U+0049 U+0307)

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes < (U+FE64 becomes U+003C)

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC is a first defense against visual spoofing
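The ordering problem can be shown in a few lines. The validator below is a deliberately naive toy and all function names are hypothetical; the point is only the difference between checking before and after `unicodedata.normalize`.

```python
import unicodedata


def contains_script_tag(text):
    """Toy validator, purely illustrative: flags a literal "<script" substring."""
    return "<script" in text.lower()


def validate_then_normalize(text):
    # Unsafe order: the check runs on the raw input, normalization happens afterwards.
    if contains_script_tag(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text):
    # Safer order: the check sees exactly the string that will be used later.
    normalized = unicodedata.normalize("NFKC", text)
    if contains_script_tag(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"  # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print(validate_then_normalize(payload))
```

With the unsafe order the payload passes the check and comes out as a real tag; with the safer order the same payload raises `ValueError`.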

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
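"Validate and throw on error" can be sketched with Python's strict UTF-8 decoder, which rejects non-shortest forms. The wrapper names are assumptions for illustration.

```python
def decode_utf8_strict(raw):
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")


def is_well_formed_utf8(raw):
    """True only if the bytes are valid, shortest-form UTF-8."""
    try:
        decode_utf8_strict(raw)
        return True
    except UnicodeDecodeError:
        return False


# 0x27 is the apostrophe; C0 A7 is its overlong two-byte form.
print(is_well_formed_utf8(b"\x27"), is_well_formed_utf8(b"\xc0\xa7"))
```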

Attack Vectors: Directory traversal

How many ways can you say "../"?

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
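A sketch of how the normalization-based test cases behave under NFKC, using Python's `unicodedata`. The function names are hypothetical. As a side note on the overlong case, the bytes C0 AE decode to U+002E under a lenient decoder, while an overlong U+002F would be C0 AF.

```python
import unicodedata


def nfkc_view(text):
    """Return the string as a later NFKC-normalizing component would see it."""
    return unicodedata.normalize("NFKC", text)


def looks_like_traversal(path):
    """Check for '..' path segments after NFKC, so compatibility forms are seen as ASCII."""
    return ".." in nfkc_view(path).split("/")


# FULLWIDTH SOLIDUS and TWO DOT LEADER fold to ASCII; DIVISION SLASH does not.
print(repr(nfkc_view("\uff0f")), repr(nfkc_view("\u2025")), repr(nfkc_view("\u2215")))
print(looks_like_traversal("root\uff0f\u2025\uff0fsystem"))
```

U+2215 is left unchanged by NFKC, which matches the slide's labelling of it as a best-fit case rather than a normalization case, so it needs a separate check.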

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '?' which can be safe
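Both recommendations can be sketched in Python: replacing ill-formed bytes with U+FFFD without swallowing the byte that follows, and replacing a stray U+FEFF rather than deleting it. The helper names are assumptions, and treating only a leading U+FEFF as a legitimate BOM is a simplification.

```python
REPLACEMENT = "\ufffd"


def decode_lenient(raw):
    """Decode UTF-8, turning each ill-formed sequence into U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def neutralize_bom(text):
    """Replace any U+FEFF after the first position with U+FFFD instead of deleting it."""
    head, tail = text[:1], text[1:]
    return head + tail.replace("\ufeff", REPLACEMENT)


# The lone lead byte C2 is replaced; the following 3E ('>') is preserved.
print(decode_lenient(b"\x41\xc2\x3e\x41"))
print(neutralize_bom("<scr\ufeffipt>"))
```

Because the suspicious character is replaced and not removed, the two halves of `scr` and `ipt` never join into a tag name.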

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .Net
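A sketch of casing before validation, and of the length change casing can cause. Python's `str.lower` maps U+0130 to `i` plus a combining dot rather than to a bare `i`, so its result differs from the slide's `toLower`, which describes other implementations. The helper name is hypothetical, and `casefold` followed by NFKC is only an approximation of a full case-insensitive canonical form.

```python
import unicodedata


def canonical_for_validation(text):
    """Case-fold and normalize once, up front, and validate only the result."""
    return unicodedata.normalize("NFKC", text.casefold())


dotted = "scr\u0130pt"      # contains LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted.lower()
print(len(dotted), len(lowered))

# Upper-casing U+0390 yields three code points.
print(len("\u0390".upper()))
```

Validating the output of `canonical_for_validation` means any later casing step cannot introduce characters the validator never saw.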

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .Net
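The expansion factors from Unicode Technical Report 36 can be checked by measuring encoded sizes before and after an operation. The helper `expansion` is a hypothetical name; the sketch relies only on `unicodedata.normalize` and `str.encode`.

```python
import unicodedata


def expansion(text, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before


sample = "\ufdfa"  # one code point that NFKC expands to 18 code points
print(expansion(sample, "NFKC", "utf-8"), expansion(sample, "NFKC", "utf-16-le"))
```

Sizing an output buffer from the input's character count, or from its byte count in a different encoding, is the assumption these factors break.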

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
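One way to "be careful" is to allow only the white space a parser is specified to accept and flag everything else that could act as a separator. The sketch below assumes the four white space characters listed in HTML 4.01 section 9.1; the function name and the use of general categories Z* and Cf as the "suspicious" set are illustrative choices.

```python
import unicodedata

# White space listed in HTML 4.01 section 9.1 (CR/LF line breaks are treated separately there).
HTML401_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}


def unexpected_separators(text):
    """Return 'U+XXXX' for space-like or format characters outside the allowlist."""
    found = []
    for ch in text:
        if ch in HTML401_WHITESPACE or ch in "\r\n":
            continue
        cat = unicodedata.category(ch)
        # Z* = separators, Cf = format controls; U+180E has been classified as both over time.
        if cat.startswith("Z") or cat == "Cf":
            found.append("U+%04X" % ord(ch))
    return found


print(unexpected_separators("<a href=#\u180eonclick=alert()>"))
```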

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
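"Error if uncertain" can be sketched as a check that the HTTP header and the meta declaration name the same charset. The function names are hypothetical; label canonicalization uses Python's `codecs.lookup`, whose alias table is an assumption standing in for whatever a real user-agent or server would use.

```python
import codecs


def canonical_charset(label):
    """Map a charset label to Python's canonical codec name; raises LookupError if unknown."""
    return codecs.lookup(label.strip()).name


def agreed_charset(header_label, meta_label):
    """Return the single charset both declarations agree on, or raise ValueError."""
    header = canonical_charset(header_label)
    meta = canonical_charset(meta_label)
    if header != meta:
        raise ValueError("charset mismatch: %s vs %s" % (header, meta))
    return header


print(agreed_charset("UTF-8", "utf8"))
```

Passing the slide's pair, `ISO-8859-1` and `shift_jis`, raises `ValueError` instead of letting a consumer guess.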

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security

www.casabasecurity.com

Source: Wikipedia

Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Glyph:         A
Encoding:      UTF-8, UTF-16, UTF-32
Properties:    Hex, Uppercase, etc.
Code point:    U+0041
Block, Script: Basic Latin, Latin
Plane:         Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

A = U+0041

Every character has a unique number represented by a hex value

A U+0041

ſ U+017F

Unicode Crash Course: Code points

• The full 21-bit range is not actually available

U+0000 to U+D7FF and U+E000 to U+10FFFF

What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1

Unicode Crash Course: Encodings

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
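The three encodings can be compared on the supplementary-plane code point U+101D1 from the surrogate pair slide. This is a small illustration using Python's built-in codecs; big-endian variants are chosen only so the byte order is easy to read.

```python
ch = "\U000101D1"  # a code point outside the BMP

for enc in ("utf-8", "utf-16-be", "utf-32-be"):
    # bytes.hex() shows the encoded form: 4 bytes in each case for this code point.
    print(enc, ch.encode(enc).hex())
```

In UTF-16 the value is carried by the surrogate pair D800 DDD1, which is why code units in the range U+D800 to U+DFFF are not available as characters.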

Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA 𐀁 (U+10001) 𐌀 (U+10300)

Root Causes: IDN - Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN - what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
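The Unicode-to-Punycode step can be sketched with Python's built-in `idna` codec, which implements IDNA 2003 including Nameprep. The wrapper name `to_ascii` is an assumption; browsers apply additional display policies that this does not model.

```python
def to_ascii(hostname):
    """Convert a Unicode host name to its ASCII (Punycode) form via the IDNA 2003 codec."""
    return hostname.encode("idna").decode("ascii")


# The Greek example.test name from the slides, written with escapes to avoid ambiguity.
greek = ("\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
         ".\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae")
print(to_ascii(greek))
```

This is the form DNS sees, regardless of what the address bar displays.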

Root Causes: IDN - so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Safari

Opera

www.google.com is not www.gooɡle.com

Latin U+0067 vs. Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA
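A rough sketch of mixed-script detection. Python's standard library exposes no script property, so this approximates the script from the first word of each letter's Unicode name; that heuristic, and both function names, are assumptions for illustration. Real confusable detection would use the Unicode script and confusables data.

```python
import unicodedata


def letter_scripts(label):
    """Approximate the set of scripts in a label from each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if unicodedata.category(ch).startswith("L"):
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC" (heuristic, not the Script property)
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def is_mixed_script(label):
    """True if letters from more than one (approximate) script appear in the label."""
    return len(letter_scripts(label)) > 1


print(letter_scripts("fa\u03f2ebook"), is_mixed_script("fa\u03f2ebook"))
```

A whole-script spoof written entirely in Cyrillic is not mixed-script, so this check alone does not catch it.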

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Deny-all default seems to be the right concept

A script can cross many blocks. Even with limited script choices, there's plenty to choose from

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D vs. Latin U+0072 U+006E

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks 'a'
Really it's Latin small letter Alpha 'ɑ'

www.looĸout.net
User thinks 'k'
Really it's Latin letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'

www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'

www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script

www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but…

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org

Latin U+0069 vs. Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
- Attack: filter evasion, code execution
- Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
- Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM (U+FEFF)

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
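A short Python sketch of both casing points: case mapping can change string length, and validation should run on the case-folded form. The filter functions are hypothetical. Note that Python's own `str.lower()` maps U+0130 to `i` followed by U+0307, so the exact `toLower` result on the slide depends on the platform and locale in use.

```python
def naive_is_blocked(text: str) -> bool:
    """Hypothetical filter that compares without any case handling."""
    return "script" in text


def is_blocked(text: str) -> bool:
    """Fold case first, then validate the folded form."""
    return "script" in text.casefold()


# Length before and after case mapping, in code points.
for ch in ("\u0130", "\u00df", "\u0390"):
    print(f"U+{ord(ch):04X}", len(ch), len(ch.lower()), len(ch.upper()))

print(naive_is_blocked("SCRIPT"), is_blocked("SCRIPT"))
```

Sizing an output buffer from the input length is therefore unsafe whenever a case mapping sits in between.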

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Casing: maximum expansion factors

Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation    UTF       Factor   Sample
NFC          8         3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32    3X       שּׁ U+FB2C
NFD          8         3X       ΐ U+0390
NFD          16, 32    4X       ᾂ U+1F82
NFKC/NFKD    8         11X      ﷺ U+FDFA
NFKC/NFKD    16, 32    18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
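The expansion factors can be checked with Python's standard `unicodedata` module. The helper name `expansion` is an assumption made for this sketch; it measures encoded size before and after normalization, using the little-endian codecs so that no BOM is counted.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


cases = [
    ("NFKC", "utf-8", "\ufdfa"),
    ("NFKC", "utf-16-le", "\ufdfa"),
    ("NFD", "utf-8", "\u0390"),
    ("NFD", "utf-16-le", "\u1f82"),
]
for form, enc, ch in cases:
    print(form, enc, f"U+{ord(ch):04X}", expansion(ch, form, enc))
```

A buffer sized for the input can be too small by these factors once the string has been normalized.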

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting “white space”
  - A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera (MVS, U+180E)

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
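One way to act on this guidance is to flag every space, separator, or format character other than U+0020 before input reaches a parser. The sketch below uses general categories from Python's `unicodedata`; the function name and the choice of categories are assumptions for illustration, not a complete policy.

```python
import unicodedata

# Space separators, line/paragraph separators, and format characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def suspicious_separators(text: str) -> list:
    """List code points that may be treated as white space or ignored by a parser."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ch != " " and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

The category of U+180E has changed between Unicode versions (space separator in older data, format character in newer data), which is why the sketch checks both groups.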

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep are based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv=Content-Type content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher (some of the passive checks included)

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 10: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Glyph:        A
Encoding:     UTF-8, UTF-16, UTF-32
Properties:   Hex, Uppercase, etc.
Code point:   U+0041
Block:        Basic Latin
Script:       Latin
Plane:        Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

A = U+0041

Every character has a unique number, represented by a hex value

A: U+0041

ſ: U+017F

• The full 21-bit range is not actually available

U+0000 to U+D7FF and
U+E000 to U+10FFFF

What’s up with U+D800..U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
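The range U+D800..U+DFFF is reserved so that UTF-16 can represent code points above U+FFFF as a pair of 16-bit units. A minimal sketch of the standard calculation, using a helper name (`to_surrogates`) chosen for this example and the slide's sample U+101D1:

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use a surrogate pair")
    offset = cp - 0x10000                      # 20 significant bits
    high = 0xD800 + (offset >> 10)             # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)            # bottom 10 bits
    return high, low


high, low = to_surrogates(0x101D1)
print(f"{high:04X} {low:04X}")
# Cross-check against Python's own UTF-16 encoder.
print(chr(0x101D1).encode("utf-16-be").hex())
```

Half of such a pair on its own is not a valid character, which is why lone surrogates count as illegal code points in interchange.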

Unicode Crash Course: Encodings

UTF-8
- Variable width: 1 to 4 bytes (used to be 6)

UTF-16
- Endianness
- Variable width: 2 or 4 bytes
- Surrogate pairs

UTF-32
- Endianness
- Fixed width: 4 bytes
- Fixed mapping, no algorithms needed
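The width rules are easy to confirm with Python's built-in codecs. The helper `widths` is a name made up for this sketch; the little-endian codec variants are used so that no byte order mark is added to the count.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def widths(ch: str) -> dict:
    """Number of bytes one character occupies in each encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


for ch in ("A", "\u017f", "\u20ac", "\U000101d1"):
    print(f"U+{ord(ch):04X}", widths(ch))
```

Counting bytes and counting characters give different answers for everything outside ASCII, which matters for every length check later in the talk.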

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: IDN - Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN - what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN, to name a few
• Language settings and TLD
• Character blacklisting
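The Nameprep and Punycode steps can be observed with Python's standard `idna` codec, which implements IDNA 2003. The wrapper name `to_ascii` is an assumption for this sketch; the host name is the Greek example.test domain used in the slides.

```python
def to_ascii(host: str) -> str:
    """Convert a Unicode host name to the ASCII form that DNS sees (IDNA 2003)."""
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Convert an ASCII (Punycode) host name back to its Unicode form."""
    return host.encode("ascii").decode("idna")


ascii_form = to_ascii("παράδειγμα.δοκιμή")
print(ascii_form)
print(to_unicode(ascii_form))
```

What the user sees is the Unicode form; what the resolver and most logs see is the `xn--` form, so security decisions need to be clear about which one they are inspecting.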

Root Causes: IDN - so what’s the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  - We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
- (Latin has a big family)

Attack Vectors: IDN homograph attacks (Safari)

Attack Vectors: IDN homograph attacks (Opera)

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin U+0067 vs. Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
- Inclusion-based
- Script limitations
- Character limitations

Registries apply the guidance
- Define the allowed characters per TLD
- Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
- Inclusion-based
- Script limitations
- Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

• Registrars still allow
  - Confusables
  - Combining marks
  - Single, whole, and mixed-script
• Registrars can’t control
  - Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating combining marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D vs. Latin U+0072 U+006E

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
- Single script
- Mixed script
- Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks ‘a’. Really it’s Latin small letter alpha ‘ɑ’.

www.looĸout.net
User thinks ‘k’. Really it’s Latin letter kra ‘ĸ’.

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks ‘o’. Really it’s Thai digit zero ‘๐’.

www.faϲebook.com
User thinks ‘c’. Really it’s Greek lunate sigma symbol ‘ϲ’.

www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’.

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks ‘abc’. Really it’s Cyrillic script.

www.ігѕ.gov
User thinks ‘irs’. Really it’s Cyrillic script.
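The three confusable classes behave differently under a simple mixed-script check. The sketch below approximates a letter's script by the first word of its Unicode character name, which is a rough stand-in chosen for this example (Python's standard library does not expose the Script property); both function names are hypothetical.

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Approximate the scripts of the letters in a label via their character names."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def is_mixed_script(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts_used(label)) > 1


for label in ("fa\u03f2ebook", "\u13c0oogle", "\u0251pple", "\u0430\u042c\u0441"):
    print(label, sorted(scripts_used(label)), is_mixed_script(label))
```

Only the mixed-script labels are flagged. The single-script (Latin alpha) and whole-script (all Cyrillic) cases pass, and digits such as the Thai zero are not looked at by this sketch at all, so a real defense needs confusable data on top of a script check.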

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org

Latin U+0069 vs. Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(This case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www･google･com

Katakana Dot U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

(However, punctuation not required…)

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like <U+006F U+0304>
ō̄ is <U+006F U+0304 U+0304>

• Order of combining marks
  - ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
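The ordering effect is reproducible with `unicodedata.normalize`. Both marks sit above the base letter and share a combining class, so normalization keeps their order and only the first mark composes with the base. The helper name `nfc_points` is made up for this sketch.

```python
import unicodedata


def nfc_points(text: str) -> list:
    """Code points of the NFC form, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


diaeresis_first = "o\u0308\u0311"   # o + diaeresis + inverted breve
breve_first = "o\u0311\u0308"       # o + inverted breve + diaeresis

print(nfc_points(diaeresis_first))
print(nfc_points(breve_first))
```

The two strings can render almost identically, yet they never compare equal, even after normalization.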

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - “These characters are to be avoided wherever possible, because of security concerns”
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
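Detecting the explicit overrides is straightforward, since `unicodedata.bidirectional` reports their bidi class directly. The function name below is an assumption for this sketch, and the sample file name is invented purely for illustration.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}


def has_bidi_override(text: str) -> bool:
    """True when the text contains U+202D, U+202E, or any other override-class character."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


# Invented example: the override reverses how the tail of the name is displayed.
name = "report\u202etxt.exe"
print(has_bidi_override(name), has_bidi_override("report.txt"))
```

Rejecting such input outright matches the "avoid wherever possible" advice better than trying to render it safely.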

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit pInvoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:

[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
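The bypass can be modelled in Python. Python's codecs do not perform best-fit mapping, so `BEST_FIT` below is a small hypothetical table built only from mappings named in the slides, and `best_fit_encode` imitates a lossy conversion step. `filter_ok` and `strict_encode` are also illustrative names.

```python
# Hypothetical excerpt of a best-fit table, limited to mappings named in the slides.
BEST_FIT = {"\uff4d": "m", "\u2329": "<", "\u3008": "<", "\u015b": "s", "\u015d": "s"}


def filter_ok(text: str) -> bool:
    """Hypothetical XSS filter that blocks the literal '-moz-binding'."""
    return "-moz-binding" not in text


def best_fit_encode(text: str) -> str:
    """Imitates a charset conversion that silently picks lookalike characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strict_encode(text: str) -> bytes:
    """Fails with UnicodeEncodeError instead of guessing a replacement."""
    return text.encode("cp1252")


payload = "-\uff4doz-binding()"
print(filter_ok(payload), best_fit_encode(payload))
```

Failing on unmappable characters, as `strict_encode` does, is the Python analogue of disabling best-fit behavior in the conversion API.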

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ (U+0130) becomes I + combining dot above (U+0049 U+0307)

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
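The order of operations is the whole point, so the sketch below runs the same hypothetical check in both orders using Python's `unicodedata.normalize`. The function names and the `<script` check are assumptions made for illustration.

```python
import unicodedata


def validate_then_normalize(text: str) -> str:
    """Wrong order: the check runs on the raw string, then NFKC changes it."""
    if "<script" in text:
        raise ValueError("blocked")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Right order: the check sees exactly what later stages will see."""
    normalized = unicodedata.normalize("NFKC", text)
    if "<script" in normalized:
        raise ValueError("blocked")
    return normalized


payload = "\ufe64script>"   # starts with U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))
```

With the wrong order the payload comes out as a live tag; with the right order the same input raises `ValueError`.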

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
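To see why C0 A7 is dangerous, the sketch below computes what a lenient decoder would produce for a two-byte sequence when it skips the shortest-form check, and compares that with Python's strict UTF-8 codec. `lenient_two_byte` is a hypothetical helper written for this example.

```python
def lenient_two_byte(data: bytes) -> int:
    """Payload bits of a 2-byte sequence, with no shortest-form validation."""
    return ((data[0] & 0x1F) << 6) | (data[1] & 0x3F)


def strict_decode(data: bytes) -> str:
    """Python's UTF-8 codec rejects overlong forms with UnicodeDecodeError."""
    return data.decode("utf-8")


overlong = bytes.fromhex("C0A7")
print(hex(lenient_two_byte(overlong)), repr(chr(lenient_two_byte(overlong))))
```

A filter that scans bytes for 0x27 never sees one here, while a lenient decoder further down the stack turns the sequence into an apostrophe.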

Attack Vectors: Directory traversal

How many ways can you say ../

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8 U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
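Two of these test cases depend on compatibility normalization, which can be checked in Python. The function name `is_traversal` and the sample paths are assumptions for this sketch; it normalizes with NFKC before looking for the traversal sequence.

```python
import unicodedata


def is_traversal(path: str) -> bool:
    """Look for '../' after NFKC, so compatibility lookalikes are caught."""
    return "../" in unicodedata.normalize("NFKC", path)


two_dot_leader = "root/\u2025\uff0fsystem"   # U+2025 followed by U+FF0F
division_slash = "root/..\u2215system"        # U+2215 has no compatibility mapping

print(is_traversal(two_dot_leader), is_traversal(division_slash))
```

The division slash case is not caught by normalization. It only becomes a slash through a best-fit conversion, so it needs the separate defenses listed for best-fit mappings.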

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: filter evasion, enable code execution


Page 11: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

• A large and complex standard

Unicode Crash Course

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Glyph: A
Encoding: UTF-8, UTF-16, UTF-32
Properties: Hex, Uppercase, etc.
Code point: U+0041
Block / Script: Basic Latin / Latin
Plane: Basic Multilingual Plane (BMP)

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Unicode Crash Course: Code points

A = U+0041

Every character has a unique number, represented by a hex value.

Unicode Crash Course: Code Points

Unicode Crash Course

A  U+0041

Unicode Crash Course

ſ  U+017F

• The full 21-bit range is not actually available

U+0000 to U+D7FF and
U+E000 to U+10FFFF

What's up with U+D800..U+DFFF?

Unicode Crash Course: Code points

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
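The gap at U+D800..U+DFFF exists because UTF-16 reserves those values for surrogate pairs. A minimal sketch of the arithmetic for the slide's sample code point follows; the function name `to_surrogate_pair` is made up for this illustration and is not from the talk.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary-plane code points need a surrogate pair")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits select the high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits select the low (trail) surrogate
    return high, low


high, low = to_surrogate_pair(0x101D1)
print(f"U+101D1 -> {high:04X} {low:04X}")

# Cross-check against the standard library's UTF-16 encoder.
assert chr(0x101D1).encode("utf-16-be") == bytes.fromhex(f"{high:04X}{low:04X}")
```

A lone value from this range (half a pair) is not a valid scalar value, which is why the "illegal code points" case later in the deck matters.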

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs

UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings
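The width differences between the three encoding forms can be checked with a few sample characters. This is only an illustrative sketch; `encoded_widths` is a hypothetical helper, and the big-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_widths(ch: str) -> dict[str, int]:
    """Return how many bytes one character occupies in each encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


# ASCII letter, Latin long s, euro sign, and a supplementary-plane character.
for sample in ("A", "\u017F", "\u20AC", "\U000101D1"):
    print(f"U+{ord(sample):04X}", encoded_widths(sample))
```

Code that sizes buffers in "characters" needs to know which of these units it is actually counting.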

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Exploiting Unicode-enabled Software: Overview

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: Visual Spoofing

http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN - Internationalized Domain Names

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN - what do the browsers do?
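The Unicode-to-Punycode step can be reproduced with Python's standard `idna` codec, which (as far as the standard library documents it) implements IDNA 2003 with Nameprep. This is a sketch of the conversion only; `to_ascii` and `to_unicode` are names chosen for this example, and browser display policies such as TLD whitelists are not modeled.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (xn--) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII (xn--) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


name = "παράδειγμα.δοκιμή"
print(to_ascii(name))               # the form that DNS sees
print(to_unicode(to_ascii(name)))   # the form a browser may display
```

Whether the user is shown the Unicode form or the `xn--` form is exactly the per-browser policy decision the bullets above describe.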

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Root Causes: IDN - so what's the problem?

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin U+0067 (g)
Latin U+0261 (ɡ)

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for Visual Spoofing

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based: deny-all default seems to be the right concept.
  - Script limitations: a script can cross many blocks. Even with limited script choices there's plenty to choose from.
  - Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

Root Causes: The state of International Domain Names

• Registrars still allow
  - Confusables
  - Combining marks
  - Single-, Whole- and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com

Latin U+006D (m)
Latin U+0072 U+006E (r n)

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com
User thinks ‘a’
Really it's Latin small letter alpha ‘ɑ’

www.looĸout.net
User thinks ‘k’
Really it's Latin letter kra ‘ĸ’

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
User thinks ‘o’
Really it's Thai digit zero ‘๐’

www.faϲebook.com
User thinks ‘c’
Really it's Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
Really it's Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Mixed-script and The Confusables

www.аЬс.com
User thinks ‘abc’
Really it's Cyrillic script

www.ігѕ.gov
User thinks ‘irs’
Really it's Cyrillic script

Attack Vectors: Whole-script and The Confusables
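The three confusable classes behave differently under a simple mixed-script check. The sketch below is a rough heuristic, not the detection API described in this talk: Python's standard library does not expose the Unicode Script property, so the script is approximated from the first word of each character's name, and `scripts_in` and `is_mixed_script` are hypothetical helper names.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Approximate the scripts used in a label from the character names."""
    scripts = set()
    for ch in label:
        if not ch.isalnum():
            continue                      # skip dots, hyphens, and other punctuation
        first_word = unicodedata.name(ch, "UNKNOWN").split()[0]
        if first_word != "DIGIT":         # ASCII digits are shared by all scripts
            scripts.add(first_word)
    return scripts


def is_mixed_script(label: str) -> bool:
    return len(scripts_in(label)) > 1


for label in ("fa\u03f2ebook", "g\u0e50\u0e50gle", "\u0430\u042c\u0441", "\u0251pple"):
    print(ascii(label), sorted(scripts_in(label)), is_mixed_script(label))
```

Only the mixed-script labels are flagged. The whole-script (all Cyrillic) and single-script (all Latin) cases pass this check, which is why a confusables table is needed in addition to script tests.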

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don't necessarily, but...

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  í looks like i

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin U+0069 (i)
Latin U+00ED (í)

(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with and lookalikes

(However, punctuation is not required...)

Attack Vectors: IDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana Dot U+FF65

(However, punctuation is not required...)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
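A simple way to surface invisible characters like these is to scan for format characters (general category Cf). The following is a sketch under that assumption, not the detection logic of the API mentioned in this talk; `find_invisibles` and `ALWAYS_FLAG` are illustrative names. U+180E is listed explicitly because its general category differs between Unicode versions.

```python
import unicodedata

# Flagged regardless of category, since U+180E has been classified
# differently across Unicode versions.
ALWAYS_FLAG = {"\u180E"}


def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for invisible format characters."""
    hits = []
    for index, ch in enumerate(text):
        if ch in ALWAYS_FLAG or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(find_invisibles("goo\u200bgle"))   # zero width space hidden inside a label
print(find_invisibles("google"))         # nothing to report
```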

Attack Vectors: Visual Spoofing - Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing - Problematic Font-rendering

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

<U+006F U+0304> (o with one combining macron) looks like
<U+006F U+0304 U+0304> (o with two combining macrons)

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  - ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
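Both combining-mark tricks can be observed with the standard `unicodedata` module. The sketch below shows that the two orderings stay distinct after NFKC and adds a naive check for a mark repeated back to back; `nfkc_hex` and `has_repeated_mark` are names invented for this example.

```python
import unicodedata


def nfkc_hex(text: str) -> str:
    """Normalize to NFKC and list the resulting code points."""
    normalized = unicodedata.normalize("NFKC", text)
    return " ".join(f"U+{ord(ch):04X}" for ch in normalized)


def has_repeated_mark(text: str) -> bool:
    """True if the same combining mark appears twice in a row."""
    previous = ""
    for ch in text:
        if unicodedata.combining(ch) and ch == previous:
            return True
        previous = ch
    return False


print(nfkc_hex("o\u0308\u0311"))   # diaeresis first
print(nfkc_hex("o\u0311\u0308"))   # inverted breve first
print(has_repeated_mark("o\u0304\u0304"))
```

Because both marks sit in the same combining class, normalization does not reorder them, so the two visually similar strings never compare equal.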

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9
  - “These characters are to be avoided wherever possible because of security concerns”
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
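Input that should never contain directional controls can simply be rejected when one appears. This is a minimal sketch; the set below lists the explicit embedding, override, and isolate controls as I understand them from UAX #9 (the isolates were added after this talk was given), and `has_bidi_controls` is a hypothetical helper name.

```python
BIDI_CONTROLS = {
    "\u202A", "\u202B",                      # LRE, RLE (embeddings)
    "\u202C",                                # PDF (pop directional formatting)
    "\u202D", "\u202E",                      # LRO, RLO (the overrides named above)
    "\u2066", "\u2067", "\u2068", "\u2069",  # LRI, RLI, FSI, PDI (isolates)
}


def has_bidi_controls(text: str) -> bool:
    """True if the text contains any explicit bidirectional control."""
    return any(ch in BIDI_CONTROLS for ch in text)


# A right-to-left override can make the tail of a name display reversed.
print(has_bidi_controls("invoice\u202Efdp.exe"))
print(has_bidi_controls("invoice.pdf"))
```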

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
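The same "fail instead of best-fit" idea can be sketched outside .NET and Win32. The example below uses Python purely for illustration; its `cp1252` codec, to my knowledge, performs no best-fit substitution, so a strict encode raises on any character the legacy charset cannot represent. `to_legacy_bytes` is a made-up helper name.

```python
def to_legacy_bytes(text: str, charset: str = "cp1252") -> bytes:
    """Encode for a legacy consumer, refusing anything the charset cannot represent."""
    return text.encode(charset, errors="strict")


print(to_legacy_bytes("caf\u00e9"))          # representable, encodes normally

try:
    to_legacy_bytes("\u2329script\u232a")    # angle-bracket lookalikes
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

The design choice is that the caller sees an error and can reject the request, rather than receiving a silently rewritten string that may now contain `<` or `'`.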

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

Case Study: Social Networking - Best-fit mappings

-moz-binding()

was not allowed, but...

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking - Best-fit mappings

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet... with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Normalization

Normalize strings before validation

NFKC: first defense against Visual spoofing

Root Causes: Guidance for Normalization
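The ordering rule (normalize first, validate second) can be sketched as follows. The deny-list is deliberately tiny and hypothetical; real filters are more involved, and `is_safe` is a name chosen only for this example.

```python
import unicodedata

FORBIDDEN = ("<", ">")   # hypothetical deny-list, for illustration only


def is_safe(user_input: str) -> bool:
    """Validate the NFKC form, i.e. the form later processing stages will see."""
    normalized = unicodedata.normalize("NFKC", user_input)
    return not any(ch in normalized for ch in FORBIDDEN)


payload = "\uFE64script\uFE65"   # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print("<" in payload)            # the raw string contains no ASCII angle bracket
print(is_safe(payload))          # but the normalized form does
```

A filter that inspected only the raw string would pass this payload, and a later NFKC step would then turn it into markup.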

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
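"Validate and throw on error" can be as small as a strict decode at the trust boundary. A minimal sketch, assuming a decoder that rejects non-shortest forms (Python 3's built-in UTF-8 codec does); `decode_utf8_strict` is an illustrative name.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")


print(decode_utf8_strict(b"it's"))       # well-formed input passes through

try:
    decode_utf8_strict(b"\xc0\xa7")      # overlong encoding of 0x27 (apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```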

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF8 U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
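The normalization-based test cases above can be reproduced by checking a path segment after NFKC rather than before. This is a sketch only; `looks_like_traversal` is a hypothetical helper, and it does not cover the best-fit case (U+2215 has, as far as I know, no compatibility decomposition, so it only becomes a slash when a charset conversion best-fits it).

```python
import unicodedata


def looks_like_traversal(segment: str) -> bool:
    """Check a single path segment for traversal syntax after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", segment)
    return ".." in normalized or "/" in normalized or "\\" in normalized


print(looks_like_traversal("\u2025"))        # TWO DOT LEADER normalizes to ".."
print(looks_like_traversal("\uFF0Fsystem"))  # FULLWIDTH SOLIDUS normalizes to "/"
print(looks_like_traversal("report.txt"))
```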

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected - Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected - Over-consumption

Correcting insecurely rather than failing
  - Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected - Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected - Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected - Character-deletion

• Fail or error
• Use U+FFFD instead
  - A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
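The "use U+FFFD instead" option can be sketched as below. It relies on the decoder replacing only the ill-formed byte and not swallowing the byte that follows, which is how Python 3's UTF-8 codec behaves; replacing every U+FEFF (including a legitimate leading BOM) is a simplification made for this example, and `decode_untrusted` is an invented name.

```python
def decode_untrusted(raw: bytes) -> str:
    """Decode without over-consuming bytes and without deleting characters."""
    text = raw.decode("utf-8", errors="replace")   # ill-formed bytes become U+FFFD
    return text.replace("\ufeff", "\ufffd")        # mark U+FEFF instead of dropping it


print(ascii(decode_untrusted(b"\x41\xc2\x3e\x41")))            # the 3E byte survives
print(ascii(decode_untrusted("<scr\ufeffipt>".encode("utf-8"))))
```

In both cases the visible replacement character keeps the surrounding text from collapsing into new syntax such as `<script>`.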

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
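How a given framework lowercases U+0130 depends on whether it applies simple or full case mappings. Python's `str.lower()` applies the full mapping, so it illustrates the length change rather than the exact `toLower` result shown above; a framework using the simple mapping would produce a plain `i`. `lower_changes_length` is a hypothetical helper.

```python
def lower_changes_length(text: str) -> bool:
    """True if lowercasing changes the number of code points."""
    return len(text.lower()) != len(text)


dotted_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
print([f"U+{ord(ch):04X}" for ch in dotted_i.lower()])
print(lower_changes_length("scr\u0130pt"))
```

Either way, a validation step that runs before the casing operation is checking a different string than the one that is eventually used.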

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
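Several of the expansion factors from the casing and normalization tables can be reproduced with the standard library. This is an illustrative check, not a replacement for the UTR #36 figures; `expansion` is a helper name invented here, and the results depend on the Unicode data shipped with the Python runtime.

```python
import unicodedata


def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes (in bytes) after and before a transformation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


ligature = "\uFDFA"                                   # single Arabic ligature code point
nfkc = unicodedata.normalize("NFKC", ligature)
print(len(nfkc))                                      # code points after NFKC
print(expansion(ligature, nfkc, "utf-8"))             # growth in UTF-8 bytes
print(expansion("\u023A", "\u023A".lower(), "utf-8"))       # lowercasing in UTF-8
print(expansion("\u0390", "\u0390".upper(), "utf-16-le"))   # uppercasing in UTF-16
```

A buffer sized from the input length alone is too small for any of these outputs.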

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting “white space”
  - A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS U+180E

• Question specifications
• Be careful...

Root Causes: Guidance for Controlling Syntax

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2

2) Designs
  - Specs are carefully designed but not always perfect
  • This could have been a problem
    - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
    - HTML 4.01
      • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
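Why a mismatch between the declared and assumed charset matters can be seen by decoding the same bytes both ways. The byte pair below is an example picked for this sketch (it is not from the talk), and `interpretations` and `decode_forced_utf8` are hypothetical helper names.

```python
def interpretations(raw: bytes, charsets=("iso-8859-1", "shift_jis")) -> dict[str, str]:
    """Decode the same bytes under each candidate charset."""
    return {name: raw.decode(name) for name in charsets}


def decode_forced_utf8(raw: bytes) -> str:
    """Force UTF-8 and raise if the input is not valid UTF-8."""
    return raw.decode("utf-8", errors="strict")


ambiguous = b"\x83\x5c"
for charset, text in interpretations(ambiguous).items():
    print(charset, ascii(text))   # one reading ends in a backslash, the other does not

try:
    decode_forced_utf8(ambiguous)
except UnicodeDecodeError:
    print("not UTF-8: rejected")
```

If a filter and a browser disagree on which row of that output is correct, an escaping character can appear or disappear between them.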

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 12: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Unicode Crash Course

Glyph

Encoding

Properties

Code point

Block Script

Plane

A

UTF-8 UTF-16 UTF-32

Hex Uppercase etc

U+0041

Basic Latin Latin

Basic Multilingual Plane(BMP)

wwwcasabasecuritycom

March 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number represented by a hex value

wwwcasabasecuritycom

Unicode Crash CourseCode Points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

March 2009 copy 2009 Chris Weber

bull The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFF

whatrsquos up with U+D800U+DFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin i: U+0069

Latin í: U+00ED

(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

╱ U+2571 BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT
〳 U+3033 VERTICAL KANA REPEAT MARK UPPER HALF
Ꜹ U+A738 LATIN CAPITAL LETTER AV
ꜹ U+A739 LATIN SMALL LETTER AV
･ U+FF65 HALFWIDTH KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www･google･com

Halfwidth Katakana Middle Dot: U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comﾉpathﾉfile.nottrusted.org

Halfwidth Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
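Invisible characters like these can be located by their general category. The sketch below is only an illustration: it reports every format-class (Cf) character, which also covers the bidi override controls, and treats U+180E explicitly because its category differs between Unicode versions. The function name `find_invisibles` is hypothetical.

```python
import unicodedata

def find_invisibles(text: str) -> list:
    """Return (index, code point, name) for each invisible format character.

    Cf covers ZERO WIDTH SPACE, the BOM and the bidi controls; U+180E is
    listed explicitly because older Unicode versions class it as a space.
    """
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf" or ch == "\u180e":
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

if __name__ == "__main__":
    for hit in find_invisibles("www.pay\u200bpal.com\u202e"):
        print(hit)
```

Whether to reject, strip, or merely flag such characters depends on the context; a hostname check would usually reject them outright.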

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō is U+006F U+0304

ō̄ looks like it, but is U+006F U+0304 U+0304

• Order of combining marks
  – ȏ and ö under NFKC

<U+006F U+0308 U+0311> → <U+00F6 U+0311>

<U+006F U+0311 U+0308> → <U+020F U+0308>
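The ordering effect can be reproduced with Python's `unicodedata` module. This is a small sketch; the helper `codepoints` is only there to print the result readably. Both marks have the same combining class, so normalization keeps their order and composes the base letter with whichever mark comes first.

```python
import unicodedata

def codepoints(s: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

diaeresis_first = "o\u0308\u0311"  # o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
breve_first = "o\u0311\u0308"      # same marks, opposite order

nfkc_a = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_b = unicodedata.normalize("NFKC", breve_first)

if __name__ == "__main__":
    print(codepoints(nfkc_a))
    print(codepoints(nfkc_b))
    print("equal after NFKC:", nfkc_a == nfkc_b)
```

The two strings can render almost identically yet stay distinct after NFKC, so normalization alone does not remove this kind of spoof.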

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?

• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking, Best-fit mappings
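The bypass pattern can be sketched in a few lines. Everything here is illustrative: `BEST_FIT` is a hypothetical two-entry table standing in for a platform's best-fit conversion (Python's own codecs do not best-fit), and `is_allowed` is a toy blacklist, not the filter the site actually used.

```python
# Hypothetical best-fit table; the entries mirror examples from the slides.
BEST_FIT = {"\uff4d": "m", "\u03c3": "s"}

def is_allowed(css: str) -> bool:
    """Toy blacklist filter, run before any charset conversion."""
    return "-moz-binding" not in css.lower()

def best_fit_convert(text: str) -> str:
    """Stand-in for a legacy conversion that silently best-fits characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def strict_convert(text: str) -> bytes:
    """Safer alternative: refuse anything the target charset cannot hold."""
    return text.encode("ascii", errors="strict")

payload = "-\uff4doz-binding()"

if __name__ == "__main__":
    print("passes filter:", is_allowed(payload))
    print("after best-fit:", best_fit_convert(payload))
    try:
        strict_convert(payload)
    except UnicodeEncodeError as exc:
        print("strict conversion refused:", exc.reason)
```

The filter sees a harmless string, the later conversion produces the blocked one. Failing the conversion, or validating after it, closes the gap.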

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + combining dot above

Root Causes: Normalization

U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

Root Causes: Normalization

U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
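The ordering advice can be shown with a toy validator. This is a sketch under simple assumptions: `validate` is a hypothetical stand-in for real input validation, and only the standard `unicodedata.normalize` call is taken as given.

```python
import unicodedata

def validate(text: str) -> bool:
    """Toy validator: reject anything containing an angle bracket."""
    return "<" not in text and ">" not in text

def safe_check(text: str) -> bool:
    """Normalize first, then validate the form that will actually be used."""
    return validate(unicodedata.normalize("NFKC", text))

# SMALL LESS-THAN SIGN and SMALL GREATER-THAN SIGN around "script"
raw = "\ufe64script\ufe65"

if __name__ == "__main__":
    print("validate before normalizing:", validate(raw))
    print("normalized:", unicodedata.normalize("NFKC", raw))
    print("validate after normalizing:", safe_check(raw))
```

Validating the raw string lets it through; normalizing afterwards would then hand `<script>` to the next component.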

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
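"Throw on error" looks like this with Python's built-in decoder, which rejects non-shortest forms when decoding strictly. The wrapper name `is_valid_utf8` is hypothetical; the sketch only shows the fail-closed pattern.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True only if the bytes are well-formed UTF-8 (strict decoding)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True

overlong_apostrophe = b"\xc0\xa7"  # non-shortest form of 0x27
plain_apostrophe = b"\x27"

if __name__ == "__main__":
    print("C0 A7 accepted:", is_valid_utf8(overlong_apostrophe))
    print("27 accepted:", is_valid_utf8(plain_apostrophe))
```

The check belongs at the point where bytes first enter the application, before any filter looks at the text.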

How many ways can you say

Attack Vectors: Directory traversal

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 (C0 AE is the overlong form of U+002E): httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized under KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two Dot Leader U+2025, normalized under KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
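The byte sequences in these test cases can be checked against the code points they name. The sketch below uses only the standard library and shows which variants collapse under NFKC; the dictionary `VARIANTS` and the helper `nfkc` are just labels for this illustration.

```python
import unicodedata

VARIANTS = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

def nfkc(s: str) -> str:
    """Compatibility-normalize a string."""
    return unicodedata.normalize("NFKC", s)

if __name__ == "__main__":
    for name, ch in VARIANTS.items():
        utf8 = ch.encode("utf-8").hex().upper()
        print(f"{name}: UTF-8 {utf8}, NFKC gives {nfkc(ch)!r}")
```

U+2215 has no compatibility decomposition, which is why that test case relies on a best-fit conversion rather than on normalization.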

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Over-consumption

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-substitution

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected, Character-deletion

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
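The three behaviors (fail, substitute U+FFFD, delete) map directly onto the error handlers of Python's UTF-8 decoder. The wrapper names below are hypothetical; the sketch uses the `<41 C2 3E 41>` bytes from the over-consumption slide.

```python
def decode_or_fail(data: bytes) -> str:
    """Strict decoding: any ill-formed sequence raises UnicodeDecodeError."""
    return data.decode("utf-8", errors="strict")

def decode_with_replacement(data: bytes) -> str:
    """Ill-formed bytes become U+FFFD; well-formed neighbours are kept."""
    return data.decode("utf-8", errors="replace")

def decode_by_deleting(data: bytes) -> str:
    """Unsafe: silently drops ill-formed bytes (character deletion)."""
    return data.decode("utf-8", errors="ignore")

sample = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', stray lead byte, '>', 'A'

if __name__ == "__main__":
    print(repr(decode_with_replacement(sample)))
    print(decode_by_deleting(b"<scr\xc2ipt>"))
    try:
        decode_or_fail(sample)
    except UnicodeDecodeError as exc:
        print("strict decoding failed:", exc.reason)
```

With replacement the `>` after the stray lead byte survives, so nothing is over-consumed, while the deleting variant rebuilds `<script>` out of a string a filter would not have matched.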

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing
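Case mapping changing the length of a string is easy to observe. The sketch below uses Python's built-in `str.lower` and `str.upper`, which apply the full Unicode case mappings; other frameworks may use the simple mappings and give different results for the same input, so treat the exact output as specific to Python.

```python
def length_change(text: str, op) -> tuple:
    """Return (code points before, code points after) for a casing operation."""
    return len(text), len(op(text))

dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
iota_with_marks = "\u0390"   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
a_with_stroke = "\u023a"     # LATIN CAPITAL LETTER A WITH STROKE

if __name__ == "__main__":
    print(length_change(dotted_capital_i, str.lower))
    print(length_change(iota_with_marks, str.upper))
    before = len(a_with_stroke.encode("utf-8"))
    after = len(a_with_stroke.lower().encode("utf-8"))
    print("UTF-8 bytes:", before, "->", after)
```

Any buffer sized from the input length, or any filter that assumes casing is length-preserving, has to account for these cases.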

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows
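The expansion factors from the tables can be measured directly, which is a useful sanity check when sizing buffers. In this sketch the helper `expansion` is hypothetical; it compares encoded sizes before and after normalization using only the standard library.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

if __name__ == "__main__":
    cases = [
        ("\ufdfa", "NFKC", "utf-8"),
        ("\ufdfa", "NFKC", "utf-16-le"),
        ("\u1f82", "NFD", "utf-16-le"),
        ("\u0390", "NFD", "utf-8"),
    ]
    for ch, form, enc in cases:
        print(f"U+{ord(ch):04X} {form} {enc}: {expansion(ch, form, enc)}x")
```

A destination buffer sized as "input length times one" is too small for every one of these cases.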

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO

Opera White Space Characters

MVS: U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
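Why a mismatch matters can be seen by decoding the same two bytes under both charsets from the example above. This is only an illustration with Python's standard codecs; the byte pair is chosen because its second byte is 0x5C, the ASCII backslash.

```python
raw = b"\x95\x5c"  # one two-byte character in Shift_JIS

as_shift_jis = raw.decode("shift_jis")
as_latin1 = raw.decode("iso-8859-1")

def contains_backslash(text: str) -> bool:
    """Stand-in for a filter that looks for an escaping backslash."""
    return "\\" in text

if __name__ == "__main__":
    print("shift_jis:", ascii(as_shift_jis), contains_backslash(as_shift_jis))
    print("iso-8859-1:", ascii(as_latin1), contains_backslash(as_latin1))
```

A filter and a browser that disagree on the charset disagree on whether a backslash or quote is present at all, which is the reason for declaring UTF-8 explicitly and failing when the charset is uncertain.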

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security


Unicode Crash Course

Glyph:        A
Encoding:     UTF-8, UTF-16, UTF-32
Properties:   Hex, Uppercase, etc.
Code point:   U+0041
Block:        Basic Latin
Script:       Latin
Plane:        Basic Multilingual Plane (BMP)
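The same facets of a character can be inspected from Python's standard library, with the caveat that `unicodedata` exposes name and category but not block or script. The function name `describe` is hypothetical.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect code point, name, category and encoded forms of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "utf-8": ch.encode("utf-8").hex(),
        "utf-16-be": ch.encode("utf-16-be").hex(),
        "utf-32-be": ch.encode("utf-32-be").hex(),
    }

if __name__ == "__main__":
    for key, value in describe("A").items():
        print(f"{key}: {value}")
```

One code point, three byte-level representations: which one a component sees depends entirely on the encoding in use at that layer.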

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Unicode Crash Course: Code points

A = U+0041

Every character has a unique number, represented by a hex value

ſ = U+017F

Unicode Crash Course: Code Points

• The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFFF

What's up with U+D800 to U+DFFF?

Unicode Crash Course: Code points

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
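The reserved range U+D800 to U+DFFF exists so that UTF-16 can address code points above U+FFFF with two 16-bit units. A minimal sketch of the standard surrogate calculation, applied to U+101D1 (the function name `to_surrogates` is hypothetical):

```python
def to_surrogates(code_point: int) -> tuple:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000     # 20-bit value
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

if __name__ == "__main__":
    high, low = to_surrogates(0x101D1)
    print(f"U+101D1 -> {high:04X} {low:04X}")
    print(chr(0x101D1).encode("utf-16-be").hex())
```

Half of such a pair on its own is not a character, which is the "illegal code point" case listed under handling the unexpected.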

UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs

UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Exploiting Unicode-enabled Software: Overview

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: Visual Spoofing

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN – Internationalized Domain Names

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN – what do the browsers do?
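The Nameprep and Punycode steps can be tried with Python's built-in `idna` codec, which implements IDNA 2003. This is a sketch of the conversion only, not of any browser's display policy; the variable names are illustrative.

```python
# "παράδειγμα.δοκιμή" written with escapes so the exact code points are unambiguous
greek_example_test = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    ".\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)

ace_form = greek_example_test.encode("idna")  # what DNS sees

# Nameprep applies NFKC, so a fullwidth letter folds to its ASCII equivalent
fullwidth_g_host = "\uff47oogle.com"
folded = fullwidth_g_host.encode("idna")

if __name__ == "__main__":
    print(ace_form.decode("ascii"))
    print(folded.decode("ascii"))
```

The browser displays the Unicode form while DNS resolves the `xn--` form, so any spoofing check has to look at the label the user actually sees.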

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

Root Causes: IDN – so what's the problem?

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin g: U+0067

Latin ɡ: U+0261

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for Visual Spoofing

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks. Even with limited script choices, there's plenty to choose from

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can't control
  – Syntax spoofing in sub domain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected – Over-consumption

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
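The <41 C2 3E 41> case can be reproduced with a deliberately flawed decoder. `over_consuming_decode` below is a hypothetical sketch written for this example, not any real product's code; Python's own decoder with `errors="replace"` is shown for comparison.

```python
def over_consuming_decode(data: bytes) -> str:
    """Hypothetical flawed decoder that always swallows the byte after a lead byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # The flaw: the trail byte is skipped even when it was not a
            # valid continuation byte, so a following '>' silently vanishes.
            i += 2
        else:
            out.append(chr(byte))  # sketch only: other bytes treated as Latin-1
            i += 1
    return "".join(out)

sample = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', dangling lead byte, '>', 'A'

flawed = over_consuming_decode(sample)
careful = sample.decode("utf-8", errors="replace")

print(repr(flawed), repr(careful))
```

The careful decoder replaces only the dangling lead byte with U+FFFD and keeps the ">" that follows it.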

Root Causes: Handling the Unexpected – Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
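A short Python sketch of the deletion problem and the U+FFFD alternative. `blocklist_ok` is a hypothetical filter invented for the example; the point is only the difference between deleting and replacing the unexpected character.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"

def blocklist_ok(markup: str) -> bool:
    """Hypothetical filter: reject anything containing '<script'."""
    return "<script" not in markup.lower()

payload = f"<scr{BOM}ipt>"

# The filter sees no '<script' because U+FEFF splits the tag name.
passes_filter = blocklist_ok(payload)

# Deleting U+FEFF afterwards reassembles the tag.
after_deletion = payload.replace(BOM, "")

# Replacing it with U+FFFD keeps the tag name broken.
after_replacement = payload.replace(BOM, REPLACEMENT)

print(passes_filter, after_deletion, blocklist_ok(after_replacement))
```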

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier:

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))
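How a case mapping changes string length can be seen directly in Python, which applies Unicode full case mappings. Other runtimes and locales may map İ differently (for example straight to "i", as on the slide), so treat the exact output here as specific to Python.

```python
dotted_capital_i = "\u0130"           # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE

lowered = dotted_capital_i.lower()    # Python yields 'i' followed by U+0307
code_points = [f"U+{ord(c):04X}" for c in lowered]

# Upper-casing can expand as well: U+0390 becomes three code points.
upper_sample = "\u0390".upper()
length_changed = len("\u0390") != len(upper_sample)

print(code_points, len(dotted_capital_i), len(lowered))
print(len(upper_sample), length_changed)
```

Any buffer sized from the input length, or any validation run before the case change, has to account for this growth.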

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160
NFC         16, 32      3X       U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
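Two rows of these tables can be recomputed in Python as a sanity check. The helper names `utf8_len` and `utf16_units` are made up for this sketch; the samples are the ones listed in the tables.

```python
import unicodedata

def utf8_len(text: str) -> int:
    """Number of bytes in the UTF-8 encoding."""
    return len(text.encode("utf-8"))

def utf16_units(text: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding."""
    return len(text.encode("utf-16-le")) // 2

# NFKC/NFKD row: U+FDFA expands to an 18-code-point Arabic phrase.
sample = "\uFDFA"
expanded = unicodedata.normalize("NFKC", sample)
utf8_factor = utf8_len(expanded) / utf8_len(sample)
utf16_factor = utf16_units(expanded) / utf16_units(sample)

# Lower-casing row: U+023A (2 bytes in UTF-8) maps to U+2C65 (3 bytes).
lower_sample = "\u023A"
lower_factor = utf8_len(lower_sample.lower()) / utf8_len(lower_sample)

print(utf8_factor, utf16_factor, lower_factor)
```

A buffer allocated as "input size" rather than "input size times the worst-case factor" is where the overflow comes from.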

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain


Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security

www.casabasecurity.com


Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points: U+0000 to U+10FFFF

A = U+0041

Every character has a unique number, represented by a hex value.

A: U+0041
ſ: U+017F

• The full 21-bit range is not actually available: U+0000 to U+D7FF and U+E000 to U+10FFFF
• What’s up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1

Unicode Crash Course: Encodings

UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
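As a worked example of the three encodings, here is U+101D1 (the code point from the surrogate-pair slide) encoded in Python. `to_surrogates` is a helper written for this sketch; it applies the standard UTF-16 surrogate arithmetic.

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into a UTF-16 high/low surrogate pair."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low

ch = chr(0x101D1)

utf8 = ch.encode("utf-8")        # variable width: 4 bytes for this code point
utf16 = ch.encode("utf-16-be")   # two 16-bit units: a surrogate pair
utf32 = ch.encode("utf-32-be")   # fixed width: the code point itself

high, low = to_surrogates(0x101D1)
print(len(utf8), utf16.hex(), utf32.hex(), hex(high), hex(low))
```

The surrogate range U+D800 to U+DFFF is reserved for exactly this purpose, which is why it is not available for characters.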


Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A U+10001 U+10300

Root Causes: IDN – Internationalized Domain Names

http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
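The Nameprep plus Punycode step can be tried with Python's built-in `idna` codec, which implements IDNA 2003. This is a sketch of the conversion only, not of any browser's display policy.

```python
# The example domain from the slide, as the user would type or see it.
unicode_domain = "παράδειγμα.δοκιμή"

# The idna codec applies Nameprep to each label, then Punycode-encodes it
# and adds the "xn--" prefix.
ascii_domain = unicode_domain.encode("idna").decode("ascii")

print(ascii_domain)
```

The ASCII form is what goes out in the DNS query; whether the user is shown the Unicode form or the `xn--` form is the browser policy decision the bullets above describe.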

Root Causes: IDN – so what’s the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Safari

Opera

www.google.com is not www.gooɡle.com
  Latin g: U+0067
  Latin ɡ: U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  – Inclusion-based: Deny-all default seems to be the right concept
  – Script limitations: A script can cross many blocks. Even with limited script choices there’s plenty to choose from
  – Character limitations: Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can’t control
  – Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
  Latin m: U+006D
  Latin rn: U+0072 U+006E

Are you using mono-width fonts?
  0 and O
  1 and l
  5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
  User thinks ‘a’. Really it’s Latin small letter Alpha ‘ɑ’.
www.looĸout.net
  User thinks ‘k’. Really it’s Latin letter kra ‘ĸ’.

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
  User thinks ‘o’. Really it’s Thai digit zero ‘๐’.
www.faϲebook.com
  User thinks ‘c’. Really it’s Greek lunate sigma symbol ‘ϲ’.
www.Ꮐoogle.com
  Really it’s Cherokee letter Nah ‘Ꮐ’.

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
  User thinks ‘abc’. Really it’s Cyrillic script.
www.ігѕ.gov
  User thinks ‘irs’. Really it’s Cyrillic script.
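A rough way to see the difference between the three confusable classes is to list the scripts a label draws from. The sketch below is a heuristic only: Python's standard library does not expose the Unicode Script property, so `scripts_used` (a hypothetical helper) guesses the script from the first word of each character name.

```python
import unicodedata

def scripts_used(label: str) -> set[str]:
    """Heuristic: approximate each character's script by the first word of its name."""
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ignore ASCII digits, hyphens and dots
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts

mixed = scripts_used("fa\u03F2ebook")          # Latin plus Greek lunate sigma
thai = scripts_used("g\u0E50\u0E50gle")        # Latin plus Thai digit zero
single = scripts_used("\u0251pple")            # Latin small letter alpha
whole = scripts_used("\u0430\u042C\u0441")     # all Cyrillic

print(mixed, thai, single, whole)
```

Mixed-script labels are flagged by a check like this, but single-script and whole-script confusables report just one script, so they need a confusables table rather than a script count.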

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org
  Latin i: U+0069
  Latin í: U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn’t work anymore)

http://www.google.com／path／file.nottrusted.org
  FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org
  SOLIDUS: U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

  ╱ U+2571 Box Drawings
  〳 U+3033 Kana Repeat Mark
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

(However punctuation not required…)

http://www･google･com
  Katakana Dot: U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org
  Katakana No: U+FF89

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes there is one)

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
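Detecting the three invisibles listed above is straightforward once they are named explicitly. The sketch below is a minimal illustration in Python; `find_invisibles` and its small lookup table are assumptions for the example, not the Visual Spoofing Detection API mentioned in the talk.

```python
INVISIBLES = {
    "\u200B": "ZERO WIDTH SPACE",
    "\u180E": "MONGOLIAN VOWEL SEPARATOR",
    "\uFEFF": "ZERO WIDTH NO-BREAK SPACE",
}

def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for every listed invisible character in text."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch in INVISIBLES
    ]

hits = find_invisibles("goo\u200Bgle")
clean = find_invisibles("google")

print(hits, clean)
```

A real detector would cover many more code points than these three; the list here is limited to what the slide names.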

Attack Vectors: Visual Spoofing – Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial, Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  o looks like U+006F U+0304
  o is U+006F U+0304 U+0304

• Order of combining marks
  – ȏ and ö under NFKC
  <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  <U+006F U+0311 U+0308> becomes <U+020F U+0308>
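The ordering example can be reproduced with Python's `unicodedata` module. Both combining marks sit above the base letter and share a combining class, so normalization keeps them in the order given and composes only the first one with the base.

```python
import unicodedata

diaeresis_first = "o\u0308\u0311"   # o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
breve_first = "o\u0311\u0308"       # same marks, opposite order

norm_a = unicodedata.normalize("NFKC", diaeresis_first)
norm_b = unicodedata.normalize("NFKC", breve_first)

def as_code_points(text: str) -> list[str]:
    """Format a string as a list of U+XXXX code points."""
    return [f"U+{ord(ch):04X}" for ch in text]

print(as_code_points(norm_a))
print(as_code_points(norm_b))
print(norm_a == norm_b)
```

The two strings can render nearly identically yet remain different after normalization, so a string comparison will not treat them as the same name.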

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
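One way to screen for the two override characters is to look at each character's bidirectional class, which Python exposes through `unicodedata.bidirectional`. `has_bidi_override` is a hypothetical helper for this sketch, and the sample strings are made up.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}  # bidi classes of U+202D and U+202E

def has_bidi_override(text: str) -> bool:
    """True if the text contains an explicit directional override character."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)

spoofed = "abc\u202Edef"   # everything after U+202E is displayed right-to-left
plain = "abcdef"

print(has_bidi_override(spoofed), has_bidi_override(plain))
```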

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit p/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a p/Invoke function.

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
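The guidance amounts to failing closed when a character has no exact counterpart in the target charset. Python's codecs happen to behave this way by default, so they make a convenient illustration; this sketch does not reproduce the Windows or .NET best-fit tables, and `encodes_cleanly` is a hypothetical helper.

```python
def encodes_cleanly(text: str, codec: str = "cp1252") -> bool:
    """True only if every character has an exact mapping in the target codec."""
    try:
        text.encode(codec)  # errors="strict" is the default: no best-fit guessing
        return True
    except UnicodeEncodeError:
        return False

samples = {
    "sigma": "\u03C3",                       # would best-fit to 's'
    "prime": "\u2032",                       # would best-fit to an apostrophe
    "fullwidth m": "-\uFF4Doz-binding()",    # would best-fit to '-moz-binding()'
    "plain": "-moz-binding()",
}

results = {name: encodes_cleanly(text) for name, text in samples.items()}
print(results)
```

Rejecting the input at this point keeps the converted string identical to what the validator saw, which is the property best-fit mapping breaks.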

Case Study: Social Networking – Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

NFD: Decompose (canonical)
NFC: Decompose (canonical), Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), Recompose

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)
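The four forms can be compared side by side with Python's `unicodedata.normalize`. The sketch uses İ from the slide for the canonical forms and U+FE64 (small less-than sign) to show where the compatibility forms differ.

```python
import unicodedata

dotted_i = "\u0130"      # İ
small_lt = "\uFE64"      # ﹤ SMALL LESS-THAN SIGN

forms = ("NFD", "NFC", "NFKD", "NFKC")

# Canonical decomposition splits İ into I + U+0307; recomposition joins it again.
i_results = {form: unicodedata.normalize(form, dotted_i) for form in forms}

# Only the compatibility forms (NFKD, NFKC) turn U+FE64 into a plain '<'.
lt_results = {form: unicodedata.normalize(form, small_lt) for form in forms}

for form in forms:
    print(form, [hex(ord(c)) for c in i_results[form]], repr(lt_results[form]))
```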



Unicode Crash Course: Code Points

A = U+0041

Every character has a unique number, its code point, represented by a hex value.

Unicode Crash Course

A    U+0041

Unicode Crash Course

ſ    U+017F

Unicode Crash Course: Code points

• The full 21-bit range is not actually available. The usable ranges are:
  - U+0000 to U+D7FF, and
  - U+E000 to U+10FFFF
• What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
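As a worked example of the surrogate-pair slide, the sketch below splits U+101D1 into its UTF-16 high and low surrogates using the standard arithmetic. The function name `to_surrogates` is made up for illustration.

```python
def to_surrogates(cp: int):
    """Split a supplementary code point into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    v = cp - 0x10000                      # 20 bits remain
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


high, low = to_surrogates(0x101D1)
print(f"{high:04X} {low:04X}")            # D800 DDD1
```

This also shows why U+D800 to U+DFFF are reserved: those values only ever appear as halves of a pair.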

Unicode Crash Course: Encodings

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs

UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
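The widths listed above can be checked with a few lines of Python. This is only an illustration: the helper `widths` is hypothetical, and U+20AC is a sample character that does not appear in the slides.

```python
def widths(ch: str) -> dict:
    """Bytes needed to encode one character in each UTF (big-endian, no BOM)."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


for ch in ("A", "\u017f", "\u20ac", "\U000101d1"):
    print(f"U+{ord(ch):04X}", widths(ch))
```

UTF-8 grows from 1 to 4 bytes across the samples, UTF-16 jumps from 2 to 4 only for the supplementary character, and UTF-32 stays at 4.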

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN - Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN - what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN, to name a few
• Language settings and TLD
• Character blacklisting
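To illustrate the Nameprep and Punycode steps, the sketch below uses Python's built-in `idna` codec, which implements IDNA 2003, on the example.test name from the earlier slide. The wrapper `to_ace` is a hypothetical name.

```python
def to_ace(domain: str) -> bytes:
    """Convert a Unicode domain to its ASCII-compatible (xn--) form via IDNA 2003."""
    # The codec applies Nameprep (NFKC plus prohibited-character checks) and
    # then Punycode to every label.
    return domain.encode("idna")


ace = to_ace("παράδειγμα.δοκιμή")
print(ace.decode("ascii"))   # the form DNS sees
print(ace.decode("idna"))    # the form a browser may choose to display
```

The first print shows what is sent to DNS; whether the second form is ever shown to the user is the browser policy question the slide lists.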

Root Causes: IDN - so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

  - g: Latin, U+0067
  - ɡ: Latin, U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  - Inclusion-based: a deny-all default seems to be the right concept
  - Script limitations: a script can cross many blocks; even with limited script choices there's plenty to choose from
  - Character limitations: great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names

• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole, and Mixed-script
• Registrars can't control
  - Syntax spoofing in subdomain labels

Attack Vectors: Visual spoofing vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

"rn" can look like "m" in certain fonts

www.mullets.com is not www.rnullets.com

  - m: Latin, U+006D
  - rn: Latin, U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

  - 0 and O
  - 1 and l
  - 5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
  - User thinks 'a'
  - Really it's Latin small letter alpha 'ɑ'

www.looĸout.net
  - User thinks 'k'
  - Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
  - User thinks 'o'
  - Really it's Thai digit zero '๐'

www.faϲebook.com
  - User thinks 'c'
  - Really it's Greek lunate sigma symbol 'ϲ'

www.Ꮐoogle.com
  - Really it's Cherokee letter nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
  - User thinks 'abc'
  - Really it's Cyrillic script

www.ігѕ.gov
  - User thinks 'irs'
  - Really it's Cyrillic script
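A rough way to see the difference between the three classes is to look at which scripts a label draws from. The sketch below is a crude heuristic, not the talk's detection API: Python's standard library does not expose the Unicode Script property, so the first word of each character name stands in for it, and the function name `scripts` is hypothetical.

```python
import unicodedata


def scripts(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label."""
    found = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "GREEK LUNATE SIGMA SYMBOL" -> "GREEK"
            found.add(unicodedata.name(ch).split()[0])
    return found


print(scripts("\u0251pple"))           # single-script: Latin only
print(scripts("fa\u03f2ebook"))        # mixed-script: Latin plus Greek
print(scripts("\u0430\u042c\u0441"))   # whole-script: Cyrillic only
```

Only the mixed-script case is caught by a "more than one script" rule. The single-script and whole-script labels pass it, which is why a confusables table is still needed.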

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  - í looks like i

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

  - i: Latin, U+0069
  - í: Latin, U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

  - ╱ U+2571 Box Drawings
  - 〳 U+3033 Kana Repeat Mark
  - Ꜹ U+A738 LATIN CAPITAL AV
  - ꜹ U+A739 LATIN SMALL AV
  - ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

(However, punctuation not required…)

httpwwwgooglecom

Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

• Browser sees and displays a valid IDN
• DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

  - U+200B (ZERO WIDTH SPACE)
  - U+180E (MONGOLIAN VOWEL SEPARATOR)
  - U+FEFF (ZERO WIDTH NO-BREAK SPACE)
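A minimal check for the three invisible code points named on the slide might look like the following. The set is deliberately limited to those three, so a real detector would need a longer list, and the name `find_invisibles` is hypothetical.

```python
# Only the three code points from the slide; not an exhaustive list.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list:
    """Return (index, code point) for every listed invisible character."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLES]


print(find_invisibles("pay\u200bpal.com"))   # [(3, 'U+200B')]
```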

Attack Vectors: Visual Spoofing - Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing - Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  - U+006F U+0304 looks like U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  - ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
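Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The helper `max_mark_run` is a hypothetical name for a simple "stacked marks" check; the two normalization results are the ones given on the slide.

```python
import unicodedata


def max_mark_run(text: str) -> int:
    """Length of the longest run of consecutive combining marks."""
    longest = run = 0
    for ch in text:
        # combining() returns a nonzero class for combining marks.
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest


print(max_mark_run("o\u0304"), max_mark_run("o\u0304\u0304"))   # 1 2

# Same marks, different order: NFKC composes with whichever mark comes first.
a = unicodedata.normalize("NFKC", "o\u0308\u0311")
b = unicodedata.normalize("NFKC", "o\u0311\u0308")
print([f"U+{ord(c):04X}" for c in a])   # ['U+00F6', 'U+0311']
print([f"U+{ord(c):04X}" for c in b])   # ['U+020F', 'U+0308']
```

The two normalized strings differ even though they can render alike, so normalization alone does not make them compare equal.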

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible because of security concerns"
  - Forbidden in IDNA

  - U+202D (LEFT-TO-RIGHT OVERRIDE)
  - U+202E (RIGHT-TO-LEFT OVERRIDE)
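Rejecting the two explicit overrides is straightforward. The sketch below checks only the code points named on the slide, and the filename is an invented example of how a right-to-left override can reorder what the user sees.

```python
# The two explicit directional overrides listed above.
BIDI_OVERRIDES = {"\u202d", "\u202e"}


def has_bidi_override(text: str) -> bool:
    """True if the text contains U+202D or U+202E."""
    return any(ch in BIDI_OVERRIDES for ch in text)


# Invented example: the characters after U+202E are displayed right to left.
name = "report\u202etxt.exe"
print(has_bidi_override(name))          # True
print(has_bidi_override("report.txt"))  # False
```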

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enable code execution

  - When σ becomes s (U+03C3 GREEK SMALL LETTER SIGMA)
  - When ′ becomes ' (U+2032 PRIME)

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
  - U+2329 maps to U+003C (Left-Pointing Angle Bracket)
  - U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
  - U+015B maps to U+0073 (Latin Small Letter S With Acute)
  - U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
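The slide's advice targets .NET and Win32. As a rough analogue in Python (an assumption for illustration, not the talk's code), converting to a legacy code page with `errors="strict"` makes unmappable characters fail loudly instead of being replaced by lookalikes. The helper `to_legacy` is a hypothetical name.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising on anything it cannot represent."""
    # errors="strict" raises UnicodeEncodeError rather than substituting.
    return text.encode(charset, errors="strict")


print(to_legacy("-moz-binding()"))
try:
    to_legacy("-\uff4doz-binding()")      # fullwidth m from the case study
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

Failing here keeps the fullwidth character from silently becoming an ASCII "m" after the filter has already run.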

Case Study: Social Networking - Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings

Case Study: Social Networking - Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enable code execution

Root Causes: Normalization

  - NFD: Decompose (canonical)
  - NFC: Decompose (canonical), Recompose
  - NFKD: Decompose (compatibility)
  - NFKC: Decompose (compatibility), Recompose

Root Causes: Normalization

İ (U+0130) becomes I + combining dot above (U+0049 U+0307)

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

Root Causes: Normalization

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC is a first defense against visual spoofing
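The order-of-operations problem can be demonstrated in a few lines. The filter `is_safe` below is a deliberately naive, hypothetical check used only to show why validation has to happen after normalization.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy filter (hypothetical): reject any text containing '<'."""
    return "<" not in text


payload = "\ufe64script>"                       # U+FE64 SMALL LESS-THAN SIGN
print(is_safe(payload))                         # True: no U+003C yet
normalized = unicodedata.normalize("NFKC", payload)
print(normalized, is_safe(normalized))          # <script> False
```

Validating the raw payload and normalizing afterwards lets `<script>` through; normalizing first means the filter sees what the parser will see.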

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enable code execution

  - Application gets C0 A7
  - OS/Framework sees 27
  - Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
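The C0 A7 example can be traced by hand. The function `naive_decode_2byte` below is a deliberately unsafe, hypothetical decoder that skips the shortest-form check; Python's own strict UTF-8 decoder is shown for contrast.

```python
def naive_decode_2byte(b: bytes) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    return ((b[0] & 0x1F) << 6) | (b[1] & 0x3F)


overlong = b"\xc0\xa7"
print(hex(naive_decode_2byte(overlong)))   # 0x27, an apostrophe

try:
    overlong.decode("utf-8")               # a validating decoder throws
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A filter that looks for byte 0x27 never sees it in the input, yet a lenient decoder downstream produces it.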

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF8 U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
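Two of the test cases above rely on compatibility normalization, which can be checked directly. This is a sketch with hypothetical helper names (`canonical_path`, `is_traversal`) and a made-up path, not a complete path validator.

```python
import unicodedata


def canonical_path(path: str) -> str:
    """Apply NFKC so compatibility lookalikes collapse before any path check."""
    return unicodedata.normalize("NFKC", path)


def is_traversal(path: str) -> bool:
    """True if the normalized path contains a '..' segment."""
    return ".." in canonical_path(path).split("/")


tricky = "root\uff0f\u2025\uff0fsystem"   # fullwidth solidus, two dot leader
print(canonical_path(tricky))             # root/../system
print(is_traversal(tricky))               # True
```

U+2215 DIVISION SLASH is left unchanged by NFKC, which fits the slide listing it under best-fit rather than normalization.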

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: filter evasion, enable code execution

Root Causes: Handling the Unexpected - Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected - Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
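For comparison, a decoder that does not over-consume replaces only the stray lead byte and keeps the byte that follows. Python 3's UTF-8 decoder is used here as one example of that behavior; the wrapper name `decode_lenient` is hypothetical.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode UTF-8, replacing ill-formed bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


raw = bytes([0x41, 0xC2, 0x3E, 0x41])     # 'A', stray lead byte, '>', 'A'
print(ascii(decode_lenient(raw)))         # 'A\ufffd>A'
```

The `>` (0x3E) survives, so markup boundaries stay where the filter expected them.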

Root Causes: Handling the Unexpected - Character-substitution

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected - Character-deletion

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected - Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
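The difference between deleting and replacing can be shown with the `<scr[U+FEFF]ipt>` example. Both helper names are hypothetical; the sketch only contrasts the two strategies.

```python
def delete_bom(text: str) -> str:
    """Unsafe: silently drop U+FEFF, joining the text on either side."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer: substitute U+FFFD so the surrounding text stays split."""
    return text.replace("\ufeff", "\ufffd")


tag = "<scr\ufeffipt>"
print(delete_bom(tag))                 # <script>
print(ascii(replace_bom(tag)))         # '<scr\ufffdipt>'
```

Deletion reassembles a token the filter never saw, while replacement leaves a visible marker and no `script` keyword.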

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
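Case mapping changing string length is easy to observe. The sketch below uses Python's built-in full case mappings; other runtimes and locales can give different results (for example, a plain "i" for İ as on the slide), so treat the exact output as specific to Python. The helper `case_lengths` is a hypothetical name.

```python
def case_lengths(text: str) -> tuple:
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())


# U+0130 lowercases to "i" + U+0307 in Python: 1 code point becomes 2.
print(case_lengths("\u0130"))
print([f"U+{ord(c):04X}" for c in "\u0130".lower()])

# U+0390 uppercases to three code points, the 3X factor cited from UTR 36.
print(case_lengths("\u0390"))
```

Any buffer sized from the length before casing, or any check run before casing, is working with the wrong string.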

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     U+023A Ⱥ
Lower      16, 32     1       U+0041 A
Upper      8, 16, 32  3       U+0390 ΐ

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
NFC        16, 32  3X      U+FB2C
NFD        8       3X      U+0390 ΐ
NFD        16, 32  4X      U+1F82 ᾂ
NFKC/NFKD  8       11X     U+FDFA
NFKC/NFKD  16, 32  18X     U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
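The worst-case NFKC row of the normalization table can be reproduced for U+FDFA. The helper `expansion` is a hypothetical name; it reports growth both in code points and in UTF-8 bytes.

```python
import unicodedata


def expansion(ch: str, form: str) -> tuple:
    """Return (code point factor, UTF-8 byte factor) for normalizing one character."""
    out = unicodedata.normalize(form, ch)
    return len(out) / len(ch), len(out.encode("utf-8")) / len(ch.encode("utf-8"))


print(expansion("\ufdfa", "NFKC"))   # (18.0, 11.0)
```

One code point becomes 18, and 3 UTF-8 bytes become 33, matching the 18X and 11X factors. A buffer sized from the input length would overflow.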

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2

2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

March 2009 copy 2009 Chris Weber

bull The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFF

whatrsquos up with U+D800U+DFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

╱ U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

･ U+FF65 KATAKANA MIDDLE DOT

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API

– Detects Confusables

– Detects Invisibles

– Detects syntax and punctuation lookalikes

– Detects combining mark tricks

• Currently in testing

• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
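The three invisibles listed on this slide can be flagged with a simple scan. The sketch below is a minimal illustration, not the Visual Spoofing Detection API from the talk: it checks an explicit set taken from the slide and, as an added assumption, also flags anything in the general category Cf (format characters). The function name is hypothetical.

```python
import unicodedata

# Code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for i, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

print(find_invisibles("goo\u200bgle"))
print(find_invisibles("google"))
```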

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304

ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks

– ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
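Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The sketch below shows that the two orderings stay distinct after normalization, and adds a hypothetical helper, `has_repeated_mark`, that flags a combining mark applied twice in a row.

```python
import unicodedata

def codepoints(s):
    """Render a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

def has_repeated_mark(text):
    """True if the same combining mark appears twice in a row after NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )

first = unicodedata.normalize("NFKC", "o\u0308\u0311")   # diaeresis, then inverted breve
second = unicodedata.normalize("NFKC", "o\u0311\u0308")  # inverted breve, then diaeresis

print(codepoints(first))
print(codepoints(second))
print(first == second)
print(has_repeated_mark("o\u0304\u0304"))
```

Both marks have the same combining class, so normalization does not reorder them and the two strings never compare equal even though they can render alike.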

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– “These characters are to be avoided wherever possible because of security concerns”

– forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
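One way to spot the override characters above is to look at each character's bidirectional class. This is a minimal sketch with a hypothetical function name; the set of classes treated as suspicious (overrides, embeddings, isolates and their terminators) is an assumption that goes slightly beyond the two code points on the slide.

```python
import unicodedata

# Bidi classes of explicit directional formatting characters.
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def find_bidi_controls(text):
    """Return (index, bidi class) for every explicit directional control."""
    return [
        (i, unicodedata.bidirectional(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI
    ]

print(find_bidi_controls("abc\u202edef"))
print(find_bidi_controls("abcdef"))
```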

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C: Left-Pointing Angle Bracket

• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?

• U+015B maps to U+0073: Latin Small Letter S With Acute

• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:

[MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
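The guidance above amounts to failing loudly instead of letting a conversion pick a lookalike. Python's built-in codecs do not perform Windows-style best-fit mapping, so the sketch below only illustrates the "error rather than substitute" policy when converting to a legacy charset. The function name and the choice of cp1252 are assumptions for illustration.

```python
def to_legacy_charset(text, charset="cp1252"):
    """Encode strictly: raise instead of silently substituting a lookalike."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(
            f"U+{ord(bad):04X} has no exact mapping in {charset}"
        ) from exc

print(to_legacy_charset("-moz-binding()"))
try:
    to_legacy_charset("-\uff4doz-binding()")  # FULLWIDTH LATIN SMALL LETTER M
except ValueError as err:
    print("rejected:", err)
```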

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root Cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking, Best-fit mappings

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + U+0307 (combining dot above)

U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 becomes U+003C

Root Causes: Normalization

toNFKC(“﹤script>”) = “<script>”

Root Causes: Normalization

Normalize strings before validation

NFKC first defense against Visual spoofing

Root Causes: Guidance for Normalization
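The "normalize before validation" guidance can be sketched with Python's `unicodedata` module. The blocklist and the `validate` function are hypothetical and only illustrate the ordering: the filter sees the same string that later stages will see.

```python
import unicodedata

FORBIDDEN = set("<>\"'&")  # illustrative blocklist, not a complete XSS filter

def validate(user_input):
    """Normalize first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", user_input)
    if any(ch in FORBIDDEN for ch in normalized):
        raise ValueError("markup characters present after normalization")
    return normalized

# U+FE64 SMALL LESS-THAN SIGN folds to U+003C under NFKC.
print(unicodedata.normalize("NFKC", "\ufe64script>"))
# U+0130 decomposes to U+0049 U+0307 under NFD.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "\u0130")])
```

Validating the raw input and normalizing afterwards would let `\ufe64script>` through the filter and turn it into `<script>` later.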

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
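"Validate UTF-8 encoding (throw on error)" can be shown with Python's strict decoder, which rejects the overlong sequence C0 A7 from the previous slide instead of interpreting it as 0x27. The wrapper name is hypothetical.

```python
def decode_utf8_strict(raw):
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

print(repr(decode_utf8_strict(b"\x27")))  # the shortest form of the apostrophe
try:
    decode_utf8_strict(b"\xc0\xa7")       # overlong form of the same character
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```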

How many ways can you say

Attack Vectors: Directory traversal

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF8 U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
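The normalization-based test cases above can be checked in a few lines: decode strictly, fold with NFKC, then look for a parent-directory segment. This is a sketch with a hypothetical function name. It does not cover the best-fit case, because U+2215 DIVISION SLASH has no compatibility decomposition and is left unchanged by NFKC.

```python
import unicodedata

def looks_like_traversal(raw_path):
    """Decode strictly, fold compatibility characters, then look for '..'."""
    text = raw_path.decode("utf-8", errors="strict")  # rejects overlong forms
    folded = unicodedata.normalize("NFKC", text)
    return ".." in folded.split("/")

# U+FF0F (EF BC 8F) folds to "/", U+2025 (E2 80 A5) folds to "..".
payload = b"root\xef\xbc\x8f\xe2\x80\xa5\xef\xbc\x8fsystem"
print(looks_like_traversal(payload))
print(looks_like_traversal(b"root/system"))
```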

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected: Over-consumption
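For contrast with the over-consuming behavior above, the sketch below feeds the same four bytes to Python's UTF-8 decoder. In lenient mode it replaces only the stray lead byte C2 and keeps the following 3E; in strict mode it refuses the input. This illustrates the safe behavior only and says nothing about how any particular browser or framework decodes.

```python
raw = bytes([0x41, 0xC2, 0x3E, 0x41])

# Lenient decoding: only the ill-formed byte is replaced with U+FFFD,
# and the ">" (0x3E) that follows it survives.
lenient = raw.decode("utf-8", errors="replace")
print(ascii(lenient))

# Strict decoding: the whole input is rejected.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("rejected at byte offset", err.start)
```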

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected: Character-deletion

• Fail or error

• Use U+FFFD instead

– A common alternative is ‘?’ which can be safe

Root Causes: Solutions for Handling the Unexpected
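The "use U+FFFD instead" advice can be sketched as a replace-don't-delete pass. The function names are hypothetical, and treating a mid-string U+FEFF as unexpected is an assumption made for this illustration; an application may prefer to reject the input outright.

```python
REPLACEMENT = "\ufffd"
UNEXPECTED = {"\ufeff"}  # assumption: a BOM is never wanted mid-string here

def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and the last two code points of every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def neutralize(text):
    """Replace unexpected code points with U+FFFD rather than deleting them."""
    return "".join(
        REPLACEMENT if ch in UNEXPECTED or is_noncharacter(ord(ch)) else ch
        for ch in text
    )

print(ascii(neutralize("<scr\ufeffipt>")))
```

Because the unexpected character is replaced rather than removed, the two halves of the tag name never join up into `<script>`.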

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Casing
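Both casing hazards can be reproduced in Python. Python applies the full Unicode case mappings, so lowercasing İ yields two code points rather than the single “i” shown on the slide (which matches a simple, one-to-one mapping). The keyword filter below is a hypothetical example of why casing has to happen before validation.

```python
def is_blocked(tag):
    """Case-fold first, then compare against the prohibited keyword."""
    return tag.upper() == "SCRIPT"

# U+017F LATIN SMALL LETTER LONG S uppercases to a plain "S".
print("\u017fcript".upper())
# U+212A KELVIN SIGN lowercases to a plain "k".
print("\u212a".lower())

# Case mapping can change the length of a string.
for sample in ("\u0130", "\u00df", "\u0390"):
    print(f"U+{ord(sample):04X}", len(sample),
          len(sample.lower()), len(sample.upper()))
```

A filter that compares the raw input against "script" and uppercases it afterwards would miss `\u017fcript`, which is why the comparison above runs on the case-mapped form.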

• Incorrect assumptions about string sizes (chars vs bytes)

• Improper width calculations

• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
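The factors in the table can be checked by comparing encoded lengths before and after normalization. The helper below is a hypothetical sketch; it uses the little-endian codecs so that no byte order mark is counted.

```python
import unicodedata

def expansion(sample, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(sample.encode(encoding))
    after = len(unicodedata.normalize(form, sample).encode(encoding))
    return after / before

print(expansion("\ufdfa", "NFKC", "utf-8"))
print(expansion("\ufdfa", "NFKC", "utf-16-le"))
print(expansion("\u1f82", "NFD", "utf-16-le"))
print(expansion("\u0390", "NFD", "utf-8"))
```

A buffer sized from the input length alone is too small whenever one of these operations runs on attacker-controlled text.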

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting “white space”

– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform, case, and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8

• Error if uncertain

Root Causes: Guidance for Charset Mismatches
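A mismatch like the one on the previous slide can be caught before a response is sent. The sketch below is a simplified illustration with hypothetical function names: it extracts the first `charset=` token from the HTTP header value and from the HTML head using a regular expression, which is far less thorough than a real HTML parser.

```python
import re

_CHARSET = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)

def declared_charset(text):
    """Return the first charset label found in the text, lowercased, or None."""
    match = _CHARSET.search(text)
    return match.group(1).lower() if match else None

def charsets_conflict(content_type_header, html_head):
    """True when both places declare a charset and the labels differ."""
    header = declared_charset(content_type_header)
    meta = declared_charset(html_head)
    return header is not None and meta is not None and header != meta

header = "text/html; charset=ISO-8859-1"
meta = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
print(charsets_conflict(header, meta))
```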

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda


• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security

Page 17: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Unicode Crash Course

ſ U+017F

• The full 21-bit range is not actually available

U+0000 to U+D7FF and

U+E000 to U+10FFFF

what’s up with U+D800 to U+DFFF?

Unicode Crash Course: Code points

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
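The gap at U+D800 to U+DFFF exists because UTF-16 reserves those values for surrogate pairs. The sketch below, with hypothetical function names, computes the pair for the slide's sample code point U+101D1 and checks it against Python's own UTF-16 encoder.

```python
def is_scalar_value(cp):
    """True for code points that can actually be encoded (no surrogates)."""
    return 0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF

def to_surrogates(cp):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

high, low = to_surrogates(0x101D1)
print(f"{high:04X} {low:04X}")
print("\U000101d1".encode("utf-16-be").hex())
```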

UTF-8

– Variable width, 1 to 4 bytes (used to be 6)

UTF-16

– Endianness

– Variable width, 2 or 4 bytes

– Surrogate pairs

UTF-32

– Endianness

– Fixed width, 4 bytes

– Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda


• Unicode crash course

• Root Causes

– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches

• Tools

Exploiting Unicode-enabled Software: Overview

• Over 100,000 assigned characters

• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: Visual Spoofing

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN - Internationalized Domain Names

• IDNA 2003

• Nameprep (NFKC and prohibit)

• Punycode

– http://xn--hxajbheg2az3al.xn--jxalpdlp

• Whitelist TLDs

– .ORG, .DE, .CN to name a few

• Language settings and TLD

• Character blacklisting

Root Causes: IDN - what do the browsers do?
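The Nameprep and Punycode steps listed above can be tried with Python's built-in `idna` codec, which implements IDNA 2003. The wrapper names are hypothetical; the sample hostname and its expected ASCII form are the ones shown on these slides.

```python
def to_ascii(hostname):
    """Apply IDNA 2003 (Nameprep, then Punycode) to each label of a hostname."""
    return hostname.encode("idna").decode("ascii")

def to_unicode(hostname):
    """Reverse the conversion for display purposes."""
    return hostname.encode("ascii").decode("idna")

wire_form = to_ascii("παράδειγμα.δοκιμή")
print(wire_form)                # what DNS sees
print(to_unicode(wire_form))    # what the browser displays
```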

• Divergent browser implementations

• Confusables exist

• IDNA and Nameprep based on Unicode 3.2

– We’re up to Unicode 5.1 (larger repertoire)

Root Causes: IDN - so what’s the problem?

Some browsers allow .COM IDNs

based on script family

– (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin U+0067

Latin U+0261

• Normalize with NFKC

• Homograph and Confusables detection

• Specifications

– IDNA, Stringprep

• Guidance

– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for Visual Spoofing

ICANN guidelines v2.0

– Inclusion-based

– Script limitations

– Character limitations

Registries apply the guidance

– define the allowed characters per TLD

– Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0

– Inclusion-based

– Script limitations

– Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks. Even with limited script choices there’s plenty to choose from

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow

– Confusables

– Combining marks

– Single, Whole, and Mixed-script

• Registrars can’t control

– Syntax spoofing in sub domain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks

• Confusables

• Invisibles

• Problematic font-rendering

• Manipulating Combining Marks

• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com

Latin U+006D

Latin U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe

Root Causes: Solutions for Handling the Unexpected

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (the buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion

<a href="java[U+FEFF]script:alert("XSS")>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert("XSS")>

Case Study: Apple and Mozilla

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
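
A hedged illustration of the casing slides in Python. Python applies the full Unicode case mappings, so lowercasing U+0130 yields "i" followed by U+0307 rather than the plain "i" shown on the slide (the plain "i" is what a runtime using the simple one-to-one mapping returns). The helper utf8_growth is an assumption added for the example.

```python
def utf8_growth(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after and before a transformation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


dotted_capital_i = "\u0130"
word = "scr\u0130pt"

# Lowercasing changes the number of code points.
print(len(word), len(word.lower()))

# Uppercasing U+0390 produces three code points.
print([f"U+{ord(ch):04X}" for ch in "\u0390".upper()])

# Lowercasing U+023A grows the UTF-8 encoding from 2 to 3 bytes.
print(utf8_growth("\u023a", "\u023a".lower()))
```

Any buffer sized from the length of the input before a casing operation can therefore be too small afterwards.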

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
           16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
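
The expansion factors in the table can be checked with Python's unicodedata module. The function expansion is a helper assumed for this sketch; it compares encoded lengths before and after normalization for a single sample character.

```python
import unicodedata


def expansion(form: str, ch: str, encoding: str) -> float:
    """Encoded size after normalization divided by the size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


samples = [
    ("NFC", "\U0001d160", "utf-8"),
    ("NFC", "\ufb2c", "utf-16-le"),
    ("NFD", "\u0390", "utf-8"),
    ("NFD", "\u1f82", "utf-16-le"),
    ("NFKC", "\ufdfa", "utf-8"),
    ("NFKC", "\ufdfa", "utf-32-le"),
]
for form, ch, encoding in samples:
    print(form, f"U+{ord(ch):04X}", encoding, expansion(form, ch, encoding))
```

A buffer for normalized output sized from the input length needs to allow for the worst case of the chosen form and encoding.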

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO: Opera White Space Characters

Case Study: Opera

MVS: U+180E

• Question specifications
• Be careful...

Root Causes: Guidance for Controlling Syntax

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches


• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security


• The full 21-bit range is not actually available
  - U+0000 to U+D7FF and U+E000 to U+10FFFF
  - What's up with U+D800 to U+DFFF?

Unicode Crash Course: Code points
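
The two usable ranges are the Unicode scalar values; the gap between them is reserved for UTF-16 surrogates. A minimal Python sketch, where is_scalar_value and SCALAR_VALUE_COUNT are names invented for this example:

```python
def is_scalar_value(cp: int) -> bool:
    """True for code points that may appear in well-formed Unicode text."""
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


# Size of the two ranges combined: everything up to U+10FFFF
# minus the 2048 surrogate code points U+D800 to U+DFFF.
SCALAR_VALUE_COUNT = (0x10FFFF + 1) - (0xDFFF - 0xD800 + 1)

print(is_scalar_value(0xD7FF), is_scalar_value(0xD800), is_scalar_value(0xE000))
print(SCALAR_VALUE_COUNT)
```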

Unicode Crash Course: UTF-16 Surrogate Pairs

U+101D1
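
As a worked example for U+101D1, the surrogate pair can be computed by hand and checked against Python's UTF-16 codec. The function to_surrogates is a helper assumed for this sketch.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


high, low = to_surrogates(0x101D1)
print(f"{high:04X} {low:04X}")
print(chr(0x101D1).encode("utf-16-be").hex())
```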

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings
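
The width rules above can be observed directly in Python. The helper widths and the choice of sample characters are assumptions for illustration; the little-endian codec names are used so that no byte order mark is added to the counts.

```python
def widths(ch: str) -> dict:
    """Encoded length in bytes of one character in each encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for ch in ("A", "\u00e9", "\u20ac", "\U000101d1"):
    print(f"U+{ord(ch):04X}", widths(ch))
```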


• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Exploiting Unicode-enabled Software: Overview

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: Visual Spoofing

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN - Internationalized Domain Names

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN - what do the browsers do?
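
The Punycode form on the slide can be reproduced with Python's built-in "idna" codec, which implements IDNA 2003 with Nameprep. This is a sketch; the wrapper name to_ascii is an assumption, and the Greek labels are written as escapes so that the exact code points are unambiguous.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


# The Greek "example.test" domain shown earlier in the deck.
greek = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)
print(to_ascii(greek))
```

The browser displays the Unicode form while DNS only ever sees the ASCII form, which is what makes lookalike labels effective.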

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Root Causes: IDN - so what's the problem?

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin g: U+0067

Latin ɡ: U+0261

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for Visual Spoofing

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks. Even with limited script choices, there's plenty to choose from

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole, and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com

Latin m: U+006D

Latin rn: U+0072 U+006E

Are you using mono-width fonts?
  - 0 and O
  - 1 and l
  - 5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com
  - User thinks 'a'
  - Really it's Latin small letter alpha 'ɑ'

www.looĸout.net
  - User thinks 'k'
  - Really it's Latin small letter kra 'ĸ'

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
  - User thinks 'o'
  - Really it's Thai digit zero '๐'

www.faϲebook.com
  - User thinks 'c'
  - Really it's Greek lunate sigma symbol 'ϲ'

www.Ꮐoogle.com
  - Really it's Cherokee letter nah 'Ꮐ'

Attack Vectors: Mixed-script and The Confusables

www.аЬс.com
  - User thinks 'abc'
  - Really it's Cyrillic script

www.ігѕ.gov
  - User thinks 'irs'
  - Really it's Cyrillic script

Attack Vectors: Whole-script and The Confusables
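
A rough Python sketch of mixed-script detection. The standard library does not expose the Unicode Script property, so this example assumes the first word of each character name is a good enough proxy for the script; the function name scripts_in and that heuristic are illustrative only and not how a production detector should work.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the set of scripts used by the letters of a label."""
    found = set()
    for ch in label:
        if ch.isalpha():
            # Heuristic: names look like "LATIN SMALL LETTER A".
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


mixed = "fa\u03f2ebook"          # Greek lunate sigma among Latin letters
whole = "\u0430\u042c\u0441"     # all Cyrillic, reads like "abc"

print(scripts_in(mixed))
print(scripts_in(whole))
```

The mixed-script label is flagged because it yields two scripts, while the whole-script label yields only one and passes, which is why whole-script confusables need a separate confusables table.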

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don't necessarily, but...

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin i: U+0069

Latin í: U+00ED

(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with and lookalikes

(However, punctuation not required...)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom

Katakana Dot: U+FF65

(However, punctuation not required...)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO: IDN Visual Spoofing

• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing: Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing: Problematic Font-rendering

··· is ··· (Arial, Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  - o looks like U+006F U+0304
  - o is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  - ȏ and ö under NFKC
  - <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  - <U+006F U+0311 U+0308> becomes <U+020F U+0308>
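
The effect of combining mark order can be reproduced with Python's unicodedata module. The helper composed is an assumption for this sketch; it composes a sequence and reports the resulting code points, using NFC, which agrees with NFKC for these inputs.

```python
import unicodedata


def composed(code_points: list) -> list:
    """Compose a code point sequence and return it in U+XXXX notation."""
    text = "".join(chr(cp) for cp in code_points)
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# The same base letter and the same two marks, in a different order.
print(composed([0x006F, 0x0308, 0x0311]))
print(composed([0x006F, 0x0311, 0x0308]))
```

Both marks have the same combining class, so normalization does not reorder them and the two strings stay distinct even though they can render almost identically.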

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible, because of security concerns."
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
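
Explicit directional overrides can be detected through their bidirectional class. A minimal Python sketch; the function name has_bidi_override and the sample file name are assumptions for illustration.

```python
import unicodedata

# Bidi classes of U+202D and U+202E.
OVERRIDE_CLASSES = {"LRO", "RLO"}


def has_bidi_override(text: str) -> bool:
    """True if the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


# A name whose tail is displayed reversed after the RIGHT-TO-LEFT OVERRIDE.
suspicious = "report\u202etxt.exe"
print(has_bidi_override(suspicious), has_bidi_override("report.txt"))
```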

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enabling code execution

When σ becomes s
  - U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
  - U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings

Case Study: Social Networking: Best-fit mappings

-moz-binding()

was not allowed, but...

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking: Best-fit mappings

Normalizing strings after validation is dangerous

Impact: filter evasion, enabling code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ (U+0130) becomes I + combining dot above (U+0049 U+0307)

Root Causes: Normalization

But are there dangerous characters?

You bet... with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
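
The ordering problem can be sketched in Python with the standard unicodedata module. The validator is_safe is a deliberately simplistic stand-in assumed for this example, and both wrapper function names are hypothetical.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy validator (assumption): reject any less-than sign."""
    return "<" not in text


def validate_then_normalize(text: str) -> str:
    """Unsafe order: the filter runs before NFKC creates the '<'."""
    if not is_safe(text):
        raise ValueError("blocked")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Recommended order: normalize first, then validate the result."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("blocked")
    return normalized


payload = "\ufe64script>"   # SMALL LESS-THAN SIGN followed by "script>"
print(repr(validate_then_normalize(payload)))
```

Calling normalize_then_validate on the same payload raises ValueError, because the validator sees the string in the form that will actually be used.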

Non-shortest or overlong UTF-8

Impact: filter evasion, enabling code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
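
The arithmetic behind C0 A7 turning into 27 can be shown with a two-byte decoder that skips the shortest-form check. The function lenient_two_byte is a hypothetical illustration of that bug; strict_decode simply wraps Python's built-in decoder, which throws on overlong input as the guidance recommends.

```python
def lenient_two_byte(lead: int, trail: int) -> int:
    """Buggy decode of a 2-byte sequence with no shortest-form check."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def strict_decode(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8")


for lead, trail in ((0xC0, 0xA7), (0xC0, 0xAE), (0xC0, 0xAF)):
    print(f"{lead:02X} {trail:02X} -> U+{lenient_two_byte(lead, trail):04X}")

try:
    strict_decode(bytes([0xC0, 0xA7]))
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```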

How many ways can you say

Attack Vectors: Directory traversal


Page 19: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
– Visual Spoofing and IDN's
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: IDN – Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN – what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLD's
– ORG, DE, CN to name a few
• Language settings and TLD
• Character blacklisting
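The Punycode form shown on this slide can be reproduced with Python's built-in `idna` codec, which implements IDNA 2003 with Nameprep. This is only an illustration of the ToASCII step; it is not a description of what any particular browser does.

```python
# The Greek test domain from the earlier slide.
host = "παράδειγμα.δοκιμή"

# The "idna" codec applies Nameprep (including NFKC) and Punycode per label.
ascii_host = host.encode("idna")
print(ascii_host)

# The bare "punycode" codec encodes a single label without the "xn--" prefix.
print("παράδειγμα".encode("punycode"))
```

What DNS sees is the ASCII form, while the user sees whatever the browser chooses to display.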

Root Causes: IDN – so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow COM IDN's based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Opera

www.google.com is not www.gooɡle.com

Latin U+0067

Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
– Inclusion-based: deny-all default seems to be the right concept
– Script limitations: a script can cross many blocks; even with limited script choices there's plenty to choose from
– Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names

• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can't control
– Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D

Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URL's

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com

User thinks 'a'

Really it's Latin small letter Alpha 'ɑ'

www.looĸout.net

User thinks 'k'

Really it's Latin letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com

User thinks 'o'

Really it's Thai digit zero '๐'

www.faϲebook.com

User thinks 'c'

Really it's Greek lunate sigma symbol 'ϲ'

www.Ꮐoogle.com

Really it's Cherokee letter Nah 'Ꮐ'
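A rough way to flag the mixed-script cases above is to look at which scripts appear in a label. Python's standard library has no Script property lookup, so this sketch approximates the script by the first word of each character's Unicode name. That heuristic, and the function name `rough_scripts`, are assumptions for illustration; a real detector would use the Unicode Scripts data and the confusables tables.

```python
import unicodedata


def rough_scripts(label):
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        # Skip ASCII digits and label punctuation, which are shared by all scripts.
        if (ch.isascii() and ch.isdigit()) or ch in "-.":
            continue
        # Heuristic (assumption): the first word of the name, e.g. "LATIN", "THAI".
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


for label in ("google", "g\u0e50\u0e50gle", "fa\u03f2ebook", "\u13c0oogle", "\u0251pple"):
    found = rough_scripts(label)
    print(label, sorted(found), "MIXED" if len(found) > 1 else "single script")
```

Note that the single-script case (Latin small letter alpha in place of 'a') is not caught by a script check at all, which is why confusable detection is listed separately from script limitations.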

Attack Vectors: Whole-script and The Confusables

www.аЬс.com

User thinks 'abc'

Really it's Cyrillic script

www.ігѕ.gov

User thinks 'irs'

Really it's Cyrillic script

Attack Vectors: IDN homograph attacks

Browsers whitelist ORG

Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

• ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes í looks like i

www.mozilla.org is not www.mozílla.org

Latin U+0069

Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

╱ U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

(However punctuation not required…)

httpwwwgooglecom

Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
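A minimal sketch of an invisibles check, limited to the three code points named on this slide. The table `INVISIBLES` and the function `find_invisibles` are hypothetical names for illustration; a fuller check would cover other format and zero-width characters as well.

```python
# Only the three characters listed on the slide; not an exhaustive list.
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text):
    """Return (index, code point, name) for each listed invisible character."""
    return [
        (i, f"U+{ord(ch):04X}", INVISIBLES[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]


print(find_invisibles("pay\u200bpal.com"))  # one hit at index 3
print(find_invisibles("paypal.com"))        # no hits
```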

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304

ō̄ is U+006F U+0304 U+0304

• Order of combining marks
– ȏ and ö under NFKC

<U+006F U+0308 U+0311> → <U+00F6 U+0311>

<U+006F U+0311 U+0308> → <U+020F U+0308>
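The two orderings on this slide can be checked with Python's `unicodedata` module. This is only a verification sketch; the helper `code_points` is an illustrative name.

```python
import unicodedata


def code_points(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# Same base letter, same two marks, different order.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")

print(code_points(first))   # U+00F6 U+0311
print(code_points(second))  # U+020F U+0308
print(first == second)      # False: the strings differ even if they may render alike
```

Both marks have the same combining class, so normalization does not reorder them, and the two strings stay distinct after NFKC.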

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
– "These characters are to be avoided wherever possible because of security concerns"
– Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
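One way to spot the explicit directional controls is through the Bidi_Class values exposed by Python's `unicodedata.bidirectional`. The function name `bidi_controls` and the choice to also flag embeddings (LRE, RLE) and PDF are assumptions for this sketch; the slide itself names only the two overrides.

```python
import unicodedata

# Bidi classes of the explicit embedding/override controls and their terminator.
EXPLICIT_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def bidi_controls(text):
    """Return (index, bidi class) for each explicit directional control in text."""
    return [
        (i, unicodedata.bidirectional(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_CLASSES
    ]


print(bidi_controls("abc\u202edef"))  # [(3, 'RLO')]
print(bidi_controls("abcdef"))        # []
```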

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous API's

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Windows best-fit pInvoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation API's
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
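The guidance above names .NET and Win32 options. As a loose analogue in Python (an assumption for illustration, not the APIs on the slide), a strict conversion to a legacy charset fails loudly instead of silently substituting a lookalike. The function name `to_legacy` is hypothetical.

```python
def to_legacy(text, charset="cp1252"):
    """Convert to a legacy charset, refusing any character it cannot represent."""
    try:
        # errors="strict" raises instead of substituting a replacement character.
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start:exc.end]
        raise ValueError(f"no exact mapping for {bad!r} in {charset}") from exc


print(to_legacy("plain ascii"))
for ch in ("\u03c3", "\u2032", "\uff4d"):
    try:
        to_legacy(ch)
    except ValueError as err:
        print("rejected:", err)
```

Failing on unmappable input keeps the validated string and the converted string identical in meaning, which is the point of disabling best-fit behavior.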

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: Best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
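The bypass in this case study can be simulated in a few lines. Everything here is a hypothetical stand-in: `BEST_FIT` is a one-entry table imitating a best-fit conversion, and `naive_filter` is not the site's real filter, which the talk does not show.

```python
# Hypothetical one-entry best-fit table: FULLWIDTH LATIN SMALL LETTER M -> "m".
BEST_FIT = {"\uff4d": "m"}


def naive_filter(css):
    """Return True if the input passes a simple blacklist check."""
    return "-moz-binding" not in css.lower()


def best_fit(text):
    """Simulate a later charset conversion that best-fits unmappable characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


payload = "-\uff4doz-binding()"
print(naive_filter(payload))            # True: the filter sees no blacklisted token
print(best_fit(payload))                # -moz-binding()
print(naive_filter(best_fit(payload)))  # False: the token appears after conversion
```

The filter and the conversion each behave as designed; the problem is only the order in which they run.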

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇

U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against Visual spoofing
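The ordering problem can be shown with Python's `unicodedata.normalize`. The check `is_safe` is a deliberately simplistic, hypothetical validator used only to make the point about ordering.

```python
import unicodedata


def is_safe(text):
    """Toy validator (assumption): reject anything containing a '<'."""
    return "<" not in text


raw = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN

# Wrong order: validate first, normalize afterwards.
print(is_safe(raw))                                   # True, the check passes
print(unicodedata.normalize("NFKC", raw))             # <script>

# Right order: normalize first, then validate the result.
normalized = unicodedata.normalize("NFKC", raw)
print(is_safe(normalized))                            # False, the check now fails

# Canonical decomposition from the earlier slide: U+0130 -> U+0049 U+0307
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")  # True
```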

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
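To make the C0 A7 example concrete, this sketch contrasts a decoder that skips the shortest-form check with Python's strict UTF-8 codec. The function `naive_decode_2byte` is a deliberately insecure illustration, not code from any real framework.

```python
def naive_decode_2byte(b1, b2):
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (insecure)."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))


def strict_decode(data):
    """Decode with Python's UTF-8 codec, which rejects overlong forms."""
    return data.decode("utf-8")  # raises UnicodeDecodeError on ill-formed input


print(repr(naive_decode_2byte(0xC0, 0xA7)))  # "'" (U+0027), smuggled past a byte filter

try:
    strict_decode(bytes([0xC0, 0xA7]))
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```

Throwing on the ill-formed sequence is the "validate UTF-8 encoding" step from the guidance above.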

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF8 U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
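The UTF-8 byte sequences in these test cases, and what NFKC does to each character, can be confirmed with a short script. The helper name `describe` is an illustrative assumption.

```python
import unicodedata


def describe(ch):
    """Return (UTF-8 bytes as hex, NFKC result) for one character."""
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)


for ch in ("\uff0f", "\u2215", "\u2025"):
    utf8_hex, nfkc = describe(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: UTF-8 {utf8_hex}, NFKC {nfkc!r}")
```

FULLWIDTH SOLIDUS and TWO DOT LEADER turn into `/` and `..` under compatibility normalization, while DIVISION SLASH has no decomposition and only becomes `/` through a best-fit mapping.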

Root Causes: Handling the Unexpected

• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src="[0xC2]"> onerror=alert(1)<br />

becomes

<img src="> onerror=alert(1)<br />
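The `<41 C2 3E 41>` case can be reproduced with a toy decoder that swallows the byte after a lead byte without checking it. The function `over_consuming_decode` is a deliberately buggy sketch written for this example, handling only 2-byte sequences; it does not reproduce any specific product's decoder.

```python
def over_consuming_decode(data):
    """Toy decoder for 1- and 2-byte sequences with a deliberate over-consumption bug."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:  # 2-byte lead byte
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG (on purpose): the next byte is consumed even when it is not
            # a valid trailing byte, so it silently disappears.
            i += 2
            continue
        out.append(chr(b))  # sketch: everything else is treated as one byte
        i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])
print(over_consuming_decode(data))                   # AA  (the '>' was eaten)
print(data.decode("utf-8", errors="replace"))        # A\ufffd>A  ('>' preserved)
```

Python's codec replaces only the ill-formed lead byte and keeps the `>`, which is the safe behavior.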

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
– A common alternative is '?', which can be safe
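The difference between deleting an unexpected character and replacing it with U+FFFD is easy to demonstrate. The three function names below are hypothetical, and `filter_ok` is a toy blacklist standing in for whatever validation ran earlier.

```python
def delete_bom(text):
    """Insecure handling: silently drop U+FEFF wherever it appears."""
    return text.replace("\ufeff", "")


def replace_bom(text):
    """Safer handling: substitute U+FFFD so the surrounding text cannot fuse."""
    return text.replace("\ufeff", "\ufffd")


def filter_ok(text):
    """Toy blacklist (assumption): reject input containing '<script'."""
    return "<script" not in text.lower()


payload = "<scr\ufeffipt>"
print(filter_ok(payload))               # True: passes the filter as submitted
print(delete_bom(payload))              # <script>
print(filter_ok(replace_bom(payload)))  # True: still not a script tag
```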

Attack Vectors: Filter evasion

• Bypass filters, WAF's, NIDS and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))
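Length changes under case conversion are easy to observe in Python, which applies the full Unicode case mappings. Note that Python lowercases İ to "i" followed by U+0307 rather than to a bare "i" as on the slide; which result you get depends on the implementation and locale. The helper name `case_lengths` is an assumption for this sketch.

```python
def case_lengths(ch):
    """Return (original length, lowercased length, uppercased length) in code points."""
    return len(ch), len(ch.lower()), len(ch.upper())


samples = {
    "\u0130": "LATIN CAPITAL LETTER I WITH DOT ABOVE",
    "\u00df": "LATIN SMALL LETTER SHARP S",
    "\u0390": "GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS",
}
for ch, name in samples.items():
    print(f"U+{ord(ch):04X} {name}: {case_lengths(ch)}")
```

Any buffer sized from the input length before casing can therefore be too small afterwards.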

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and API's
– ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation    UTF       Factor   Sample
NFC          8         3X       U+1D160
NFC          16, 32    3X       U+FB2C
NFD          8         3X       ΐ U+0390
NFD          16, 32    4X       ᾂ U+1F82
NFKC/NFKD    8         11X      ﷺ U+FDFA
NFKC/NFKD    16, 32    18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
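Some of the factors in this table can be recomputed by comparing encoded sizes before and after normalization. The function name `expansion` is an illustrative assumption; the sample characters are the ones listed in the table.

```python
import unicodedata


def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


print(expansion("\ufdfa", "NFKC", "utf-8"))      # 11.0
print(expansion("\ufdfa", "NFKC", "utf-16-le"))  # 18.0
print(expansion("\u0390", "NFD", "utf-8"))       # 3.0
print(expansion("\u1f82", "NFD", "utf-16-le"))   # 4.0
```

Sizing an output buffer as input size times the worst-case factor for the chosen form and encoding avoids the overflow described above.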

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and API's
– ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting "white space"
– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
– IDNA/Nameprep based on Unicode 3.2

2) Designs
– Specs are carefully designed but not always perfect

• This could have been a problem
– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
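Why a mismatch matters: the same bytes mean different things under the two charsets from the earlier slide. The byte pair below is an arbitrary example chosen for illustration, and `decode_both` is a hypothetical helper.

```python
def decode_both(data):
    """Decode the same bytes under the two charsets a page might be labeled with."""
    return data.decode("iso-8859-1"), data.decode("shift_jis")


as_latin1, as_sjis = decode_both(b"\x82\xa0")
print(len(as_latin1), repr(as_latin1))  # 2 characters under ISO-8859-1
print(len(as_sjis), repr(as_sjis))      # 1 character (HIRAGANA LETTER A) under Shift_JIS
```

A filter that inspects the input under one charset while the browser renders it under the other is validating a different string from the one the user sees, which is why the guidance is to force a single encoding.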

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 20: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUTF-16 Surrogate Pairs

U+101D1

March 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible because of security concerns"
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
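One way to screen input for the two override characters is to ask `unicodedata` for each character's bidirectional class. A hedged sketch; the helper name and the sample file name are made up for illustration.

```python
import unicodedata

def find_bidi_overrides(text):
    """Return (index, 'U+XXXX') for LEFT-TO-RIGHT and RIGHT-TO-LEFT OVERRIDE characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in ("LRO", "RLO")
    ]

# A name that may display as "invoiceexe.jpg" in a bidi-aware UI.
print(find_bidi_overrides("invoice\u202egpj.exe"))
```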

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enable code execution

When σ (U+03C3 GREEK SMALL LETTER SIGMA) becomes s
When ′ (U+2032 PRIME) becomes '

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
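The guidance above names .NET and Win32 settings. As a rough Python analogue (an assumption for illustration, not the author's tooling), the built-in `cp1252` codec never best-fits, so encoding strictly and surfacing the failure has a similar effect.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode strictly; refuse characters that have no exact mapping in the target charset."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(f"no exact mapping for U+{ord(bad):04X} in {charset}") from exc

print(to_legacy_bytes("-moz-binding()"))
try:
    to_legacy_bytes("-\uff4doz-binding()")   # FULLWIDTH LATIN SMALL LETTER M
except ValueError as err:
    print(err)
```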

Case Study: Social Networking - Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

-moz-binding()
was not allowed, but...
-[U+FF4D]oz-binding()
would best-fit map
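The bypass can be simulated in a few lines. The mapping table below is hypothetical and contains only the pairs mentioned on the slides (real best-fit tables are far larger), and `filter_allows` is a stand-in for the site's actual filter, which is not described in the talk.

```python
# Hypothetical best-fit table, limited to examples from the slides.
BEST_FIT = {
    "\uff4d": "m", "\u2329": "<", "\u3008": "<",
    "\u015b": "s", "\u015d": "s", "\u03c3": "s",
}

def best_fit(text):
    """Apply the illustrative best-fit table character by character."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def filter_allows(text):
    """Naive blacklist standing in for the real filtering logic."""
    return "-moz-binding" not in text

payload = "-\uff4doz-binding()"
print(filter_allows(payload))            # checked before conversion
print(best_fit(payload))                 # what a later best-fit conversion produces
print(filter_allows(best_fit(payload)))  # checked after conversion
```

Running the filter on the converted string, instead of the original one, is what closes the gap in this sketch.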

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enable code execution

NFD  - Decompose (canonical)
NFC  - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307

But are there dangerous characters?
You bet... with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"
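The ordering problem can be checked with `unicodedata.normalize`. In this sketch `is_safe` is a deliberately simplistic, hypothetical validator used only to show why the order of the two steps matters.

```python
import unicodedata

def is_safe(text):
    """Toy validator: reject anything containing U+003C."""
    return "<" not in text

payload = "\ufe64script>"   # starts with SMALL LESS-THAN SIGN, not '<'

validated_first = is_safe(payload)                        # passes the filter
normalized = unicodedata.normalize("NFKC", payload)       # then becomes markup
normalized_first = is_safe(unicodedata.normalize("NFKC", payload))

print(validated_first, normalized, normalized_first)
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "\u0130")])
```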

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
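To make the C0 A7 example concrete, the sketch below contrasts a deliberately broken two-byte decoder (hypothetical, written only to show the flaw) with Python's built-in strict UTF-8 decoder, which throws on the overlong form.

```python
def naive_decode_2byte(data):
    """Deliberately flawed: decodes a 2-byte sequence without the shortest-form check."""
    return chr(((data[0] & 0x1F) << 6) | (data[1] & 0x3F))

def strict_decode(data):
    """Validate UTF-8 and raise UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")

overlong = b"\xc0\xa7"
print(repr(naive_decode_2byte(overlong)))   # an apostrophe appears out of nowhere
try:
    strict_decode(overlong)
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```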

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
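The byte sequences in the test cases above can be verified, and the normalization-based cases reproduced, with the standard library. `has_traversal` is a hypothetical check that normalizes with NFKC before looking for `..` segments; note that U+2215 is untouched by NFKC, which matches the slide's point that it relies on best-fit conversion instead.

```python
import unicodedata

def canonical_path(path):
    """Fold compatibility lookalikes (NFKC) before any traversal check."""
    return unicodedata.normalize("NFKC", path)

def has_traversal(path):
    """True if the normalized path contains a '..' segment."""
    return ".." in canonical_path(path).split("/")

for cp in ("\uff0f", "\u2215", "\u2025"):
    print(f"U+{ord(cp):04X}", cp.encode("utf-8").hex().upper(),
          repr(unicodedata.normalize("NFKC", cp)))

print(has_traversal("root\uff0f\u2025\uff0fsystem"))
```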

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enable code execution

Root Causes: Handling the Unexpected - Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
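A sketch of the <41 C2 3E 41> case. The first function is a deliberately flawed decoder written for illustration (it only models two-byte lead bytes); the comparison uses Python's own UTF-8 decoder with `errors="replace"`, which substitutes U+FFFD for the stray lead byte and keeps the byte that follows.

```python
def over_consuming_decode(data):
    """Deliberately flawed: a 2-byte lead byte swallows the next byte, valid or not."""
    out, i = [], 0
    while i < len(data):
        if 0xC2 <= data[i] <= 0xDF:
            i += 2                      # drops the lead byte AND its neighbour
        else:
            out.append(chr(data[i]))
            i += 1
    return "".join(out)

raw = bytes.fromhex("41C23E41")
print(repr(over_consuming_decode(raw)))                 # the '>' is gone
print(repr(raw.decode("utf-8", errors="replace")))      # the '>' survives
```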

Root Causes: Handling the Unexpected - Character-substitution

Correcting insecurely rather than failing
  – Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected - Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is '' which can be safe
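The deletion problem and the U+FFFD alternative, sketched with a toy filter. `filter_allows` is hypothetical and far simpler than any real filter; the point is only that deleting a character after validation can assemble a blocked token, while substituting U+FFFD cannot.

```python
BOM = "\ufeff"

def filter_allows(markup):
    """Toy filter: block anything containing '<script'."""
    return "<script" not in markup.lower()

def delete_bom(markup):
    """Insecure handling: silently delete U+FEFF."""
    return markup.replace(BOM, "")

def replace_bom(markup):
    """Safer handling: substitute U+FFFD REPLACEMENT CHARACTER."""
    return markup.replace(BOM, "\ufffd")

payload = "<scr\ufeffipt>"
print(filter_allows(payload), delete_bom(payload))
print(filter_allows(replace_bom(payload)), replace_bom(payload))
```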

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
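Casing results differ between implementations and locales, so the toLower("İ") == "i" case above will not reproduce everywhere. As one data point, Python's `str.lower()` applies Unicode full case mapping and returns i followed by U+0307, which still illustrates that length can change. `case_lengths` is a hypothetical helper.

```python
def case_lengths(text):
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())

for sample in ("\u0130", "\u0390", "\u00df"):
    print(f"U+{ord(sample):04X}", case_lengths(sample))

# Lowercase first, then validate the result that will actually be used.
lowered = "scr\u0130pt".lower()
print([f"U+{ord(c):04X}" for c in lowered])
```

Validating the lowercased string, and sizing buffers from its length rather than the input's, follows the guidance on the slide.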

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
NFC        16, 32  3X      U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     U+FDFA
NFKC/NFKD  16, 32  18X     U+FDFA

Source: Unicode Technical Report 36
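The normalization factors in the table can be spot-checked by comparing encoded sizes before and after normalization. A sketch using only the standard library; `expansion` is a hypothetical helper and the result depends on the Unicode data shipped with the Python build.

```python
import unicodedata

def expansion(ch, form, encoding="utf-8"):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

samples = [
    ("\U0001d160", "NFC", "utf-8"),
    ("\u0390", "NFD", "utf-8"),
    ("\u1f82", "NFD", "utf-16-le"),
    ("\ufdfa", "NFKC", "utf-8"),
    ("\ufdfa", "NFKC", "utf-16-le"),
]
for ch, form, enc in samples:
    print(f"U+{ord(ch):04X} {form} {enc}: {expansion(ch, form, enc)}X")
```

A buffer sized from the input length must allow for the worst-case factor of the operation applied, or be sized from the output length instead.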

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: interpreting "white space"
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful...
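One defensive reading of "be careful" is to flag any separator, format, or control character that the filter does not explicitly treat as white space. The allow-list below is an assumption for illustration, not the HTML specification's definition, and `unexpected_separators` is a hypothetical helper.

```python
import unicodedata

# Assumption: the characters this filter knowingly treats as white space.
ALLOWED_WS = {" ", "\t", "\n", "\r", "\f"}
# General categories that a parser might treat as a separator or ignore.
SUSPECT = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def unexpected_separators(markup):
    """Return (index, 'U+XXXX') for separator-like characters outside the allow-list."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(markup)
        if ch not in ALLOWED_WS and unicodedata.category(ch) in SUSPECT
    ]

print(unexpected_separators("<a href=\u180eonclick=alert()>"))
```

U+180E has been classified as Zs in older Unicode versions and Cf in newer ones, which is why both categories are in the suspect set.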

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
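A small sketch of "error if uncertain": compare the charset declared in the HTTP header with the one declared in a meta element, and treat a missing header or a disagreement as a failure. Both function names and the regular expressions are illustrative assumptions; a production check would use a real HTML parser.

```python
import re

def declared_charsets(content_type_header, html):
    """Extract the charset from the header value and from the first <meta ... charset=...>."""
    header = re.search(r"charset=([\w-]+)", content_type_header, re.I)
    meta = re.search(r"<meta[^>]+charset=[\"']?([\w-]+)", html, re.I)
    return (
        header.group(1).lower() if header else None,
        meta.group(1).lower() if meta else None,
    )

def charsets_agree(content_type_header, html):
    """False when the header has no charset or the meta element contradicts it."""
    header, meta = declared_charsets(content_type_header, html)
    return header is not None and (meta is None or meta == header)

page = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
print(declared_charsets("text/html; charset=ISO-8859-1", page))
print(charsets_agree("text/html; charset=ISO-8859-1", page))
```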

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Unicode Crash Course: Encodings

UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
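The widths listed above are easy to confirm by encoding a few sample characters. A minimal sketch; `widths` is a hypothetical helper, and the big-endian codec names are chosen so that no byte order mark is added to the output.

```python
def widths(ch):
    """Encoded size in bytes of one character in each UTF."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}

print(widths("A"))            # ASCII range
print(widths("\u20ac"))       # a BMP character outside ASCII
print(widths("\U0001d160"))   # a supplementary character

# The supplementary character is a surrogate pair in UTF-16.
print("\U0001d160".encode("utf-16-be").hex())
```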


Exploiting Unicode-enabled Software: Overview

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

Root Causes: IDN - Internationalized Domain Names

http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN - what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
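The Nameprep and Punycode steps can be observed with Python's built-in `idna` codec, which implements the IDNA 2003 ToASCII operation. This is a sketch for inspecting what DNS will see, not a description of any browser's display policy; `to_ascii` is a hypothetical wrapper.

```python
def to_ascii(domain):
    """IDNA 2003 ToASCII (Nameprep + Punycode) using Python's built-in 'idna' codec."""
    return domain.encode("idna").decode("ascii")

print(to_ascii("παράδειγμα.δοκιμή"))
print(to_ascii("example.test"))
```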

Root Causes: IDN - so what's the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Safari

Opera

www.google.com is not www.gooɡle.com
Latin U+0067 vs. Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  – Inclusion-based: deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks. Even with limited script choices, there's plenty to choose from
  – Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole, and Mixed-script
• Registrars can't control
  – Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
Latin U+006D vs. Latin U+0072 U+006E

Are you using mono-width fonts?
0 and O
1 and l
5 and S

Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'

www.looĸout.net
User thinks 'k'
Really it's Latin letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'

www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'

www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script

www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
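Mixed-script labels like the ones above can be flagged with a rough heuristic. Python's standard library does not expose the Unicode Script property, so this sketch approximates it with the first word of each character's Unicode name; that approximation, and both function names, are assumptions. Single-script and whole-script confusables pass this check and need confusables data to catch.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts in a label from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isascii():
            if ch.isalpha():
                found.add("LATIN")
            continue                      # ASCII digits, hyphen, dot carry no script
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_mixed_script(label):
    return len(scripts_in(label)) > 1

for label in ("fa\u03f2ebook", "g\u0e50\u0e50gle", "\u13c0oogle",
              "\u0430\u042c\u0441", "\u0251pple"):
    print(sorted(scripts_in(label)), is_mixed_script(label))
```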

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but...

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i

www.mozilla.org is not www.mozílla.org
Latin U+0069 vs. Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom
Katakana Dot U+FF65
(However, punctuation not required...)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation not required...)

Browser sees and displays a valid IDN
DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root cause: Interpreting “white space”
– A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax
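One way to be careful is to flag any separator-like or format character that the filter did not explicitly expect, instead of trusting a fixed idea of white space. The sketch below is an assumption-laden illustration: the allow-list `EXPECTED` and the function name are hypothetical, and the general category of U+180E has changed between Unicode versions, so both space and format categories are checked.

```python
import unicodedata

# Hypothetical allow-list: the only separators this filter expects to see.
EXPECTED = {" ", "\t", "\n", "\r", "\f"}

# General categories that can behave like separators or be invisible.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def unexpected_separators(text: str) -> list:
    """Return (index, code point) pairs for separator-like characters not in EXPECTED."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch not in EXPECTED and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(unexpected_separators("href=\u180Eonclick=alert()"))
```

Rejecting such input outright is usually safer than trying to guess how a downstream parser will treat it.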

1) Character stability
– IDNA/Nameprep based on Unicode 3.2

2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
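"Error if uncertain" can be sketched as a check that refuses to proceed when the HTTP header and the meta tag disagree. The function `resolve_charset` is hypothetical, and Python's codec aliases are used here only as a stand-in for a real charset label registry, which may resolve some labels differently.

```python
import codecs

def resolve_charset(header_charset, meta_charset):
    """Return a canonical codec name only when every declaration agrees."""
    names = set()
    for label in (header_charset, meta_charset):
        if label is None:
            continue
        try:
            names.add(codecs.lookup(label).name)  # canonicalize aliases such as "utf8"
        except LookupError:
            raise ValueError(f"unknown charset label: {label!r}")
    if len(names) != 1:
        raise ValueError(f"missing or conflicting charset declarations: {sorted(names)}")
    return names.pop()

print(resolve_charset("UTF-8", "utf8"))
try:
    resolve_charset("ISO-8859-1", "shift_jis")
except ValueError as err:
    print("rejected:", err)
```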

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security


• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools

Exploiting Unicode-enabled Software: Overview

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: Visual Spoofing
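Lookalikes are easy to tell apart once the code points are printed instead of the glyphs. A minimal Python sketch, with `describe` as a hypothetical helper name:

```python
import unicodedata

def describe(text: str) -> list:
    """List each character as 'U+XXXX NAME' so lookalikes can be told apart."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

# Latin A, Greek Alpha, Cyrillic A: three different code points, near-identical glyphs.
for line in describe("\u0041\u0391\u0410"):
    print(line)
```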

http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – Internationalized Domain Names

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN – what do the browsers do?
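The Nameprep and Punycode steps can be observed with Python's built-in `idna` codec, which implements IDNA 2003. This is a sketch of the conversion only; the helper names are illustrative, and browsers layer their own display policy on top of it.

```python
def to_ascii(hostname: str) -> str:
    """Apply IDNA 2003 ToASCII (Nameprep, then Punycode) to each label."""
    return hostname.encode("idna").decode("ascii")

def to_unicode(hostname: str) -> str:
    """Decode xn-- labels back to their Unicode form."""
    return hostname.encode("ascii").decode("idna")

# The example.test hostname from the IDN slide, written with escapes.
greek = ("\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
         ".\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae")
ascii_form = to_ascii(greek)
print(ascii_form)
print(to_unicode(ascii_form) == greek)
```

The wire form seen by DNS is always the ASCII one; what the user sees depends on whether the browser chooses to display the Unicode form.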

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We’re up to Unicode 5.1 (larger repertoire)

Root Causes: IDN – so what’s the problem?

Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com
Latin U+0067
Latin U+0261

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for Visual Spoofing

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices there’s plenty to choose from.
Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing.

• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub-domain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com
Latin U+006D
Latin U+0072 U+006E

Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter Alpha ‘ɑ’

www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’

www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Mixed-script and The Confusables
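Mixed-script labels like these can be flagged with a rough heuristic. Python's standard library does not expose the Unicode Script property, so the sketch below approximates it from the first word of each character name. That shortcut is an assumption of this example, not a substitute for real script data, and it does nothing against single-script confusables such as ‘ɑ’ or ‘ĸ’.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the set of scripts in `label` from Unicode character names."""
    found = set()
    for ch in label:
        if not ch.isalnum():
            continue  # skip dots, hyphens and other punctuation
        name = unicodedata.name(ch, "")
        first_word = name.split(" ")[0] if name else ""
        if first_word and first_word != "DIGIT":  # ASCII digits are script-neutral
            found.add(first_word)
    return found

for label in ["google", "fa\u03f2ebook", "g\u0e50\u0e50gle", "\u13c0oogle"]:
    scripts = scripts_in(label)
    verdict = "mixed" if len(scripts) > 1 else "single"
    print(sorted(scripts), verdict)
```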

www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script

www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script

Attack Vectors: Whole-script and The Confusables

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
í looks like i

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org
Latin U+0069
Latin U+00ED

(This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www･google･com
Katakana Dot U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN
DNS sees Punycode

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing: Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing: Problematic Font-rendering

··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
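The ordering effect can be confirmed with Python's `unicodedata.normalize`. Both marks have the same combining class, so normalization does not reorder them, and the two sequences stay distinct even though they can render alike. The helper name `code_points` is illustrative only.

```python
import unicodedata

def code_points(text: str) -> list:
    """Render a string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]

diaeresis_first = "o\u0308\u0311"  # o, combining diaeresis, combining inverted breve
breve_first = "o\u0311\u0308"      # same marks, opposite order

for sequence in (diaeresis_first, breve_first):
    normalized = unicodedata.normalize("NFKC", sequence)
    print(code_points(sequence), "becomes", code_points(normalized))
```

A comparison that relies on normalization alone will therefore treat these two strings as different.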

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible, because of security concerns”
– Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
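Explicit directional controls can be found by their bidirectional class, without hard-coding individual code points. A minimal sketch using Python's `unicodedata`; the set name and function name are assumptions of this example.

```python
import unicodedata

# Bidirectional classes of explicit embedding, override and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def bidi_controls(text: str) -> list:
    """Return the code points of explicit bidi control characters found in `text`."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES
    ]

print(bidi_controls("invoice\u202Etxt.exe"))  # contains RIGHT-TO-LEFT OVERRIDE
print(bidi_controls("plain-name.txt"))
```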

Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
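The policy behind this guidance is to fail instead of substituting a lookalike. Python's codecs do not perform best-fit mapping, so the sketch below only illustrates the fail-closed behavior one would want from any conversion to a legacy charset. The function name and the choice of cp1252 are assumptions of this example.

```python
def to_legacy(text: str, codec: str = "cp1252") -> bytes:
    """Encode for a legacy consumer, raising instead of substituting characters."""
    return text.encode(codec, errors="strict")

print(to_legacy("caf\u00e9"))  # representable in cp1252, so it encodes

for sample in ["\uFF4Doz-binding", "\u2032", "\u03c3"]:
    try:
        to_legacy(sample)
    except UnicodeEncodeError as err:
        print("rejected:", f"U+{ord(sample[err.start]):04X}")
```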

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root cause: Best-fit mappings

Case Study: Social Networking: Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Case Study: Social Networking: Best-fit mappings

Normalizing strings after validation is dangerous
Impact: Filter evasion, enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization

﹤ becomes <
toNFKC("﹤script>") = "<script>"
U+FE64 becomes U+003C

Root Causes: Normalization

Normalize strings before validation
NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
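The order of operations can be sketched in Python. The filter `is_safe` below is a deliberately naive, hypothetical check; the point is only that it must run on the normalized string, not on the raw input.

```python
import unicodedata

def is_safe(text: str) -> bool:
    """Hypothetical filter: reject anything containing angle brackets."""
    return "<" not in text and ">" not in text

def validate(text: str) -> str:
    """Normalize first, then validate the normalized form and return it."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("markup characters present after normalization")
    return normalized

payload = "\uFE64script\uFE65"  # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print(is_safe(payload))                          # raw input slips past the filter
print(unicodedata.normalize("NFKC", payload))    # what a later stage would see
try:
    validate(payload)
except ValueError as err:
    print("rejected:", err)
```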

Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
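"Throw on error" looks like this with Python's strict UTF-8 decoder, which rejects overlong forms and encoded surrogates. The wrapper name `decode_utf8_strict` is illustrative only.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

print(repr(decode_utf8_strict(b"\x27")))  # the shortest form of U+0027

for raw in (b"\xc0\xa7", b"\xed\xa0\x80"):  # overlong U+0027, half a surrogate pair
    try:
        decode_utf8_strict(raw)
    except UnicodeDecodeError as err:
        print("rejected", raw.hex(), "-", err.reason)
```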

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8 U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors

• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected: Over-consumption
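For contrast, a decoder that does not over-consume replaces only the ill-formed lead byte and leaves the following byte intact. Python's UTF-8 decoder with `errors="replace"` behaves this way; the wrapper name `decode_lenient` is an assumption of this sketch.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode UTF-8, replacing each ill-formed sequence with U+FFFD."""
    return raw.decode("utf-8", errors="replace")

raw = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', lone lead byte, '>', 'A'
decoded = decode_lenient(raw)
print([f"U+{ord(ch):04X}" for ch in decoded])
print(">" in decoded)  # the '>' after the bad lead byte survives
```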

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected: Character-deletion

• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
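The difference between deleting and replacing can be shown in a few lines of Python. Both helper names are hypothetical; the sketch assumes the filter ran before the cleanup step, which is the dangerous ordering.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"

def delete_bom(text: str) -> str:
    """Unsafe cleanup: silently drops U+FEFF, joining the text around it."""
    return text.replace(BOM, "")

def replace_bom(text: str) -> str:
    """Safer cleanup: substitutes U+FFFD so the surrounding text stays split."""
    return text.replace(BOM, REPLACEMENT)

payload = "<scr\uFEFFipt>"
print(delete_bom(payload))                  # the tag is reassembled
print("script" in replace_bom(payload))     # the tag stays broken
```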

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
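The length change is easy to observe in Python, with one caveat: Python's locale-independent `str.lower()` maps U+0130 to "i" followed by U+0307, not to a bare "i" as in the toLower example, so results differ across platforms and locales. The helper name below is illustrative only.

```python
def length_changes_on_lower(text: str) -> bool:
    """True when lowercasing changes the number of code points."""
    return len(text) != len(text.lower())

dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
print([f"U+{ord(ch):04X}" for ch in dotted_capital_i.lower()])
print(length_changes_on_lower("scr\u0130pt"))
print(length_changes_on_lower("SCRIPT"))
```

Either outcome supports the same guidance: apply the casing operation first, then validate and size buffers from the result.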

• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Guidance for Casing


Page 23: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareOverview

March 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

wwwcasabasecuritycom

Root CausesVisual Spoofing

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks 'o'. Really it's Thai digit zero '๐'.

www.faϲebook.com
User thinks 'c'. Really it's Greek lunate sigma symbol 'ϲ'.

www.Ꮐoogle.com
User thinks 'G'. Really it's Cherokee letter Nah 'Ꮐ'.

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks 'abc'. Really it's Cyrillic script.

www.ігѕ.gov
User thinks 'irs'. Really it's Cyrillic script (the letters і, г, ѕ).
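The single-, mixed-, and whole-script cases above can be told apart with a rough script check. The sketch below is only an illustration: Python's standard library does not expose the Unicode Script property, so it uses the first word of each character name as a stand-in, and the function names `scripts_in` and `classify` are made up for this example. Note that single-script confusables such as 'ɑ' pass this check, so it is not a complete defense.

```python
import unicodedata

def scripts_in(label):
    """Return the rough set of scripts used by letters and digits in a label."""
    scripts = set()
    for ch in label:
        if not ch.isalnum():
            continue
        if ch.isascii():
            # ASCII letters count as Latin; ASCII digits are script-neutral.
            if ch.isalpha():
                scripts.add("LATIN")
            continue
        # Assumption: the first word of the character name approximates the script.
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def classify(label):
    """Label a string as mixed-script, whole-script non-Latin, or Latin only."""
    found = scripts_in(label)
    if len(found) > 1:
        return "mixed-script"
    if found and found != {"LATIN"}:
        return "whole-script (non-Latin)"
    return "single-script Latin"
```

For example, `classify("faϲebook")` reports a mixed-script label, while `classify("ɑpple")` still reports Latin only, which is why a confusables table is needed in addition to script checks.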

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG. Others don't necessarily, but…

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org

i: Latin U+0069
í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom
Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However, punctuation is not required…)

http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN.

DNS sees Punycode.
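To make the browser-versus-DNS split concrete, here is a minimal sketch of what a single label looks like on the wire. It uses Python's built-in `punycode` codec and adds the `xn--` prefix by hand; it deliberately skips the Nameprep step that IDNA 2003 applies first, and the helper names `to_ace` and `from_ace` are assumptions for this example.

```python
def to_ace(label):
    """Encode one domain label in its ASCII-compatible (Punycode) form."""
    if label.isascii():
        return label  # plain ASCII labels are sent unchanged
    return "xn--" + label.encode("punycode").decode("ascii")

def from_ace(ace):
    """Decode an xn-- label back to the Unicode form a browser might display."""
    if ace.startswith("xn--"):
        return ace[4:].encode("ascii").decode("punycode")
    return ace
```

A lookalike label such as "ɑpple" and the real "apple" produce different ASCII forms, so the DNS never confuses them; only the person reading the address bar does.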

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
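A simple scan can surface these characters before a string is displayed or compared. The sketch below flags the three code points listed above plus anything in the Unicode format category (Cf); treating all of Cf as suspicious is an assumption made for illustration, and `find_invisibles` is a hypothetical helper name.

```python
import unicodedata

# The three code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for index, ch in enumerate(text):
        # Assumption: every format (Cf) character is worth flagging.
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits
```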

Attack Vectors: Visual Spoofing – Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304

but is U+006F U+0304 U+0304
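Stacked duplicates like the double macron above are easy to detect mechanically. This is a minimal sketch, assuming that any combining mark immediately repeated is suspicious; `repeated_marks` is a hypothetical helper name.

```python
import unicodedata

def repeated_marks(text):
    """Return indexes where a combining mark repeats the mark just before it."""
    hits = []
    for index in range(1, len(text)):
        ch = text[index]
        # unicodedata.combining() returns a non-zero class for combining marks.
        if unicodedata.combining(ch) and ch == text[index - 1]:
            hits.append(index)
    return hits
```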

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
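The two orderings can be checked directly with Python's `unicodedata` module. Both marks have the same combining class, so normalization does not reorder them, and the two inputs stay distinct even though they may render alike. The helper `nfkc_code_points` is a name chosen for this sketch.

```python
import unicodedata

def nfkc_code_points(text):
    """Normalize with NFKC and list the resulting code points."""
    normalized = unicodedata.normalize("NFKC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]

# o + diaeresis + inverted breve, and the same marks in the other order.
first = nfkc_code_points("o\u0308\u0311")
second = nfkc_code_points("o\u0311\u0308")
```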

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
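Outside of IDNA, an application has to look for these controls itself. The sketch below flags explicit embeddings, overrides, and isolates by their bidirectional class; the exact set of classes to reject is an assumption, and `find_bidi_controls` is a hypothetical helper name.

```python
import unicodedata

# Bidi classes of the explicit directional formatting characters.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def find_bidi_controls(text):
    """Return (index, code point) pairs for explicit bidi control characters."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES
    ]
```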


Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs.

Impact: filter evasion, enabling code execution.

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?

• U+2329 Left-Pointing Angle Bracket maps to U+003C
• U+3008 Left Angle Bracket maps to U+003C

How can we best-fit the s character?

• U+015B Latin Small Letter S With Acute maps to U+0073
• U+015D Latin Small Letter S With Circumflex maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
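The "fail instead of best-fit" policy behind this guidance can be sketched in a few lines. Python's own codecs do not perform best-fit mapping, so this example only illustrates the rejection behavior that an exception fallback or WC_NO_BEST_FIT_CHARS is meant to give you; `encode_or_reject` and the choice of `cp1252` as the legacy charset are assumptions.

```python
def encode_or_reject(text, charset="cp1252"):
    """Encode to a legacy charset, refusing any character without an exact mapping."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        # Fail loudly rather than substituting a lookalike such as 's' for sigma.
        raise ValueError(f"U+{ord(bad):04X} has no exact mapping in {charset}") from exc
```

With this policy, a fullwidth 'm' (U+FF4D) in "-ｍoz-binding" is rejected outright instead of silently becoming an ASCII 'm' after the filter has already run.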

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map.

Root Causes: Normalization

Normalizing strings after validation is dangerous.

Impact: filter evasion, enabling code execution.

• NFD: decompose (canonical)
• NFC: decompose (canonical), then recompose
• NFKD: decompose (compatibility)
• NFKC: decompose (compatibility), then recompose

İ becomes I + ◌̇

U+0130 becomes U+0049 U+0307
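The four forms can be compared side by side with Python's `unicodedata.normalize`. This sketch lists the code points each form produces; `forms` is a helper name chosen for the example.

```python
import unicodedata

def forms(text):
    """Map each normalization form to the code points it produces for text."""
    return {
        form: [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]
        for form in ("NFD", "NFC", "NFKD", "NFKC")
    }

dotted_i = forms("\u0130")       # LATIN CAPITAL LETTER I WITH DOT ABOVE
small_less = forms("\ufe64")     # SMALL LESS-THAN SIGN
```

The canonical forms leave U+FE64 alone, while the compatibility forms fold it to an ordinary less-than sign, which is what makes NFKC and NFKD interesting to an attacker.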

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing.

﹤ becomes <

U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC: first defense against visual spoofing
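The difference that ordering makes can be shown with a toy filter. Both functions below are illustrative only: the "filter" just rejects angle brackets, and the function names are invented for this sketch.

```python
import unicodedata

def validate_then_normalize(text):
    """Unsafe order: the check runs before the dangerous character exists."""
    if "<" in text or ">" in text:
        raise ValueError("angle brackets are not allowed")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text):
    """Safe order: the check sees the same string that will be stored or shown."""
    normalized = unicodedata.normalize("NFKC", text)
    if "<" in normalized or ">" in normalized:
        raise ValueError("angle brackets are not allowed")
    return normalized
```

Feeding "﹤script﹥" (U+FE64 and U+FE65) to the first function returns "<script>", while the second raises an error.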

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8.

Impact: filter evasion, enabling code execution.

• Application gets 0xC0 0xA7
• OS/framework sees 0x27
• Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
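To see why 0xC0 0xA7 is dangerous, it helps to do the arithmetic a lenient decoder would do, then compare it with a strict decoder. The sketch below uses a hand-written two-byte decode (a deliberately naive illustration, not any real library's code) next to Python's strict UTF-8 decoder, which refuses the sequence.

```python
def naive_two_byte_decode(lead, trail):
    """Combine a two-byte sequence without checking for the shortest form."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)

def decode_utf8_strict(data):
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed input."""
    return data.decode("utf-8", errors="strict")

# The overlong pair collapses to 0x27, the ASCII apostrophe.
apostrophe = naive_two_byte_decode(0xC0, 0xA7)
```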

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized under NFKC or NFKD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two Dot Leader U+2025, normalized under NFKC or NFKD: httpsiterootE280A5 E280A5 system
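The byte sequences in these test cases are simply the UTF-8 encodings of the lookalike characters, and the normalization claims can be checked the same way. The sketch below reports both for a given character; `describe` is a hypothetical helper name.

```python
import unicodedata

def describe(ch):
    """Return the UTF-8 bytes (hex) and the NFKC form of one character."""
    return {
        "utf8": ch.encode("utf-8").hex().upper(),
        "nfkc": unicodedata.normalize("NFKC", ch),
    }

fullwidth_solidus = describe("\uff0f")
division_slash = describe("\u2215")
two_dot_leader = describe("\u2025")
```

The division slash is unchanged by NFKC, which is consistent with the slide listing it under best-fit rather than normalization.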

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences.

Big problem with MBCS lead bytes.

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
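The <41 C2 3E 41> case can be reproduced with a deliberately flawed decoder and compared with a careful one. `over_consuming_decode` below is a made-up illustration of the bug (it drops a bad lead byte together with the byte that follows), not the code of any particular product; the careful path uses Python's built-in decoder with replacement.

```python
def over_consuming_decode(data):
    """Deliberately flawed: an ill-formed two-byte sequence swallows the next byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:  # lead byte of a two-byte sequence
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out += data[i:i + 2]  # well-formed pair, keep it
            # The bug: skip two bytes even when the trail byte was not a trail.
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)

def careful_decode(data):
    """Replace only the ill-formed byte with U+FFFD and keep what follows."""
    return data.decode("utf-8", errors="replace")
```

In the flawed version the ">" (0x3E) that closed the tag disappears, which is what lets the following attribute text slide into the tag.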

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
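The difference between deleting and replacing can be sketched with a toy filter. Everything here is illustrative: the "filter" only looks for the substring `<script`, and the function names are invented for this example.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"

def passes_filter(text):
    """Toy filter: reject anything containing an opening script tag."""
    return "<script" not in text.lower()

def strip_bom_unsafely(text):
    """Deleting the character joins the pieces the filter saw as harmless."""
    return text.replace(BOM, "")

def neutralize_bom(text):
    """Substituting U+FFFD keeps the pieces apart."""
    return text.replace(BOM, REPLACEMENT)
```

The payload "<scr[U+FEFF]ipt>" passes the filter as written; after deletion it is a live tag, while after replacement it is still not one.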

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
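Both points, casing before validation and casing changing string length, can be sketched briefly. The banned-word list and function names are assumptions for this example, and results for İ differ between platforms: Python's full case mapping lowercases it to "i" followed by U+0307 rather than a bare "i".

```python
def lower_then_validate(text, banned=("script",)):
    """Lowercase first, then check the exact string that will be used."""
    lowered = text.lower()
    for word in banned:
        if word in lowered:
            raise ValueError(f"banned token: {word}")
    return lowered

def growth(text):
    """Return the lengths of the original, lowercased, and uppercased strings."""
    return len(text), len(text.lower()), len(text.upper())
```

`growth("\u0130")` and `growth("\u0390")` show that neither direction of casing preserves length, which is why buffers sized from the input can be too small.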

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization: maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
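The factors in the table can be reproduced by measuring encoded length before and after normalization. This is a small sketch with a hypothetical helper, `expansion`; it uses "utf-16-le" so that Python does not add a byte order mark to the count.

```python
import unicodedata

def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

# U+FDFA expands to 18 code points under the compatibility forms.
worst_case_utf8 = expansion("\ufdfa", "NFKC", "utf-8")
worst_case_utf16 = expansion("\ufdfa", "NFKC", "utf-16-le")
```

A buffer sized to the input length, or even twice it, is therefore not enough when a string is normalized in place.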

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
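One way to "be careful" is to refuse any space-like character that the markup specification does not name. The sketch below treats the four white space characters defined by HTML 4.01 (space, tab, form feed, zero width space) plus CR and LF as expected, and flags every other separator or format character; that policy, and the name `unexpected_space_like`, are assumptions for illustration.

```python
import unicodedata

# White space characters defined by HTML 4.01, plus the line break characters.
EXPECTED = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}

def unexpected_space_like(text):
    """Flag separator or format characters a parser might treat as white space."""
    hits = []
    for index, ch in enumerate(text):
        if ch in EXPECTED:
            continue
        if unicodedata.category(ch) in {"Zs", "Zl", "Zp", "Cf"}:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits
```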

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed, but not always perfect

• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
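A server-side check for the mismatch shown earlier might look like the sketch below. It pulls the declared charset out of a Content-Type header value and a meta tag with a simple regular expression and refuses anything that is missing, inconsistent, or not UTF-8. The regular expression and the names `declared_charset` and `check_declarations` are assumptions made for this example, not a full header parser.

```python
import re

# Simplified pattern: charset=<token>, optionally quoted.
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)

def declared_charset(value):
    """Return the lowercased charset named in a header value or meta tag, if any."""
    match = CHARSET_RE.search(value)
    return match.group(1).lower() if match else None

def check_declarations(header_value, meta_tag):
    """Force UTF-8 and raise an error when the declarations are uncertain."""
    header = declared_charset(header_value)
    meta = declared_charset(meta_tag)
    if header is None or meta is None:
        raise ValueError("charset not declared")
    if header != meta or header != "utf-8":
        raise ValueError(f"charset mismatch or not UTF-8: {header} vs {meta}")
    return header
```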

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net

Casaba Security, www.casabasecurity.com


A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
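Why casing has to happen before validation can be sketched in Python. The filter `is_blocked` is hypothetical, and the long s (U+017F) is used instead of the dotted capital I because Python lowercases U+0130 to "i" plus U+0307 rather than to the bare "i" shown on the slide.

```python
def is_blocked(text: str) -> bool:
    """Hypothetical filter: lowercases, then looks for 'script'."""
    return "script" in text.lower()

# U+017F LATIN SMALL LETTER LONG S is already lowercase, so lower() keeps it.
tricky = "<\u017fcript>"
passed_filter = not is_blocked(tricky)

# A later uppercasing step maps U+017F to a plain 'S'.
after_upper = tricky.upper()

# Casing can also change the length of a string.
dotted_i_lower = "\u0130".lower()

print(passed_filter, after_upper, len(dotted_i_lower))
```

The input passes the filter, yet the uppercased result is `<SCRIPT>`; and lowercasing the single code point U+0130 yields two code points, so a buffer sized from the input length is too small.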

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
            16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      U+FDFA
            16, 32   18X      U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
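Some of the expansion factors in the tables can be checked directly with Python's `unicodedata` module. This is a sketch; `utf8_growth` and `code_point_growth` are illustrative helper names.

```python
import unicodedata

def utf8_growth(ch: str, form: str) -> float:
    """Ratio of UTF-8 byte length after normalization to the length before."""
    before = len(ch.encode("utf-8"))
    after = len(unicodedata.normalize(form, ch).encode("utf-8"))
    return after / before

def code_point_growth(ch: str, form: str) -> int:
    """Number of code points a single code point becomes after normalization."""
    return len(unicodedata.normalize(form, ch))

print(utf8_growth("\ufdfa", "NFKC"))        # U+FDFA in UTF-8
print(code_point_growth("\ufdfa", "NFKD"))  # U+FDFA in code points
print(code_point_growth("\u1f82", "NFD"))   # U+1F82 in code points
print(utf8_growth("\u0390", "NFD"))         # U+0390 in UTF-8
```

Sizing an output buffer from the input length is only safe if it is multiplied by the worst-case factor for the operation and encoding in use.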

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
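One way to be careful is to flag non-ASCII characters that a parser might treat as white space or silently ignore. The Python sketch below uses Unicode general categories; the category set and the function name `suspicious_code_points` are assumptions chosen for illustration, not a complete policy.

```python
import unicodedata

# Format controls and separators: candidates for "acts like a space" surprises.
SUSPECT_CATEGORIES = {"Cf", "Zs", "Zl", "Zp"}

def suspicious_code_points(text: str) -> list:
    """List non-ASCII code points whose category suggests invisible or space-like behaviour."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ord(ch) > 0x7F and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

sample = "<a href=\u180eonclick=alert()>"
print(suspicious_code_points(sample))
```

U+180E has been classified as a space separator in older Unicode versions and as a format character in newer ones, so the sketch checks both categories.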

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
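The risk of a header and a meta tag disagreeing is that the same bytes mean different things under each charset. The Python sketch below uses an example byte pair chosen for illustration (not taken from the talk); `decode_utf8_or_fail` is a hypothetical helper showing the "force UTF-8, error if uncertain" guidance.

```python
# Example bytes: under Shift_JIS the second byte (0x5C) is the trail byte of a
# katakana character; under ISO-8859-1 it is a literal backslash.
data = b"\x83\x5c"

as_latin1 = data.decode("iso-8859-1")
as_sjis = data.decode("shift_jis")

def decode_utf8_or_fail(raw: bytes):
    """Force UTF-8 and return None instead of guessing when the bytes are not valid."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None

print(repr(as_latin1), repr(as_sjis), decode_utf8_or_fail(data))
```

A filter reading the bytes as ISO-8859-1 sees a backslash that a user-agent reading them as Shift_JIS never sees, so the two components disagree about where syntax characters are.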

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security


Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: IDN – Internationalized Domain Names

http://παράδειγμα.δοκιμή

(example.test)

Root Causes: IDN – what do the browsers do?

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN – so what’s the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)
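The Nameprep and Punycode steps can be tried with Python's built-in `idna` codec, which implements IDNA 2003. This is a sketch; the host name is the Greek example from the slides, written with escapes so the exact code points are unambiguous.

```python
# The Greek "example.test" host name from the slide, as explicit code points.
host = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)

# ToASCII: Nameprep (NFKC, case folding, prohibited-character checks), then Punycode.
ascii_host = host.encode("idna")

# ToUnicode: what a browser may choose to display again.
round_trip = ascii_host.decode("idna")

print(ascii_host.decode("ascii"))
print(round_trip == host)
```

Each label is converted separately, so both the name and the TLD come back with the `xn--` prefix.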

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Safari

Opera

www.google.com is not www.gooɡle.com

Latin U+0067
Latin U+0261

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole, and Mixed-script
• Registrars can’t control
  – Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D
Latin U+0072 U+006E

Are you using mono-width fonts?
  – 0 and O
  – 1 and l
  – 5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
  – User thinks ‘a’
  – Really it’s Latin small letter Alpha ‘ɑ’

www.looĸout.net
  – User thinks ‘k’
  – Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
  – User thinks ‘o’
  – Really it’s Thai digit zero ‘๐’

www.faϲebook.com
  – User thinks ‘c’
  – Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
  – Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
  – User thinks ‘abc’
  – Really it’s Cyrillic script

www.ігѕ.gov
  – User thinks ‘irs’
  – Really it’s Greek script
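The three confusable classes behave differently under a simple script check. The Python sketch below approximates a character's script by the first word of its Unicode name, which is a rough assumption made for illustration (real detection would use the Unicode script property and confusables data); `rough_scripts` is a hypothetical helper.

```python
import unicodedata

def rough_scripts(label: str) -> set:
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if not ch.isalnum():
            continue
        if ch.isascii():
            scripts.add("LATIN")  # simplification: ASCII letters and digits
            continue
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

print(rough_scripts("fa\u03f2ebook"))       # mixed-script
print(rough_scripts("g\u0e50\u0e50gle"))    # mixed-script
print(rough_scripts("\u0251pple"))          # single-script
print(rough_scripts("\u0430\u042c\u0441"))  # whole-script
```

Only the mixed-script labels produce more than one script; the single-script and whole-script cases pass a mixing check and need confusable tables to catch.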

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  – í looks like i

www.mozilla.org is not www.mozílla.org

Latin U+0069
Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(This case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom

Katakana Dot U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

(However, punctuation not required…)

Browser sees and displays a valid IDN

DNS sees Punycode
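What the DNS sees for the Katakana No example can be reproduced with Python's `idna` codec. This is a sketch, assuming the slide's URL used the fullwidth form of the letter as it appears in the transcript (U+30CE).

```python
# Host name as displayed: the Katakana letter stands in for "/" visually.
displayed = "www.google.com\u30cepath\u30cefile.nottrusted.org"

# IDNA conversion splits on real dots only, then Punycode-encodes each label.
wire = displayed.encode("idna")
labels = wire.split(b".")

for label in labels:
    print(label.decode("ascii"))
```

The lookalike does not separate anything: "com", "path" and "file" end up inside one Punycode label, and the registered domain the resolver actually queries is nottrusted.org.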

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  – ō looks like U+006F U+0304
  – ō̄ is U+006F U+0304 U+0304
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
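The two combining-mark orderings can be checked with Python's `unicodedata` module. This is a small sketch; the variable names are illustrative.

```python
import unicodedata

# Same base letter and the same two marks, in different orders.
diaeresis_first = "o\u0308\u0311"
breve_first = "o\u0311\u0308"

nfkc_a = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_b = unicodedata.normalize("NFKC", breve_first)

print([f"U+{ord(c):04X}" for c in nfkc_a])
print([f"U+{ord(c):04X}" for c in nfkc_b])
print(nfkc_a == nfkc_b)
```

Both marks share the same combining class, so normalization does not reorder them: the first mark composes with the base and the second is left over, giving two strings that compare unequal but can render almost identically.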

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
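Explicit directional controls are easy to detect because Unicode assigns them their own bidirectional classes. The Python sketch below reports them; `bidi_controls` and the sample string are hypothetical and only illustrate the check.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls (and their terminator).
EXPLICIT_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}

def bidi_controls(text: str) -> list:
    """Return (index, character name) for each explicit bidi control in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_CLASSES
    ]

# Hypothetical input: everything after the override is displayed right to left.
sample = "invoice\u202egpj.exe"
print(bidi_controls(sample))
```

Rejecting or escaping any string for which this list is non-empty is one conservative way to follow the "avoid wherever possible" advice quoted on the slide.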

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
  – U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
  – U+2032 PRIME

Best-fit mappings: Windows best-fit, P/Invoke

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
  • U+2329 maps to U+003C (Left-Pointing Angle Bracket)
  • U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
  • U+015B maps to U+0073 (Latin Small Letter S With Acute)
  • U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map
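The bypass in the case study can be modelled with a toy best-fit table. In the Python sketch below, `BEST_FIT` is a hand-written stand-in built only from the mappings listed on the slides, not a real platform table, and `passes_filter` is a hypothetical filter.

```python
# Illustrative best-fit table assembled from the mappings named in the slides.
BEST_FIT = str.maketrans({
    "\uff4d": "m", "\u2329": "<", "\u3008": "<",
    "\u015b": "s", "\u015d": "s", "\u03c3": "s", "\u2032": "'",
})

def passes_filter(css: str) -> bool:
    """Hypothetical filter that rejects the '-moz-binding' property."""
    return "-moz-binding" not in css.lower()

def to_legacy_best_fit(text: str) -> str:
    """Simulate a lossy conversion that silently picks lookalike characters."""
    return text.translate(BEST_FIT)

def encode_strict(text: str):
    """Fail-closed alternative: refuse characters the target charset lacks."""
    try:
        return text.encode("cp1252")
    except UnicodeEncodeError:
        return None

payload = "-\uff4doz-binding()"
print(passes_filter(payload), to_legacy_best_fit(payload), encode_strict(payload))
```

The payload passes validation, the lossy conversion then yields `-moz-binding()`, and the strict encoder refuses the input instead, which is the behaviour the guidance slide asks for.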

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

NFD: Decompose (canonical)
NFC: Decompose (canonical), Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), Recompose

İ becomes I + ̇
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against Visual spoofing
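The ordering problem can be shown with the U+FE64 example from the slides. In this Python sketch `is_safe` is a hypothetical validator, and the two wrapper functions differ only in the order of validation and normalization.

```python
import unicodedata

def is_safe(text: str) -> bool:
    """Hypothetical validator that rejects any '<'."""
    return "<" not in text

def validate_then_normalize(text: str):
    """Dangerous order: the check runs before compatibility mapping."""
    if not is_safe(text):
        return None
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text: str):
    """Recommended order: the check sees the final form of the string."""
    normalized = unicodedata.normalize("NFKC", text)
    return normalized if is_safe(normalized) else None

payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))
print(normalize_then_validate(payload))
```

The first function hands back `<script>` even though the validator approved the input; the second rejects it.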

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets 0xC0 0xA7
OS/Framework sees 0x27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
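How 0xC0 0xA7 turns into an apostrophe can be seen by decoding the two bytes without the shortest-form check. In this Python sketch `lenient_two_byte_decode` is a deliberately naive, hypothetical decoder, contrasted with Python's strict UTF-8 decoder.

```python
def lenient_two_byte_decode(b0: int, b1: int) -> str:
    """Naive decoder: combines the payload bits and never checks for overlong forms."""
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))

def strict_decode(raw: bytes):
    """Validate UTF-8 and return None on any ill-formed sequence."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None

overlong = bytes([0xC0, 0xA7])
print(repr(lenient_two_byte_decode(0xC0, 0xA7)))
print(strict_decode(overlong))
```

A filter that scans the raw bytes for 0x27 finds nothing, while a lenient component downstream produces the quote; rejecting the sequence at the boundary removes the disagreement.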


Page 26: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

httpπαράδειγμαδοκιμή

(exampletest)

wwwcasabasecuritycom

Root CausesIDN ndash Internationalized Domain Names

March 2009 copy 2009 Chris Weber

bull IDNA 2003

bull Nameprep (NFKC and prohibit)

bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp

bull Whitelist TLDrsquosndash ORG DE CN to name a few

bull Language settings and TLD

bull Character blacklisting

wwwcasabasecuritycom

Root CausesIDN ndash what do the browsers do

March 2009 copy 2009 Chris Weber

bull Divergent browser implementations

bull Confusables exist

bull IDNA and Nameprep based on Unicode 32

ndash Wersquore up to Unicode 51 (larger repertoire)

wwwcasabasecuritycom

Root CausesIDN ndash so whatrsquos the problem

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there's plenty to choose from.
Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can't control
  – Syntax spoofing in subdomain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com
  Latin small letter m, U+006D
  Latin small letters r and n, U+0072 U+006E

Are you using mono-width fonts?
  0 and O
  1 and l
  5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com
  User thinks ‘a’
  Really it's Latin small letter Alpha ‘ɑ’

www.looĸout.net
  User thinks ‘k’
  Really it's Latin letter kra ‘ĸ’

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
  User thinks ‘o’
  Really it's Thai digit zero ‘๐’

www.faϲebook.com
  User thinks ‘c’
  Really it's Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
  User thinks ‘G’
  Really it's Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Mixed-script and The Confusables

www.аЬс.com
  User thinks ‘abc’
  Really it's Cyrillic script

www.ігѕ.gov
  User thinks ‘irs’
  Really it's Cyrillic script as well (і U+0456, г U+0433, ѕ U+0455)

Attack Vectors: Whole-script and The Confusables
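The single, mixed and whole-script cases above can be told apart roughly in code. The sketch below is only a heuristic: Python's standard library does not expose the Unicode Script property, so it assumes the first word of each character name ("LATIN", "CYRILLIC", "GREEK", "THAI") is a good enough stand-in. The function names are hypothetical and are not part of any tool mentioned in this deck.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Approximate the scripts used in a label from Unicode character names.

    Assumption: the first word of a character's name stands in for its
    Script property, which the standard library does not provide.
    """
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ignore ASCII digits, hyphens and dots
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True when the label draws characters from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    for label in ("google", "fa\u03f2ebook", "\u0430\u042c\u0441"):
        print(ascii(label), sorted(scripts_in_label(label)))
```

Note that the whole-script Cyrillic label is not flagged as mixed. Catching it would need confusables data, which is outside this sketch.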

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org
  Latin small letter i, U+0069
  Latin small letter i with acute, U+00ED

(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org
  FULLWIDTH SOLIDUS, U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org
  SOLIDUS, U+002F

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom
  Katakana Dot, U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org
  Katakana No, U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN
DNS sees Punycode
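To make the "browser sees an IDN, DNS sees Punycode" point concrete, here is a small sketch using Python's built-in "idna" codec, which implements IDNA 2003 (Nameprep plus Punycode). The helper name to_dns_form is hypothetical, and the spoofed host is the U+0261 example from the earlier slide.

```python
def to_dns_form(hostname: str) -> str:
    """Return the ASCII (Punycode) form that is sent to DNS.

    Uses Python's built-in "idna" codec, which follows IDNA 2003.
    """
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    genuine = "www.google.com"
    spoofed = "www.goo\u0261le.com"  # U+0261 LATIN SMALL LETTER SCRIPT G
    print(to_dns_form(genuine))
    print(to_dns_form(spoofed))  # an xn-- label, visibly different on the wire
```

The two names look alike on screen, but their DNS forms differ, which is one reason browsers fall back to showing the Punycode form for suspicious labels.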

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
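A minimal sketch of invisible-character detection, assuming the three code points listed above plus anything in the Unicode "Cf" (format) general category. The function name find_invisibles is hypothetical, and this is not the detection API described elsewhere in the deck.

```python
import unicodedata

# The three code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every invisible character found.

    Assumption: "invisible" means the slide's three code points or any
    character whose general category is Cf (format).
    """
    hits = []
    for index, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


if __name__ == "__main__":
    print(find_invisibles("goo\u200bgle"))
```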

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing – Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing – Problematic Font-rendering

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  ō̄ looks like U+006F U+0304 (ō)
  ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC
    <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
    <U+006F U+0311 U+0308> becomes <U+020F U+0308>
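Both combining-mark tricks can be checked with the standard unicodedata module. The sketch below flags a combining mark repeated back to back and reproduces the two NFKC results shown on the slide. The function name has_repeated_combining_mark is an assumption made for illustration.

```python
import unicodedata


def has_repeated_combining_mark(text: str) -> bool:
    """True if the same combining mark appears twice in a row."""
    previous = None
    for ch in text:
        if unicodedata.combining(ch) and ch == previous:
            return True
        previous = ch
    return False


def nfkc_code_points(text: str) -> list[str]:
    """NFKC-normalize and list the resulting code points as 'U+XXXX'."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFKC", text)]


if __name__ == "__main__":
    print(has_repeated_combining_mark("o\u0304\u0304"))
    print(nfkc_code_points("o\u0308\u0311"))
    print(nfkc_code_points("o\u0311\u0308"))
```

Because both marks share the same combining class, normalization keeps their order, so the two visually similar inputs stay distinct after NFKC.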

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
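A short sketch of how input could be screened for these controls, using the bidirectional class reported by Python's unicodedata module. Treating the embedding controls (LRE, RLE) and PDF as suspicious alongside the two overrides is an assumption of this example, not something the slide states.

```python
import unicodedata

# Bidi classes of the explicit embedding/override controls and their terminator.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def contains_explicit_bidi_control(text: str) -> bool:
    """True if any character is an explicit directional embedding or override."""
    return any(
        unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES for ch in text
    )


if __name__ == "__main__":
    print(contains_explicit_bidi_control("invoice\u202egpj.exe"))
```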

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
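The guidance above targets Windows and .NET APIs. As a language-neutral illustration of the same idea (fail instead of best-fitting), here is a sketch in Python, whose codecs do not best-fit and raise an error in strict mode. The helper names and the choice of cp1252 as the legacy charset are assumptions for the example.

```python
def encode_or_fail(text: str, codec: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising rather than substituting lookalikes."""
    return text.encode(codec, errors="strict")


def survives_encoding(text: str, codec: str = "cp1252") -> bool:
    """True if every character has an exact mapping in the target charset."""
    try:
        encode_or_fail(text, codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(survives_encoding("script"))
    print(survives_encoding("\u03c3cript"))  # GREEK SMALL LETTER SIGMA
    print(survives_encoding("\u2032"))       # PRIME
```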

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking – Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Case Study: Social Networking – Best-fit mappings

Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇
  U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
  U+FE64 becomes U+003C

Root Causes: Normalization

﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
  U+FE64 becomes U+003C

Root Causes: Normalization

Normalize strings before validation
NFKC: first defense against Visual spoofing

Root Causes: Guidance for Normalization
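The ordering matters, so a sketch may help: normalize first, then run the filter on the normalized form, and keep using that form afterwards. The forbidden-character set and the function name normalize_then_validate are illustrative assumptions, not a complete XSS filter.

```python
import unicodedata

# Illustrative deny-list only; a real filter would be context-specific.
FORBIDDEN = set("<>\"'")


def normalize_then_validate(text: str) -> str:
    """NFKC-normalize, then reject markup characters in the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    if FORBIDDEN & set(normalized):
        raise ValueError("input contains forbidden characters after NFKC")
    return normalized


if __name__ == "__main__":
    payload = "\ufe64script\ufe65"  # SMALL LESS-THAN / SMALL GREATER-THAN SIGN
    print("<" in payload)           # a filter run before normalization sees nothing
    try:
        normalize_then_validate(payload)
    except ValueError as error:
        print("rejected:", error)
```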

Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
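"Validate and throw on error" can be as small as a strict decode. The sketch below relies on Python's UTF-8 decoder, which rejects overlong sequences such as C0 A7. The wrapper name is_well_formed_utf8 is an assumption for illustration.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """True only if the bytes are valid, shortest-form UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(is_well_formed_utf8(b"\x27"))      # plain apostrophe
    print(is_well_formed_utf8(b"\xc0\xa7"))  # overlong form of U+0027
```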

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 U+002E (C0 AE; an overlong U+002F would be C0 AF): httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
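Two of the test cases above depend on compatibility normalization turning lookalikes into real path syntax. A hedged sketch follows. The path strings are made-up examples and has_traversal is a hypothetical helper, but the NFKC mappings of U+FF0F and U+2025 are standard.

```python
import unicodedata


def has_traversal(path: str) -> bool:
    """Check for '..' segments after NFKC normalization, not before."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


if __name__ == "__main__":
    sneaky = "root/\u2025/system"         # TWO DOT LEADER
    print(".." in sneaky.split("/"))      # naive check on the raw string
    print(has_traversal(sneaky))          # check on the normalized string
    print(has_traversal("root\uff0f..\uff0fsystem"))  # FULLWIDTH SOLIDUS
```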

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected – Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected – Over-consumption

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected – Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected – Character-deletion

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
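As a sketch of both options, Python's UTF-8 decoder can either fail outright or substitute U+FFFD, and in the second case it replaces only the ill-formed byte instead of swallowing the byte that follows. The function names are hypothetical. The byte string is the <41 C2 3E 41> example from the over-consumption slide.

```python
def decode_or_fail(data: bytes) -> str:
    """Option 1: refuse ill-formed input entirely."""
    return data.decode("utf-8", errors="strict")


def decode_with_replacement(data: bytes) -> str:
    """Option 2: replace each ill-formed sequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    ill_formed = b"\x41\xc2\x3e\x41"  # 'A', stray lead byte, '>', 'A'
    print(ascii(decode_with_replacement(ill_formed)))
    try:
        decode_or_fail(ill_formed)
    except UnicodeDecodeError as error:
        print("rejected:", error.reason)
```

The '>' after the stray lead byte survives, so the markup structure seen by a later parser matches what the filter saw.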

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM, U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing
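A quick way to see that casing can change string length, sketched with Python's built-in full case mapping. Results are framework-dependent: Python lowercases U+0130 to "i" plus U+0307, whereas other libraries or locales may return a plain "i" as on the slide. The helper name is hypothetical.

```python
def lowered_length_changes(text: str) -> bool:
    """True when lowercasing changes the number of code points."""
    return len(text.lower()) != len(text)


if __name__ == "__main__":
    dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
    print(len(dotted_capital_i), len(dotted_capital_i.lower()))
    print(ascii(dotted_capital_i.lower()))
    print(lowered_length_changes("SCRIPT"))
```

Either way, the safe order is the one on the slide: apply the case operation first, then validate and size buffers from the result.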

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
           16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
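Several of these factors can be reproduced with the standard library. The sketch below measures growth in encoded bytes for a few of the sample characters. The helper name expansion_factor is an assumption, and the exact figures depend on the Unicode version shipped with the Python build.

```python
import unicodedata


def expansion_factor(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes (in bytes) after versus before an operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


if __name__ == "__main__":
    sample = "\ufdfa"  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    nfkc = unicodedata.normalize("NFKC", sample)
    print(expansion_factor(sample, nfkc, "utf-8"))
    print(expansion_factor(sample, nfkc, "utf-16-le"))
    print(expansion_factor("\u023a", "\u023a".lower(), "utf-8"))
    print(expansion_factor("\u0390", "\u0390".upper(), "utf-8"))
```

Sizing an output buffer from the input length alone is the mistake these numbers warn against.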

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS, U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher – Web-app Security Testing and Auditing

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security

Page 27: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN – what do the browsers do


bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Watcher

ndash Web-app security testing and auditing

bull Visual Spoofing Detection API

ndash Providing guarantees against Visual Spoofing and Homograph attacks

wwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

March 2009 copy 2009 Chris Weber

bull Problemndash Unicode enables visual-spoofing-maximus

bull Solutionndash Confusable detection

ndash Invisibles detection

ndash Syntax spoof detection

ndash more

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

March 2009 copy 2009 Chris Weber

bull Cross-platform component library written in C

bull Can be applied in user-agents or any softwarendash Browsers

ndash Email clients

bull Planned for release Fall 2009

bull Email me with questions

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

ToolsVisual Spoofing Protection Demo

Thank you

Contact me with questions new test cases or ideas to share

Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API

Chris WeberwwwlookoutnetCasaba Security

wwwcasabasecuritycom

Page 28: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Root Causes: IDN – so what’s the problem?

• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com
Latin U+0067 (g) vs. Latin U+0261 (ɡ)

Root Causes: Guidance for Visual Spoofing

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA

Registrars sell the domain names

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Deny-all default seems to be the right concept

A script can cross many blocks. Even with limited script choices there’s plenty to choose from

Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can’t control
  – Syntax spoofing in subdomain labels

Attack Vectors: Visual spoofing vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
Latin U+006D (m) vs. Latin U+0072 U+006E (rn)

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’

www.looĸout.net
User thinks ‘k’
Really it’s Latin small letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’

www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script

www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
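
The mixed-script and whole-script cases can be approximated in a few lines. This is a minimal Python sketch, not the detection API described in the talk: it assumes the first word of a character's Unicode name is a usable stand-in for its script (the standard library does not expose the Script property), and the helper names `scripts_in` and `looks_suspicious` are hypothetical.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the scripts used in a label from Unicode character names."""
    found = set()
    for ch in label:
        if (ch.isascii() and ch.isdigit()) or ch in "-.":
            continue  # ASCII digits, hyphen and dot are shared by every script
        name = unicodedata.name(ch, "UNKNOWN")
        found.add(name.split(" ")[0])  # e.g. "LATIN", "GREEK", "CYRILLIC"
    return found

def looks_suspicious(label: str) -> bool:
    """Flag any label that uses something other than Latin (an assumed policy)."""
    return bool(scripts_in(label) - {"LATIN"})

for label in ("facebook", "fa\u03f2ebook", "g\u0e50\u0e50gle", "\u0430\u042c\u0441"):
    print(ascii(label), sorted(scripts_in(label)), looks_suspicious(label))
```

Single-script confusables such as U+0251 in place of "a" are all Latin, so this heuristic does not catch them; those need a confusables table.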

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org
Latin U+0069 (i) vs. Latin U+00ED (í)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

(However, punctuation not required…)
httpwwwgooglecom
Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN
DNS sees Punycode
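
The gap between what the browser displays and what DNS receives can be made visible with Python's built-in `punycode` codec. This is a sketch under simplifying assumptions: it skips the Nameprep step that real IDNA processing applies first, and the helper name `to_wire_label` is hypothetical.

```python
def to_wire_label(label: str) -> str:
    """Encode one Unicode label roughly as it travels in DNS (xn-- form)."""
    if label.isascii():
        return label  # plain ASCII labels are sent unchanged
    return "xn--" + label.encode("punycode").decode("ascii")

spoofed = "goo\u0261le"  # U+0261 LATIN SMALL LETTER SCRIPT G instead of "g"
wire = to_wire_label(spoofed)

print("displayed:", spoofed)
print("on the wire:", wire)  # an xn-- label that no longer resembles "google"
```

Showing the `xn--` form to the user, as some browsers do for non-whitelisted labels, removes the visual ambiguity at the cost of readability.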

IDN Visual Spoofing

DEMO

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
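
The ordering effect can be checked with Python's `unicodedata` module. The snippet below is a small illustration rather than part of the original slides: it normalizes both sequences to NFKC and prints the resulting code points. The helper name `codepoints` is hypothetical.

```python
import unicodedata

def codepoints(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

diaeresis_first = "o\u0308\u0311"  # o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
breve_first = "o\u0311\u0308"      # same marks, opposite order

nfkc_a = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_b = unicodedata.normalize("NFKC", breve_first)

print(codepoints(nfkc_a))  # U+00F6 U+0311
print(codepoints(nfkc_b))  # U+020F U+0308
print(nfkc_a == nfkc_b)    # False: both marks have the same combining class
```

Because the two marks share a combining class, normalization keeps their relative order, so the two strings stay distinct even if a font renders them alike.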

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible because of security concerns”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
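
One way to act on this is to look for explicit directional controls before a string is displayed. The following Python sketch relies on `unicodedata.bidirectional`, which reports the Bidi class of a character. The set of classes and the helper name `find_bidi_controls` are choices made for this illustration, not something the slides prescribe.

```python
import unicodedata

# Bidi classes of the explicit embedding, override and isolate controls.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def find_bidi_controls(text: str) -> list:
    """Return (index, code point) pairs for explicit directional controls."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES
    ]

# A file name containing RIGHT-TO-LEFT OVERRIDE before its real extension.
print(find_bidi_controls("abc\u202etxt.exe"))
print(find_bidi_controls("plain.txt"))
```

A caller could reject the input outright when the list is non-empty, which matches the "forbidden in IDNA" stance for identifiers.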

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
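
The same "fail rather than substitute" idea can be shown in Python, whose codecs raise on unmappable characters instead of best-fitting them. This is an analogy to the .NET and Win32 guidance, not an equivalent API. The helper name `to_legacy_bytes` and the choice of the `cp1252` code page are assumptions made for the example.

```python
def to_legacy_bytes(text: str, codec: str = "cp1252") -> bytes:
    """Convert to a legacy code page, failing instead of substituting lookalikes."""
    return text.encode(codec, errors="strict")

samples = ("-\uff4doz-binding()", "\u03c3", "\u2032")  # fullwidth m, sigma, prime
for sample in samples:
    try:
        to_legacy_bytes(sample)
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first character that has no mapping
        print(f"rejected U+{ord(sample[exc.start]):04X}")
```

Rejecting the input keeps a later filter from ever seeing an "m", "s" or apostrophe that the sender never typed.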

Case Study (Social Networking): Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings

Case Study (Social Networking): Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + combining dot above
U+0130 → U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 → U+003C

Root Causes: Normalization

﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
U+FE64 → U+003C

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC is a first defense against visual spoofing
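
The difference between the two orderings is easy to demonstrate. The sketch below uses Python's `unicodedata.normalize`; the function names and the deliberately simplistic "no `<` allowed" rule are assumptions for illustration only.

```python
import unicodedata

def validate_then_normalize(value: str) -> str:
    """The dangerous order: the check runs on text that will still change."""
    if "<" in value:
        raise ValueError("markup not allowed")
    return unicodedata.normalize("NFKC", value)

def normalize_then_validate(value: str) -> str:
    """The recommended order: the check sees the final form of the text."""
    value = unicodedata.normalize("NFKC", value)
    if "<" in value:
        raise ValueError("markup not allowed")
    return value

# U+FE64 SMALL LESS-THAN SIGN ... U+FE65 SMALL GREATER-THAN SIGN
payload = "\ufe64script\ufe65"

print(validate_then_normalize(payload))  # <script>
try:
    normalize_then_validate(payload)
except ValueError as exc:
    print("rejected:", exc)

# Canonical decomposition of U+0130, as on the earlier slide.
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")  # True
```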

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
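
"Validate and throw on error" can be as simple as using a strict decoder at the trust boundary. In this Python sketch the helper name `decode_utf8_strict` is hypothetical; the behavior shown is that of Python 3's built-in UTF-8 codec, which rejects non-shortest forms.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

overlong_apostrophe = b"\xc0\xa7"  # non-shortest form of U+0027

try:
    decode_utf8_strict(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)

print(decode_utf8_strict(b"'"))  # the shortest form decodes normally
```

Decoding once, strictly, and passing only the decoded string onward avoids the situation on the slide where each layer interprets the same bytes differently.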

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
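
The byte sequences in these test cases can be derived rather than memorized. The Python sketch below prints the percent-encoded UTF-8 form and the NFKC result for three of the characters listed. It does not build overlong sequences or reconstruct the URLs from the slide, and the names `CANDIDATES` and `describe` are hypothetical.

```python
import unicodedata
from urllib.parse import quote

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

def describe(ch: str) -> tuple:
    """Return (percent-encoded UTF-8, NFKC-normalized text) for one character."""
    return quote(ch, safe=""), unicodedata.normalize("NFKC", ch)

for name, ch in CANDIDATES.items():
    encoded, folded = describe(ch)
    print(f"{name}: {encoded} -> NFKC {folded!r}")
```

DIVISION SLASH has no compatibility decomposition, so normalization leaves it alone; it only turns into "/" where a best-fit mapping is applied, which is why the slide lists it under a different root cause.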

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
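
Both recommended behaviors, failing outright and substituting U+FFFD without swallowing the following byte, are available in Python's UTF-8 codec. The sketch reuses the byte sequence from the over-consumption slide; the helper name `decode_or_fail` is hypothetical.

```python
raw = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', stray lead byte, '>', 'A'

# Option 1: replace only the ill-formed byte with U+FFFD. The '>' that follows
# is kept, so the decoder does not over-consume.
replaced = raw.decode("utf-8", errors="replace")
print(ascii(replaced))  # 'A\ufffd>A'

def decode_or_fail(data: bytes) -> str:
    """Option 2: strict decoding, which raises on ill-formed input."""
    return data.decode("utf-8")

try:
    decode_or_fail(raw)
except UnicodeDecodeError as exc:
    print("failed at byte offset", exc.start)
```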

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
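
The mechanics of this bypass can be modeled in a few lines of Python. Everything here is a simplified stand-in: `blacklist_allows` represents a server-side filter, `browser_interpretation` models a user-agent that silently deletes U+FEFF, and `stricter_filter` is one possible fix. None of these names come from the slides.

```python
BOM = "\ufeff"

def blacklist_allows(value: str) -> bool:
    """A naive filter that looks for a scripting URL scheme."""
    return "javascript:" not in value.lower()

def browser_interpretation(value: str) -> str:
    """Model of a user-agent that deletes U+FEFF wherever it occurs."""
    return value.replace(BOM, "")

def stricter_filter(value: str) -> bool:
    """Treat an interior U+FEFF as an error instead of ignoring it."""
    return BOM not in value and blacklist_allows(value)

payload = "java\ufeffscript:alert('XSS')"

print(blacklist_allows(payload))         # True: the filter sees no "javascript:"
print(browser_interpretation(payload))   # javascript:alert('XSS')
print(stricter_filter(payload))          # False
```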

Safari BOM injection for XSS

DEMO

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
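
The length change is easy to observe in Python, which applies full Unicode case mappings. Note the difference from the slide: frameworks that use the simple mapping turn U+0130 into a plain "i", whereas Python keeps the combining dot, so the result is two code points. The helper name `casing_growth` is hypothetical.

```python
def casing_growth(text: str) -> dict:
    """Code point counts before and after each casing operation."""
    return {
        "original": len(text),
        "lower": len(text.lower()),
        "upper": len(text.upper()),
    }

samples = {
    "\u0130": "LATIN CAPITAL LETTER I WITH DOT ABOVE",
    "\u0390": "GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS",
    "\u00df": "LATIN SMALL LETTER SHARP S",
}
for ch, name in samples.items():
    print(f"U+{ord(ch):04X} {name}: {casing_growth(ch)}")
```

Any buffer sized from the length of the input therefore has to be re-measured after casing, and any validation has to run on the cased result.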

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
NFC         16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
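
The factors in the normalization table can be reproduced with Python's `unicodedata` module. This sketch measures growth in code points and in UTF-8 bytes for a single character; the helper name `expansion` is hypothetical.

```python
import unicodedata

def expansion(ch: str, form: str) -> tuple:
    """Return (code point factor, UTF-8 byte factor) for normalizing `ch`."""
    out = unicodedata.normalize(form, ch)
    by_code_points = len(out) / len(ch)
    by_utf8_bytes = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    return by_code_points, by_utf8_bytes

for ch, form in (("\ufdfa", "NFKC"), ("\u1f82", "NFD"), ("\ufb2c", "NFC")):
    print(f"U+{ord(ch):04X} {form}: {expansion(ch, form)}")
```

For U+FDFA under NFKC this gives 18 code points from one and 33 UTF-8 bytes from three, matching the 18X and 11X rows. A destination buffer sized for the input alone would be far too small.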

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Opera White Space Characters

DEMO

Case Study: Opera

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
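
A server-side check for the header/meta disagreement shown earlier might look like the following Python sketch. It is illustrative only: the regular expression is a rough approximation of how a `<meta>` charset is declared, not an HTML parser, and the function names are hypothetical. Charset names are compared through `codecs.lookup` so that aliases resolve to the same codec.

```python
import codecs
import re
from email.message import Message

META_RE = re.compile(rb"<meta[^>]+charset=[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE)

def header_charset(content_type: str):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

def meta_charset(body: bytes):
    """Extract the first charset declared in a <meta> tag, if any."""
    match = META_RE.search(body)
    return match.group(1).decode("ascii").lower() if match else None

def charsets_agree(content_type: str, body: bytes) -> bool:
    """True only when both declarations exist and name the same codec."""
    declared, embedded = header_charset(content_type), meta_charset(body)
    if declared is None or embedded is None:
        return False  # "error if uncertain"
    try:
        return codecs.lookup(declared).name == codecs.lookup(embedded).name
    except LookupError:
        return False  # unknown or ill-defined charset identifier

page = b'<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
print(charsets_agree("text/html; charset=ISO-8859-1", page))  # False
```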

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 29: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

  Latin m: U+006D
  Latin r, n: U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

  0 and O
  1 and l
  5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

  httploginfacebookintvitationvideomessageid-
  h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com
  User thinks ‘a’
  Really it’s Latin small letter alpha ‘ɑ’

www.looĸout.net
  User thinks ‘k’
  Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
  User thinks ‘o’
  Really it’s Thai digit zero ‘๐’

www.faϲebook.com
  User thinks ‘c’
  Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
  Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Mixed-script and The Confusables

www.аЬс.com
  User thinks ‘abc’
  Really it’s Cyrillic script

www.ігѕ.gov
  User thinks ‘irs’
  Really it’s Cyrillic script

Attack Vectors: Whole-script and The Confusables
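A rough Python sketch of how the single-, mixed-, and whole-script cases above might be told apart. The helper names are hypothetical, and reading the script off the first word of the Unicode character name is only a heuristic assumed here for illustration; a real detector would rely on the Unicode Script property and confusables data.

```python
import unicodedata

def scripts_in_label(label):
    """Heuristic: take the first word of each character name as its script."""
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ignore ASCII digits, hyphens, and dots
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split(" ")[0])
    return scripts

def classify_label(label):
    scripts = scripts_in_label(label)
    if scripts <= {"LATIN"}:
        return "latin-only"    # single-script confusables can still hide here
    if "LATIN" in scripts:
        return "mixed-script"
    return "non-latin"         # candidate for a whole-script spoof

print(classify_label("fa\u03f2ebook"))       # Greek lunate sigma among Latin letters
print(classify_label("\u0430\u042c\u0441"))  # Cyrillic lookalike of "abc"
print(classify_label("\u0251pple"))          # Latin alpha: stays within Latin
```

Note that the single-script case passes a script check untouched, which is why confusable detection is listed separately from script limitations.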

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  - í looks like i

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

  Latin i: U+0069
  Latin í: U+00ED

http://www.google.com／path／file.nottrusted.org

  FULLWIDTH SOLIDUS: U+FF0F
  (This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

  SOLIDUS: U+002F
  (Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

  ╱ U+2571 BOX DRAWINGS
  〳 U+3033 KANA REPEAT MARK
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

httpwwwgooglecom

  Katakana Dot: U+FF65
  (However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www.google.comノpathノfile.nottrusted.org

  Katakana No: U+FF89
  (However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
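To make the browser/DNS split concrete, a minimal Python sketch using the standard library's "idna" codec (which implements the older IDNA 2003 rules; that choice is an assumption of this example, not something the slides specify). The two names look alike on screen but resolve as different ASCII strings.

```python
real = "www.google.com"
fake = "www.goo\u0261le.com"  # U+0261 LATIN SMALL LETTER SCRIPT G

# The "idna" codec converts each non-ASCII label to its xn-- (Punycode) form.
real_ascii = real.encode("idna")
fake_ascii = fake.encode("idna")

print(real_ascii)                 # what DNS sees for the real name
print(fake_ascii)                 # what DNS sees for the lookalike
print(real_ascii == fake_ascii)   # the two never collide at the DNS level
```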

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

  U+200B (ZERO WIDTH SPACE)
  U+180E (MONGOLIAN VOWEL SEPARATOR)
  U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: The Invisibles
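A small Python sketch of invisibles detection, limited to the three code points listed above. The function name and the report format are hypothetical; a fuller list of invisible characters would be needed in practice.

```python
# Only the three code points named on the slide; not an exhaustive list.
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}

def find_invisibles(text):
    """Return (index, code point, name) for each listed invisible character."""
    return [
        (i, f"U+{ord(ch):04X}", INVISIBLES[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]

print(find_invisibles("pay\u200bpal"))
print(find_invisibles("paypal"))
```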

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

  ō looks like ō̄
  ō is <U+006F U+0304>
  ō̄ is <U+006F U+0304 U+0304>

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  - ȏ and ö under NFKC

  <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  <U+006F U+0311 U+0308> → <U+020F U+0308>

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - “These characters are to be avoided wherever possible, because of security concerns”
  - forbidden in IDNA

  U+202D (LEFT-TO-RIGHT OVERRIDE)
  U+202E (RIGHT-TO-LEFT OVERRIDE)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
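One way to flag the two override characters above, sketched in Python with the standard `unicodedata` module. The function name is hypothetical, and rejecting outright (rather than stripping) is an assumption of this example.

```python
import unicodedata

# Bidi classes of U+202D and U+202E as reported by unicodedata.bidirectional().
OVERRIDE_CLASSES = {"LRO", "RLO"}

def has_bidi_override(text):
    """True if the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)

print(has_bidi_override("invoice\u202egpj.exe"))  # RLO reverses the visible tail
print(has_bidi_override("invoice.jpg"))
```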

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
  U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
  [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
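The same "fail rather than substitute" idea, sketched in Python as an analogue of the .NET and Win32 guidance above (it does not call those APIs). With `errors="strict"`, the cp1252 codec raises on characters it cannot represent instead of picking a lookalike. The helper name is hypothetical.

```python
def is_lossless(text, codec="cp1252"):
    """True if every character has an exact mapping in the target codec."""
    try:
        text.encode(codec, errors="strict")  # raise instead of substituting
        return True
    except UnicodeEncodeError:
        return False

print(is_lossless("-moz-binding()"))
print(is_lossless("-\uff4doz-binding()"))  # FULLWIDTH LATIN SMALL LETTER M
print(is_lossless("\u03c3"))               # GREEK SMALL LETTER SIGMA
```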

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map

Case Study: Social Networking, Best-fit mappings

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

  NFD  - Decompose (canonical)
  NFC  - Decompose (canonical), Recompose
  NFKD - Decompose (compatibility)
  NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇
  U+0130 → U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
  U+FE64 → U+003C

Root Causes: Normalization

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization

Normalize strings before validation

NFKC: first defense against Visual spoofing

Root Causes: Guidance for Normalization
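A minimal Python sketch of "normalize before validation", using the standard `unicodedata` module. The `contains_markup` and `validate` helpers are hypothetical stand-ins for whatever filter an application really uses.

```python
import unicodedata

def contains_markup(text):
    return "<" in text or ">" in text

def validate(text):
    """Normalize first, then run the check on the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_markup(normalized):
        raise ValueError("markup characters present after NFKC")
    return normalized

payload = "\ufe64script\ufe65"   # SMALL LESS-THAN / SMALL GREATER-THAN SIGN
print(contains_markup(payload))  # the raw string slips past the check
print(unicodedata.normalize("NFKC", payload))
```

Checking the raw string and normalizing afterwards would let the payload through, which is the ordering the slides warn about.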

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

  Application gets C0 A7
  OS/Framework sees 27
  Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
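"Validate and throw on error" might look like this in Python, whose built-in UTF-8 decoder rejects overlong sequences such as C0 A7 in strict mode. The helper name is hypothetical.

```python
def is_well_formed_utf8(data):
    """True only if the bytes decode as strict, shortest-form UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

print(is_well_formed_utf8(b"\xc0\xa7"))  # overlong encoding of U+0027
print(is_well_formed_utf8(b"\x27"))      # the shortest form of the same character
```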

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF8 U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
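Two of the test cases above depend on compatibility normalization. A short Python sketch (hypothetical helper name) shows the full-width solidus and two dot leader folding into "/" and ".." under NFKC, while the division slash is left alone by normalization and only becomes "/" through a best-fit conversion.

```python
import unicodedata

def has_traversal(segment):
    """Check a path segment for '..' or '/' after NFKC normalization."""
    folded = unicodedata.normalize("NFKC", segment)
    return ".." in folded or "/" in folded

print(has_traversal("\u2025\uff0fsystem"))             # two dot leader + full-width solidus
print(unicodedata.normalize("NFKC", "\u2215") == "/")  # division slash is not folded
```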

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected, Over-consumption

Correcting insecurely rather than failing
  - Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected, Character-deletion

• Fail or error
• Use U+FFFD instead
  - A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
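Both options can be seen with Python's UTF-8 decoder on the <41 C2 3E 41> sequence from the over-consumption slide. This is a sketch of the behavior of one decoder, not a claim about the products in the case studies; `decode_or_fail` is a hypothetical name.

```python
ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])

# errors="replace" substitutes U+FFFD for the bad lead byte and keeps
# the 0x3E that follows, so nothing is over-consumed or silently deleted.
replaced = ill_formed.decode("utf-8", errors="replace")
print([f"U+{ord(ch):04X}" for ch in replaced])

def decode_or_fail(data):
    """The 'fail or error' option: raise on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")
```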

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Casing
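A small Python illustration of why casing belongs before validation and why lengths cannot be assumed stable. Python's `str.lower()` maps İ to "i" plus U+0307 rather than to a bare "i" as in the slide's toLower example, so results differ by platform. `check_after_casing` and the sample predicate are hypothetical.

```python
# Casing can change the number of code points in a string.
for s in ["\u0130", "\u0390", "A"]:
    print(f"U+{ord(s):04X}", len(s), len(s.lower()), len(s.upper()))

def check_after_casing(text, is_allowed):
    """Lowercase first, then validate the result that will actually be used."""
    lowered = text.lower()
    return lowered if is_allowed(lowered) else None

no_script = lambda s: "script" not in s
print(check_after_casing("SCRIPT", no_script))
print(check_after_casing("Hello", no_script))
```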

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

  Operation   UTF         Factor   Sample
  Lower       8           1.5X     Ⱥ U+023A
              16, 32      1X       A U+0041
  Upper       8, 16, 32   3X       ΐ U+0390

  Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

  Operation   UTF      Factor   Sample
  NFC         8        3X       𝅘𝅥𝅮 U+1D160
              16, 32   3X       ש U+FB2C
  NFD         8        3X       ΐ U+0390
              16, 32   4X       ᾂ U+1F82
  NFKC/NFKD   8        11X      ﷺ U+FDFA
              16, 32   18X      ﷺ U+FDFA

  Source: Unicode Technical Report 36
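Some of the factors in the tables can be checked with a few lines of Python. The `growth` helper is a hypothetical name; it compares encoded sizes in bytes before and after an operation.

```python
import unicodedata

def growth(before, after, encoding):
    """Ratio of encoded sizes, in bytes, after vs. before an operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))

sample = "\ufdfa"  # the NFKC/NFKD sample from the table
expanded = unicodedata.normalize("NFKC", sample)
print(growth(sample, expanded, "utf-8"))      # UTF-8 factor
print(growth(sample, expanded, "utf-16-le"))  # UTF-16 factor

upper_sample = "\u0390"  # the uppercasing sample from the table
print(growth(upper_sample, upper_sample.upper(), "utf-8"))
```

Sizing an output buffer from the input length alone, without a worst-case factor like these, is the mistake the slide is pointing at.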

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting “white space”
  - A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS: U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2

2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 30: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing
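A short Python sketch of the two casing points: case mapping can change the length of a string, and validation should look at the cased form. Note that Python's `str.lower()` applies the full Unicode mapping, so İ (U+0130) becomes i followed by U+0307 rather than the plain “i” shown on the slide, which reflects a different mapping. `case_then_validate` is a hypothetical helper, not an API from the talk.

```python
def case_then_validate(text: str, banned: str) -> bool:
    """Hypothetical validator: lower-case first, then look for the banned token."""
    return banned not in text.lower()

dotted_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len(dotted_i), len(dotted_i.lower()))       # code point count grows
print(len("\u0390"), len("\u0390".upper()))       # U+0390 upper-cases to three code points
print(case_then_validate("<SCRIPT>", "<script"))  # banned token is found after casing
```

Any buffer sized from the input length before casing can therefore be too small afterwards.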

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
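The normalization factors in the table can be checked with Python's `unicodedata` module. The helper name `expansion` is an assumption for this sketch. The `-le` codec variants are used so that no byte order mark is counted.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Encoded size after normalization divided by encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

print(expansion("\ufdfa", "NFKC", "utf-8"))      # U+FDFA in UTF-8
print(expansion("\ufdfa", "NFKC", "utf-32-le"))  # U+FDFA in UTF-32
print(expansion("\u1f82", "NFD", "utf-16-le"))   # U+1F82 in UTF-16
print(expansion("\U0001d160", "NFC", "utf-8"))   # U+1D160 in UTF-8
```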

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS U+180E
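One defensive reading of the Opera case is to flag any separator or format character that is outside the white space set a filter expects. The sketch below assumes the four HTML 4.01 white space characters (space, tab, form feed, zero-width space) plus CR and LF. The function name `suspicious_separators` is hypothetical.

```python
import unicodedata

HTML401_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}

def suspicious_separators(text: str) -> list:
    """List characters a parser might treat as white space or ignore,
    but that are outside the assumed HTML 4.01 set."""
    hits = []
    for ch in text:
        if ch in HTML401_WHITESPACE or ch in "\r\n":
            continue
        cat = unicodedata.category(ch)
        # Z* = separators, Cf = format characters. U+180E has been
        # classified as both, depending on the Unicode version.
        if cat.startswith("Z") or cat == "Cf":
            hits.append(f"U+{ord(ch):04X}")
    return hits

print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```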

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv=Content-Type content=text/html; charset=shift_jis>

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
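Why a mismatch matters can be shown with two bytes. The byte pair below is an example chosen for this sketch, not one from the talk: under ISO-8859-1 the second byte is a backslash, while under Shift_JIS both bytes form a single katakana character. `decode_forced_utf8` is a hypothetical helper illustrating "force UTF-8, error if uncertain".

```python
raw = b"\x83\x5c"   # attacker-controlled bytes (illustrative)

print(repr(raw.decode("iso-8859-1")))  # two characters, the second is a backslash
print(repr(raw.decode("shift_jis")))   # one katakana character, no backslash

def decode_forced_utf8(data: bytes) -> str:
    """Strict UTF-8 decoding: raises UnicodeDecodeError instead of guessing."""
    return data.decode("utf-8", errors="strict")
```

A filter that reads the input as one charset while the browser renders it as another is validating different text from what the user agent parses.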

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security


Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com
  – Latin g U+0067
  – Latin ɡ U+0261

• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for Visual Spoofing

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices there’s plenty to choose from.

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can’t control
  – Syntax spoofing in sub domain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com
  – Latin m U+006D
  – Latin r, n U+0072 U+006E

Are you using mono-width fonts?
  – 0 and O
  – 1 and l
  – 5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com
  – User thinks ‘a’
  – Really it’s Latin small letter Alpha ‘ɑ’

www.looĸout.net
  – User thinks ‘k’
  – Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
  – User thinks ‘o’
  – Really it’s Thai digit zero ‘๐’

www.faϲebook.com
  – User thinks ‘c’
  – Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
  – Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Mixed-script and The Confusables

www.аЬс.com
  – User thinks ‘abc’
  – Really it’s Cyrillic script

www.ігѕ.gov
  – User thinks ‘irs’
  – Really it’s Cyrillic script

Attack Vectors: Whole-script and The Confusables
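The single, mixed and whole-script cases can be told apart with a rough script check. The sketch below guesses a script from the first word of each character's Unicode name. That is a heuristic chosen for this illustration, and `scripts_in` is a hypothetical helper. Production code would use the real Unicode Script property, for example through ICU.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Rough script guess from the first word of each character name."""
    found = set()
    for ch in label:
        if not ch.isalnum():
            continue
        if ch.isascii():
            if ch.isalpha():
                found.add("LATIN")
            continue                      # ASCII digits carry no script
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

print(scripts_in("g\u0e50\u0e50gle"))     # mixed script: Latin plus Thai digits
print(scripts_in("fa\u03f2ebook"))        # mixed script: Latin plus Greek
print(scripts_in("\u0430\u042c\u0441"))   # whole script: Cyrillic only
print(scripts_in("\u0251pple"))           # single script: Latin only
```

A mixed-script test catches the first two labels only. The whole-script and single-script labels each report one script, which is why confusable tables are needed in addition.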

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  – í looks like i

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org
  – Latin i U+0069
  – Latin í U+00ED

(This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with and lookalikes

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana Dot U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
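The split between what the browser displays and what DNS receives can be seen with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. `to_dns` is a hypothetical wrapper name used only in this sketch.

```python
def to_dns(hostname: str) -> str:
    """Convert a Unicode hostname to the ASCII form that is sent to DNS."""
    return hostname.encode("idna").decode("ascii")

print(to_dns("www.google.com"))        # already ASCII, unchanged
print(to_dns("www.goo\u0261le.com"))   # the middle label becomes an xn-- label
```

The user sees two names that look alike, while the resolver sees two unrelated ASCII strings.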

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
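A simple way to audit text for the invisibles is to rewrite them in the [U+XXXX] notation these slides already use. This is a sketch. The helper name `reveal` and the choice of categories (format characters plus non-ASCII separators) are assumptions.

```python
import unicodedata

def reveal(text: str) -> str:
    """Rewrite format characters and non-ASCII separators as [U+XXXX]."""
    out = []
    for ch in text:
        cat = unicodedata.category(ch)
        if cat == "Cf" or (cat.startswith("Z") and ch != " "):
            out.append(f"[U+{ord(ch):04X}]")
        else:
            out.append(ch)
    return "".join(out)

print(reveal("<scr\ufeffipt>"))
print(reveal("a\u200bb\u180ec"))
```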

Attack Vectors: Visual Spoofing: Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing: Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  – ō̄ looks like U+006F U+0304
  – ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
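The two orderings can be run through `unicodedata.normalize` to confirm that NFKC does not make them equal. Both marks sit above the base letter, so normalization keeps their order and composes only the first one. The helper `code_points` is a name chosen for this sketch.

```python
import unicodedata

def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in text)

a = "o\u0308\u0311"   # o + diaeresis + inverted breve
b = "o\u0311\u0308"   # o + inverted breve + diaeresis

print(code_points(unicodedata.normalize("NFKC", a)))
print(code_points(unicodedata.normalize("NFKC", b)))
```

Comparing normalized strings is therefore not enough to detect that the two sequences may render alike.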

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi: Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns.”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
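The two override characters can be detected through their bidirectional class rather than by hard-coding code points. This is a minimal sketch. `has_bidi_override` is a hypothetical helper, and the file name in the example is made up for illustration.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}   # bidi classes of U+202D and U+202E

def has_bidi_override(text: str) -> bool:
    """True if the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)

print(has_bidi_override("invoice\u202etxt.exe"))  # contains U+202E
print(has_bidi_override("invoice.txt"))
```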

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
  U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings

Case Study: Social Networking: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking: Best-fit mappings
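The case study can be modeled in a few lines. Python's codecs do not perform best-fit mapping, so the sketch uses a one-entry toy table (`BEST_FIT`) to stand in for a platform conversion. The table, `filter_ok`, and `best_fit` are all assumptions for illustration.

```python
BEST_FIT = {"\uff4d": "m"}   # toy table: U+FF4D FULLWIDTH LATIN SMALL LETTER M

def filter_ok(css: str) -> bool:
    """Hypothetical filter: reject the literal '-moz-binding'."""
    return "-moz-binding" not in css

def best_fit(text: str) -> str:
    """Stand-in for a lossy charset conversion that best-fits characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

payload = "-\uff4doz-binding()"
print(filter_ok(payload))             # the filter passes the payload
print(best_fit(payload))              # the conversion rebuilds the keyword
print(filter_ok(best_fit(payload)))   # filtering after conversion catches it
```

A strict conversion fails instead of guessing: in Python, `payload.encode("cp1252")` raises `UnicodeEncodeError`, which is the behavior the guidance above asks for.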

Normalizing strings after validation is dangerous

Impact: Filter evasion, enable code execution

Root Causes: Normalization

NFD: Decompose (canonical)
NFC: Decompose (canonical), then recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), then recompose

Root Causes: Normalization

İ becomes I + ◌̇
  U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
  U+FE64 becomes U+003C

Root Causes: Normalization

toNFKC(“﹤script>”) = “<script>”

Root Causes: Normalization

Normalize strings before validation

NFKC is a first defense against visual spoofing

Root Causes: Guidance for Normalization
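The U+FE64 example maps directly onto Python's `unicodedata.normalize`. In this sketch `validate_after_nfkc` is a hypothetical validator showing the recommended order: normalize first, then check.

```python
import unicodedata

def validate_after_nfkc(text: str) -> bool:
    """Normalize to NFKC first, then reject any '<'."""
    return "<" not in unicodedata.normalize("NFKC", text)

payload = "\ufe64script>"    # starts with U+FE64 SMALL LESS-THAN SIGN

print("<" in payload)                            # a pre-normalization check finds nothing
print(unicodedata.normalize("NFKC", payload))    # compatibility mapping yields '<'
print(unicodedata.normalize("NFC", payload) == payload)  # canonical forms leave it alone
print(validate_after_nfkc(payload))
```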

Non-shortest or overlong UTF-8

Impact: Filter evasion, enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
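The C0 A7 case in Python: `lenient_decode_2byte` is a deliberately flawed toy written for this sketch, showing how a decoder that skips the shortest-form check turns the overlong pair into an apostrophe. Python's own UTF-8 decoder rejects the same bytes.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Flawed on purpose: decodes a 2-byte sequence without rejecting
    the lead bytes C0 and C1, so overlong forms are accepted."""
    b0, b1 = data
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))

overlong = bytes([0xC0, 0xA7])
print(repr(lenient_decode_2byte(overlong)))   # decodes to U+0027

try:
    overlong.decode("utf-8")                  # a conforming decoder refuses
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```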


Page 32: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

March 2009 copy 2009 Chris Weber

bull Normalize with NFKC

bull Homograph and Confusables detection

bull Specifications

ndash IDNA Stringprep

bull Guidance

ndash Unicode Consortium ICANN IETF IANA

wwwcasabasecuritycom

Root CausesGuidance for Visual Spoofing

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

Registries apply the guidance

ndash define the allowed characters per TLD

ndash Collaboration with IANA

Registrars sell the domain names

wwwcasabasecuritycom

Root CausesGuidance for International Domain Names

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks
Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
Latin i: U+0069
Latin í: U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comﾉpathﾉfile.nottrusted.org
Katakana No U+FF89
(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

IDN Visual Spoofing
DEMO
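The split between what the browser displays and what DNS sees can be reproduced with Python's standard library. This is a sketch under two assumptions: the built-in "idna" codec (which implements the older IDNA 2003 rules) stands in for a browser's IDN handling, and NFKC stands in for the mapping step that folds some lookalikes into real syntax characters.

```python
import unicodedata

spoofed = "moz\u00edlla"          # 'í' (U+00ED) in place of 'i'
wire = spoofed.encode("idna")     # ASCII-compatible form sent to DNS
print(wire)

# Which of the slash and dot lookalikes fold to ASCII syntax under NFKC?
for cp in (0xFF0F, 0x2571, 0x3033, 0xFF65, 0xFF89):
    folded = unicodedata.normalize("NFKC", chr(cp))
    print(f"U+{cp:04X} -> " + " ".join(f"U+{ord(c):04X}" for c in folded))
```

Only FULLWIDTH SOLIDUS folds to U+002F here, which is consistent with that case no longer working. The others survive normalization and still resemble syntax characters when rendered.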

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
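Invisible characters like the three listed above can be located before a string is displayed. A minimal sketch, assuming that the general category Cf (format) plus an explicit list is a reasonable first approximation; the helper name is hypothetical and this is not the detection API described in the talk.

```python
import unicodedata

# The three code points named on the slide, checked explicitly because
# their general category has changed between Unicode versions.
INVISIBLES = {0x200B, 0x180E, 0xFEFF}

def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for i, ch in enumerate(text):
        if ord(ch) in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

print(find_invisibles("goo\u200bgle.com"))
```

Reporting the position rather than silently deleting the character matters, since deletion can itself re-form a blocked string.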

Attack Vectors: Visual Spoofing: Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing: Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
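The ordering effect can be checked directly with Python's `unicodedata` module. A small sketch: the helper `cps` is a hypothetical convenience for printing code points, and the two input strings are the sequences from the slide.

```python
import unicodedata

def cps(s):
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

a = "o\u0308\u0311"   # o, diaeresis, inverted breve
b = "o\u0311\u0308"   # o, inverted breve, diaeresis

na = unicodedata.normalize("NFKC", a)
nb = unicodedata.normalize("NFKC", b)
print(cps(a), "->", cps(na))
print(cps(b), "->", cps(nb))

# A repeated mark survives normalization too: only the first macron composes.
doubled = unicodedata.normalize("NFKC", "o\u0304\u0304")
print(cps(doubled))
```

Both marks share the same combining class, so normalization keeps their order and the two strings stay distinct after NFKC even though they can render alike.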

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible, because of security concerns”
– forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
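Explicit directional controls are easy to flag because each one has its own bidirectional class. A minimal sketch, assuming the class names reported by Python's `unicodedata.bidirectional`; the helper name and the sample file name are hypothetical.

```python
import unicodedata

# Bidi classes of the explicit embedding, override, and isolate controls.
EXPLICIT = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def bidi_controls(text):
    """Return (index, bidi class) for every explicit directional control."""
    return [
        (i, unicodedata.bidirectional(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT
    ]

# After U+202E the rest of the name is displayed right-to-left.
sample = "photo\u202egpj.exe"
print(bidi_controls(sample))
```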

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit pInvoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
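The guidance above amounts to failing loudly instead of letting a converter pick a lookalike. As an analogy only (Python's codecs do not perform Windows best-fit mapping, so this is not the .NET or Win32 behavior itself), a strict encode shows the shape of the safe choice; the function name is hypothetical.

```python
def to_legacy(text, codec="cp1252"):
    """Encode to a legacy code page, raising instead of substituting."""
    return text.encode(codec, errors="strict")

for ch in ("\u03c3", "\u2032"):   # Greek sigma, prime
    try:
        to_legacy(ch)
    except UnicodeEncodeError as exc:
        print(f"U+{ord(ch):04X} rejected: {exc.reason}")

# An explicit, harmless replacement is the other acceptable outcome.
print("\u03c3".encode("cp1252", errors="replace"))
```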

Case Study: Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

Case Study: Social Networking: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
U+FE64 becomes U+003C
toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
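The ordering rule can be shown in a few lines. This is a sketch with a deliberately simplistic filter (`is_safe` and `validate` are hypothetical names, not a recommended XSS defense); it only illustrates why the check has to run on the normalized string.

```python
import unicodedata

def is_safe(s):
    """Toy validation: reject anything containing angle brackets."""
    return "<" not in s and ">" not in s

raw = "\ufe64script\ufe65"                     # small less-than / greater-than signs
passed_first = is_safe(raw)                    # the filter sees no '<' or '>'
after = unicodedata.normalize("NFKC", raw)     # what a later stage produces

def validate(s):
    """Normalize first, then validate the result that will actually be used."""
    normalized = unicodedata.normalize("NFKC", s)
    return normalized if is_safe(normalized) else None

print(passed_first, after, validate(raw))
```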

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets 0xC0 0xA7
OS/Framework sees 0x27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
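To see why the bytes C0 A7 reach the database as an apostrophe, compare a decoder that skips the shortest-form check with one that enforces it. A minimal sketch: `naive_decode_2byte` is a hypothetical stand-in for a lenient legacy decoder, and Python's built-in UTF-8 codec plays the strict one.

```python
def naive_decode_2byte(b0, b1):
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)

def strict_decode(data):
    """Validate and decode; raises UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")

overlong = bytes([0xC0, 0xA7])
print(hex(naive_decode_2byte(*overlong)))   # the apostrophe's code point

try:
    strict_decode(overlong)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```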

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF8 U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected: Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>

Root Causes: Handling the Unexpected: Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected: Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion
“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘?’ which can be safe
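The difference between over-consuming, deleting, and replacing can be seen with the byte sequence from the over-consumption slide. A sketch using Python's UTF-8 codec, which replaces only the ill-formed lead byte and keeps the byte after it; the variable names are illustrative.

```python
data = bytes([0x41, 0xC2, 0x3E, 0x41])   # 'A', stray lead byte, '>', 'A'

# Replace only the ill-formed byte with U+FFFD; the '>' is preserved.
replaced = data.decode("utf-8", errors="replace")

# Silently dropping the bad byte changes the string without any marker.
dropped = data.decode("utf-8", errors="ignore")

# Deleting U+FEFF mid-string re-forms a token a filter never saw;
# substituting U+FFFD keeps it broken.
tag = "<scr\ufeffipt>"
deleted = tag.replace("\ufeff", "")
marked = tag.replace("\ufeff", "\ufffd")

print(repr(replaced), repr(dropped), deleted, repr(marked))
```

Failing outright remains the first choice on the slide; U+FFFD is the fallback when the input has to be carried forward.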

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Safari BOM injection for XSS
DEMO

A Closer Look: The BOM
BOM U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
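Both casing problems, injected characters and changed lengths, show up with Python's built-in case mappings. A sketch with hedges: results for ‘İ’ differ between runtimes and locales (Python yields ‘i’ plus a combining dot rather than the plain ‘i’ on the slide), so the filter bypass below uses LATIN SMALL LETTER LONG S, whose uppercase is plain ‘S’.

```python
payload = "\u017fcript"   # long s followed by 'cript'

# A filter that lowercases and looks for "script" finds nothing...
filter_passes = "script" not in payload.lower()
# ...but a later uppercase operation produces the blocked word.
upper = payload.upper()

# Casing can also change the length of a string.
dotted_i = "\u0130"       # LATIN CAPITAL LETTER I WITH DOT ABOVE
sharp_s = "\u00df"        # LATIN SMALL LETTER SHARP S
kelvin = "\u212a"         # KELVIN SIGN

print(filter_passes, upper)
print(len(dotted_i), len(dotted_i.lower()))
print(sharp_s.upper(), kelvin.lower())
```

Applying the same casing operation the consumer will apply, and only then validating, closes this gap.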

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
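Several of the expansion factors in the tables above can be reproduced by measuring encoded sizes before and after the operation. A sketch: `factor` is a hypothetical helper, and the results depend on the Unicode data shipped with the Python version in use.

```python
import unicodedata

def factor(before, after, encoding):
    """Ratio of encoded sizes, in bytes, after versus before an operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))

fdfa = "\ufdfa"
nfkc = unicodedata.normalize("NFKC", fdfa)
print(factor(fdfa, nfkc, "utf-8"), factor(fdfa, nfkc, "utf-16-le"))

greek = "\u1f82"
nfd = unicodedata.normalize("NFD", greek)
print(factor(greek, nfd, "utf-16-le"))

print(factor("\u023a", "\u023a".lower(), "utf-8"))
print(factor("\u0390", "\u0390".upper(), "utf-8"))
```

Sizing an output buffer from the input's length, in characters or in bytes, is therefore unsafe for any of these operations.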

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

Opera White Space Characters
DEMO

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
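The underlying mismatch is that a filter and a parser can disagree about what counts as white space. The sketch below is an analogy rather than the Opera bug itself: it uses EM SPACE (U+2003) because the classification of U+180E varies across Unicode versions, and both the ASCII-only filter and the payload are hypothetical.

```python
import re

ASCII_WS = re.compile(r"[ \t\r\n\f]")

def filter_sees_single_token(value):
    """A filter that only knows ASCII white space."""
    return ASCII_WS.search(value) is None

payload = "href=x\u2003onclick=alert(1)"

# The filter believes this is one attribute value...
single = filter_sees_single_token(payload)
# ...while a consumer that splits on Unicode white space sees two tokens.
tokens = payload.split()

print(single, tokens)
```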

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
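When the header and the meta tag disagree, the same bytes can mean different things to the filter and to the browser. A minimal sketch with two bytes chosen for illustration (they are not from the talk): read as Shift_JIS they are one katakana letter, read as ISO-8859-1 they end in a backslash.

```python
data = bytes([0x83, 0x5C])

as_sjis = data.decode("shift_jis")      # a single katakana letter
as_latin1 = data.decode("iso-8859-1")   # a C1 control followed by a backslash

print(repr(as_sjis), repr(as_latin1))
print("backslash seen under Shift_JIS:", "\\" in as_sjis)
print("backslash seen under ISO-8859-1:", "\\" in as_latin1)
```

A component that escapes or filters backslashes under one charset will miss, or wrongly find, one under the other, which is why the guidance is to force a single encoding and error out when uncertain.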

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Deny-all default seems to be the right concept
A script can cross many blocks. Even with limited script choices, there’s plenty to choose from
Great for domain labels, but sub domain labels still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub domain labels

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting “white space”

– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS: U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data loss

Root Causes: Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8

• Error if uncertain

Root Causes: Guidance for Charset Mismatches
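To see why a header/meta mismatch matters, the same bytes can be decoded under both declared charsets. This is a small sketch in Python using its standard codecs; the two input bytes and the helper name `decode_utf8_or_fail` are chosen here for illustration only.

```python
raw = b"\x82\xa0"   # two bytes of attacker-controlled input

as_latin1 = raw.decode("iso-8859-1")   # read as two separate characters
as_sjis = raw.decode("shift_jis")      # read as a single kana character

def decode_utf8_or_fail(data: bytes) -> str:
    """Force UTF-8 and raise instead of guessing another charset."""
    return data.decode("utf-8", errors="strict")

try:
    decode_utf8_or_fail(raw)
    rejected = False
except UnicodeDecodeError:
    rejected = True   # "error if uncertain"
```

A filter that inspected the ISO-8859-1 reading and a browser that rendered the Shift_JIS reading would disagree about what the page contains, which is the gap an attacker works in.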

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security

Page 34: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

ICANN guidelines v2.0

– Inclusion-based

– Script limitations

– Character limitations

Registries apply the guidance

– define the allowed characters per TLD

– Collaboration with IANA

Registrars sell the domain names

Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0

– Inclusion-based

– Script limitations

– Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.

Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.

• Registrars still allow

– Confusables

– Combining marks

– Single, Whole and Mixed-script

• Registrars can’t control

– Syntax spoofing in subdomain labels

Root Causes: The state of International Domain Names

• Non-Unicode attacks

• Confusables

• Invisibles

• Problematic font-rendering

• Manipulating Combining Marks

• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com

Latin: U+006D

Latin: U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

The Confusables

– Single script

– Mixed script

– Whole script

Attack Vectors: Defining Homographs

www.ɑpple.com

User thinks ‘a’

Really it’s Latin small letter alpha ‘ɑ’

www.looĸout.net

User thinks ‘k’

Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com

User thinks ‘o’

Really it’s Thai digit zero ‘๐’

www.faϲebook.com

User thinks ‘c’

Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com

Really it’s Cherokee letter nah ‘Ꮐ’

Attack Vectors: Mixed-script and The Confusables

www.аЬс.com

User thinks ‘abc’

Really it’s Cyrillic script

www.ігѕ.gov

User thinks ‘irs’

Really it’s Cyrillic script

Attack Vectors: Whole-script and The Confusables
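A first-pass check for the mixed-script case can be sketched in a few lines. This is a rough heuristic in Python, not the detection API described in the talk: the standard library exposes no Script property, so the sketch assumes the first word of each character's Unicode name is a usable stand-in for its script. `rough_scripts` and `is_mixed_script` are names invented here.

```python
import unicodedata

def rough_scripts(label: str) -> set:
    """Approximate the scripts in a label from Unicode character names.

    Heuristic only: takes the first word of each character name
    (LATIN, CYRILLIC, THAI, ...). ASCII digits and hyphens are skipped.
    """
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label: str) -> bool:
    return len(rough_scripts(label)) > 1

mixed = is_mixed_script("g\u0e50\u0e50gle")          # Latin plus Thai digit zero
whole = is_mixed_script("\u0430\u042c\u0441")        # all Cyrillic, not flagged
```

Whole-script and single-script confusables pass this check by construction, so catching them needs confusable-mapping data rather than a script count.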

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted

– Limited characters available

• To unscrutinizing eyes

í looks like i

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin: U+0069

Latin: U+00ED

(This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

╱ U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom

Katakana Dot: U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
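The split between what the browser displays and what DNS receives can be observed with Python's built-in `idna` codec (which implements the older IDNA 2003 rules). A minimal sketch, reusing the mozílla example from the earlier slide:

```python
display_name = "www.moz\u00edlla.org"     # contains U+00ED, not U+0069

wire_name = display_name.encode("idna")   # what is sent to DNS
labels = wire_name.split(b".")

genuine_wire_name = "www.mozilla.org".encode("idna")

print(wire_name)
print(genuine_wire_name)
```

Showing the `xn--` form to the user, or comparing it against the expected ASCII name, makes the substitution visible even when the rendered label does not.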

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API

– Detects Confusables

– Detects Invisibles

– Detects syntax and punctuation lookalikes

– Detects combining mark tricks

• Currently in testing

• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
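One simple defense against these three characters is to make them visible before a human or a filter looks at the string. A small sketch in Python; the set contains only the code points listed on the slide, and `reveal` and `has_invisibles` are illustrative names, not part of any announced API.

```python
# The three invisibles named on the slide; a real list would be longer.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def has_invisibles(text: str) -> bool:
    return any(ch in INVISIBLES for ch in text)

def reveal(text: str) -> str:
    """Replace each listed invisible with a visible [U+XXXX] marker."""
    return "".join(
        "[U+%04X]" % ord(ch) if ch in INVISIBLES else ch for ch in text
    )

shown = reveal("pay\u200bpal")
```

The `[U+XXXX]` notation mirrors the one used in the slides' own examples.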

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly

• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial, Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks

– ȏ and ö under NFKC

<U+006F U+0308 U+0311> → <U+00F6 U+0311>

<U+006F U+0311 U+0308> → <U+020F U+0308>
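Both combining-mark tricks can be reproduced with the standard normalizer. This is a sketch in Python, assuming `unicodedata` for NFKC and NFD; `repeated_marks` is a hypothetical detector written for this example.

```python
import unicodedata

def nfkc(text: str) -> str:
    return unicodedata.normalize("NFKC", text)

# Same base letter, same two marks, different order.
diaeresis_first = nfkc("o\u0308\u0311")
breve_first = nfkc("o\u0311\u0308")

def repeated_marks(text: str) -> bool:
    """True if one combining mark occurs twice in a row after NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )
```

Because both marks share a combining class, normalization does not reorder them, so the two sequences stay distinct as data even where a renderer draws them alike.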

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– “These characters are to be avoided wherever possible because of security concerns”

– forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
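Explicit directional controls are easy to find programmatically even though they are invisible on screen. A minimal sketch in Python using `unicodedata.bidirectional`; the function name and the sample string are made up for illustration.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls.
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF"}

def find_bidi_controls(text: str) -> list:
    """Return (index, code point, bidi class) for each explicit control."""
    return [
        (i, "U+%04X" % ord(ch), unicodedata.bidirectional(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI
    ]

hits = find_bidi_controls("abc\u202etxt.exe")
```

Rejecting or escaping any string with a non-empty result follows the TR9 advice quoted above.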

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C (Left-Pointing Angle Bracket)

• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?

• U+015B maps to U+0073 (Latin Small Letter S With Acute)

• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root Cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Case Study: Social Networking, Best-fit mappings
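The bypass in this case study can be simulated without a Windows code page. The sketch below is Python and makes several assumptions: `BEST_FIT` is a tiny hand-written table holding only pairs mentioned in the slides (real best-fit tables are platform- and code-page-specific), and `blocks` is a hypothetical filter standing in for the site's logic.

```python
# Illustrative subset of best-fit pairs taken from the slides.
BEST_FIT = {
    "\uff4d": "m",   # fullwidth m
    "\u2329": "<",
    "\u3008": "<",
    "\u015b": "s",
    "\u015d": "s",
}

def simulate_best_fit(text: str) -> str:
    """Apply the illustrative table, as a lossy charset conversion might."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def blocks(text: str) -> bool:
    """Hypothetical filter: reject anything containing '-moz-binding'."""
    return "-moz-binding" in text

payload = "-\uff4doz-binding()"
passed_filter = not blocks(payload)
dangerous_after_mapping = blocks(simulate_best_fit(payload))

# A strict conversion fails instead of substituting a lookalike.
try:
    payload.encode("cp1252", errors="strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Python's own codecs do not apply best-fit substitution, so the strict encode at the end raises rather than silently producing `m`; that fail-closed behavior is what the guidance slide asks for on other platforms.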

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇

Root Causes: Normalization

U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

Root Causes: Normalization

U+FE64 → U+003C

﹤ becomes <

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization

U+FE64 → U+003C

Normalize strings before validation

NFKC: first defense against Visual spoofing

Root Causes: Guidance for Normalization
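The ordering problem is easy to demonstrate with a real normalizer. A minimal sketch in Python using `unicodedata.normalize`; `validate` is a deliberately simple hypothetical check (no `<` allowed), not a recommended XSS filter.

```python
import unicodedata

def validate(text: str) -> bool:
    """Hypothetical check: reject any string containing '<'."""
    return "<" not in text

raw = "\ufe64script>"   # U+FE64 SMALL LESS-THAN SIGN, then 'script>'
normalized = unicodedata.normalize("NFKC", raw)

# Wrong order: validate first, normalize afterwards.
validated_then_normalized_is_unsafe = validate(raw) and not validate(normalized)

def normalize_then_validate(text: str):
    """Right order: normalize, then validate what will actually be used."""
    candidate = unicodedata.normalize("NFKC", text)
    return candidate if validate(candidate) else None

decomposed_i = unicodedata.normalize("NFD", "\u0130")
```

The value returned by `normalize_then_validate` is the normalized string itself, so nothing downstream has a reason to normalize again.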

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
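The C0 A7 case can be worked through by hand. The sketch below, in Python, contrasts a deliberately flawed two-byte decoder (written here only to show the arithmetic) with the standard strict UTF-8 decoder, which refuses the sequence.

```python
def naive_two_byte_decode(b1: int, b2: int) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))

def strict_decode(data: bytes) -> str:
    """Validate UTF-8 and throw on error, as the guidance recommends."""
    return data.decode("utf-8", errors="strict")

smuggled = naive_two_byte_decode(0xC0, 0xA7)   # payload bits spell 0x27

try:
    strict_decode(b"\xc0\xa7")
    overlong_rejected = False
except UnicodeDecodeError:
    overlong_rejected = True
```

A component using the naive arithmetic hands an apostrophe to the database even though no byte 0x27 ever appeared in the input the application inspected.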

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF8 U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
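The two normalization-based test cases can be checked against a real normalizer. A small sketch in Python; `canonical_path` and `is_traversal` are hypothetical helpers, and the path strings are invented rather than taken from the slide's URLs.

```python
import unicodedata

def canonical_path(path: str) -> str:
    """Apply NFKC so compatibility lookalikes collapse to ASCII syntax."""
    return unicodedata.normalize("NFKC", path)

def is_traversal(path: str) -> bool:
    """Look for '..' segments after normalization, not before."""
    return ".." in canonical_path(path).split("/")

disguised = "root/\u2025\uff0fsystem"   # two dot leader + fullwidth solidus
collapsed = canonical_path(disguised)

# U+2215 DIVISION SLASH has no compatibility mapping; it only becomes
# '/' through a best-fit conversion, which this check does not model.
division_slash_unchanged = canonical_path("\u2215") == "\u2215"
```

The overlong and best-fit variants on the slide need separate handling (strict UTF-8 validation and avoiding lossy charset conversion), since normalization alone does not touch them.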

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected, Over-consumption
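The byte sequence from the slide shows the difference between an over-consuming decoder and one that replaces only the ill-formed byte. In this Python sketch, `over_consuming_decode` is an intentionally broken decoder written for illustration; the safe path uses the standard decoder with `errors="replace"`.

```python
data = bytes([0x41, 0xC2, 0x3E, 0x41])   # 'A', lone lead byte, '>', 'A'

def over_consuming_decode(buf: bytes) -> str:
    """Flawed decoder: a lead byte swallows the next byte unconditionally."""
    out, i = [], 0
    while i < len(buf):
        if buf[i] >= 0x80:
            i += 2              # drops the lead byte AND the byte after it
        else:
            out.append(chr(buf[i]))
            i += 1
    return "".join(out)

eaten = over_consuming_decode(data)

# The standard decoder substitutes U+FFFD for the bad byte only,
# so the '>' that closes the tag survives.
safe = data.decode("utf-8", errors="replace")
```

In the `<img>` example the swallowed byte is the character that ends the tag, which is how the trailing text turns into a live attribute.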

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected, Character-deletion
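Deleting a character after validation reassembles exactly the token the filter was looking for. A minimal sketch in Python comparing deletion with substitution by U+FFFD; `blocks_script` is a hypothetical filter used only for this illustration.

```python
def blocks_script(text: str) -> bool:
    """Hypothetical filter that looks for a literal '<script>' tag."""
    return "<script>" in text.lower()

def delete_bom(text: str) -> str:
    return text.replace("\ufeff", "")          # the risky behavior

def replace_bom(text: str) -> str:
    return text.replace("\ufeff", "\ufffd")    # substitute instead of delete

payload = "<scr\ufeffipt>"

slipped_past_filter = not blocks_script(payload)
live_after_deletion = blocks_script(delete_bom(payload))
live_after_replacement = blocks_script(replace_bom(payload))
```

With replacement the tag name stays broken, so the string that reaches the parser is no more dangerous than the one that was validated.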

• Fail or error

• Use U+FFFD instead

– A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion


Page 35: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

March 2009 copy 2009 Chris Weber

bull Registrars still allow

ndash Confusables

ndash Combining marks

ndash Single Whole and Mixed-script

bull Registrars canrsquot control

ndash Syntax spoofing in sub domain labels

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

March 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– “These characters are to be avoided wherever possible because of security concerns”

– Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
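A short sketch of how the two override characters might be flagged, assuming the bidi class reported by Python's `unicodedata.bidirectional` is a suitable signal. The file name is an invented example and `find_bidi_overrides` is a hypothetical helper.

```python
import unicodedata

OVERRIDES = {"LRO", "RLO"}  # bidi classes of U+202D and U+202E

def find_bidi_overrides(text):
    """Return (index, 'U+XXXX') for every explicit directional override."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(text)
            if unicodedata.bidirectional(ch) in OVERRIDES]

# Everything after U+202E is laid out right-to-left, so "gpj.exe" can be
# displayed in reverse order and resemble a harmless image extension.
name = "invoice_\u202egpj.exe"
print(find_bidi_overrides(name))  # [(8, 'U+202E')]
```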


Root Causes: Best-fit mappings

Commonly occur in charset transformations and even in innocuous APIs

Impact: filter evasion, enables code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 (Left-Pointing Angle Bracket) maps to U+003C

• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?

• U+015B (Latin Small Letter S With Acute) maps to U+0073

• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end
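The same fail-closed idea can be sketched outside Windows and .NET. Python's codecs do not apply Windows best-fit tables, so this is only an analogy to an exception fallback or WC_NO_BEST_FIT_CHARS: the conversion raises instead of quietly choosing a lookalike. `to_legacy` and `is_rejected` are hypothetical helper names.

```python
def to_legacy(text, charset="cp1252"):
    """Encode to a legacy charset, failing instead of substituting lookalikes."""
    return text.encode(charset, errors="strict")

def is_rejected(text):
    """True if the text cannot be represented exactly in the legacy charset."""
    try:
        to_legacy(text)
        return False
    except UnicodeEncodeError:
        return True

print(is_rejected("\u03c3"))  # GREEK SMALL LETTER SIGMA has no cp1252 byte
print(is_rejected("\u2032"))  # PRIME has no cp1252 byte either
print(to_legacy("abc"))
```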

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: filter evasion, code execution

– Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map to it
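The bypass can be simulated in a few lines. The table below is purely illustrative and contains only mappings named in these slides; a real Windows code page table is far larger, and the filter function is a hypothetical stand-in for the site's actual logic, which the talk does not show.

```python
# Illustrative best-fit table built only from mappings named in the slides.
BEST_FIT = {
    "\uff4d": "m",   # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",   # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",   # LEFT ANGLE BRACKET
    "\u015b": "s",   # LATIN SMALL LETTER S WITH ACUTE
    "\u015d": "s",   # LATIN SMALL LETTER S WITH CIRCUMFLEX
}

def best_fit(text):
    """Simulate a lossy conversion that substitutes lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def naive_filter_allows(text):
    """Hypothetical filter: block the literal string only."""
    return "-moz-binding" not in text

payload = "-\uff4doz-binding()"
print(naive_filter_allows(payload))            # passes the filter
print(best_fit(payload))                       # what a later layer sees
print(naive_filter_allows(best_fit(payload)))  # filtering after conversion catches it
```

The order of operations is the point: the filter has to run on the text as the final consumer will see it.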

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enables code execution

NFD: Decompose (canonical)

NFC: Decompose (canonical), then recompose

NFKD: Decompose (compatibility)

NFKC: Decompose (compatibility), then recompose

İ becomes I + combining dot above

U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC is a first defense against visual spoofing
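The ordering rule can be illustrated with Python's `unicodedata.normalize`. The validator here is a deliberately simple, hypothetical check for `<`; the function names are assumptions made for this sketch.

```python
import unicodedata

def is_safe(text):
    """Hypothetical validator: reject anything containing '<'."""
    return "<" not in text

def validate_then_normalize(text):   # dangerous order
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text):   # order recommended above
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized

def is_rejected(fn, text):
    try:
        fn(text)
        return False
    except ValueError:
        return True

attack = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(attack))               # '<script>' slips through
print(is_rejected(normalize_then_validate, attack))  # True
```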

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enables code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)
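A sketch of why C0 A7 is dangerous: a decoder that skips the shortest-form check turns it into 0x27, while a conforming decoder throws. `naive_decode_2byte` is a deliberately unsafe, hypothetical function written only to show the arithmetic.

```python
def naive_decode_2byte(b):
    """Decode one 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    return chr(((b[0] & 0x1F) << 6) | (b[1] & 0x3F))

def strict_decoder_rejects(data):
    """True if Python's conforming UTF-8 decoder refuses the bytes."""
    try:
        data.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True

overlong_apostrophe = b"\xc0\xa7"
print(repr(naive_decode_2byte(overlong_apostrophe)))  # an apostrophe, U+0027
print(strict_decoder_rejects(overlong_apostrophe))    # True
```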

Attack Vectors: Directory traversal

How many ways can you say “../”?

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF-8, U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
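The normalization-based test cases can be reproduced with `unicodedata`. This sketch uses invented path fragments rather than the URLs on the slide, and shows that the Division Slash is not changed by NFKC, which is consistent with the slide listing it as a best-fit case instead.

```python
import unicodedata

def nfkc(s):
    return unicodedata.normalize("NFKC", s)

fullwidth_solidus = "root\uff0fsystem"    # U+FF0F
two_dot_leader = "root/\u2025/system"     # U+2025
division_slash = "root\u2215system"       # U+2215

print(nfkc(fullwidth_solidus))                  # root/system
print(nfkc(two_dot_leader))                     # root/../system
print(nfkc(division_slash) == division_slash)   # unchanged by NFKC
```

A path check that runs before normalization would see none of the `/` or `..` sequences that appear afterwards.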

Root Causes: Handling the Unexpected

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: filter evasion, enables code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing

– Substituting a ‘ ’ or a ‘ ’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR #36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error

• Use U+FFFD instead

– A common alternative is ‘ ’ which can be safe
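The three behaviours (fail, substitute U+FFFD, delete) can be compared on the ill-formed sequence from the over-consumption slide. This sketch uses Python's UTF-8 decoder, which does not over-consume; the `delete_bom` and `replace_bom` helpers are hypothetical and stand in for whatever component strips or replaces characters.

```python
ill_formed = b"\x41\xc2\x3e\x41"  # 'A', lone lead byte C2, '>', 'A'

def fails_strict(data):
    try:
        data.decode("utf-8")      # errors="strict" is the default
        return False
    except UnicodeDecodeError:
        return True

replaced = ill_formed.decode("utf-8", errors="replace")  # C2 becomes U+FFFD, '>' is kept
dropped = ill_formed.decode("utf-8", errors="ignore")    # C2 silently deleted

def delete_bom(text):     # the risky character-deletion behaviour
    return text.replace("\ufeff", "")

def replace_bom(text):    # substitute U+FFFD instead of deleting
    return text.replace("\ufeff", "\ufffd")

print(fails_strict(ill_formed))
print(repr(replaced), repr(dropped))
print(delete_bom("<scr\ufeffipt>"), replace_bom("<scr\ufeffipt>"))
```

Deletion rebuilds the blocked token, while the replacement character keeps it broken.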

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption

– Attack: filter evasion, code execution

– Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: filter evasion, enables code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET
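Length changes under casing are easy to observe. This sketch uses Python's built-in full case mappings, which differ from the slide's example: Python lowercases U+0130 to i followed by U+0307, not to a plain i, so the exact result is an implementation detail that varies between frameworks.

```python
dotted_capital_i = "\u0130"            # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_capital_i.lower()     # i + COMBINING DOT ABOVE in Python

upper_iota = "\u0390".upper()          # one code point becomes three
sharp_s = "\u00df".upper()             # one code point becomes two

# UTF-8 byte growth for the lowercase sample in the casing table
stroke_a = "\u023a"
bytes_before = len(stroke_a.encode("utf-8"))
bytes_after = len(stroke_a.lower().encode("utf-8"))

print(len(dotted_capital_i), len(lowered))
print(len(upper_iota), sharp_s)
print(bytes_before, bytes_after)
```

Buffers sized from the input length, or validation done before casing, will both be wrong for inputs like these.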

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)

• Improper width calculations

• Impact: enables code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report #36

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report #36
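Several of the normalization factors can be reproduced directly. This is a sketch using Python's `unicodedata`; `expansion` is a hypothetical helper, and the `-le` codec variants are used so that no byte order mark is counted.

```python
import unicodedata

def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

print(expansion("\ufdfa", "NFKC", "utf-8"))      # 3 bytes grow to 33
print(expansion("\ufdfa", "NFKC", "utf-16-le"))  # 1 code unit grows to 18
print(expansion("\u1f82", "NFD", "utf-16-le"))   # 1 code unit grows to 4
print(expansion("\u0390", "NFD", "utf-8"))       # 2 bytes grow to 6
```

Sizing an output buffer from the worst-case factor for the chosen form and encoding avoids the overflow described above.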

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: filter evasion, enables code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS

– Damage: filter evasion, controlling syntax

– Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root cause: interpreting “white space”

– A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications

• Be careful…
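One cautious reading of this guidance is to allow only a small, explicit set of whitespace in markup and flag everything else that a parser might treat as a separator. The allowlist and category set below are assumptions made for this sketch, not a rule from any specification, and `find_suspicious` is a hypothetical helper.

```python
import unicodedata

# Assumption: an ASCII-only whitespace policy for attribute separators.
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}
# Separator, format, and control categories that a lenient parser might
# interpret as white space.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def find_suspicious(markup):
    """Return (index, 'U+XXXX') for separator-like characters outside the allowlist."""
    hits = []
    for i, ch in enumerate(markup):
        if ch in ALLOWED_WHITESPACE:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

print(find_suspicious("<a href=\u180eonclick=alert()>"))  # [(8, 'U+180E')]
```

Checking both the separator and format categories keeps the result the same whichever category a given Unicode version assigns to U+180E.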

Root Causes: Specifications

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: filter evasion, enables code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: filter evasion, enables code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8

• Error if uncertain
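"Error if uncertain" can be sketched as a check that the header and the meta element agree before any decoding happens. The regular expressions and helper names here are simplifying assumptions (a real parser would follow the HTML encoding-sniffing rules), and the header string is an invented example.

```python
import re

def header_charset(content_type):
    m = re.search(r"charset=([\w-]+)", content_type, re.IGNORECASE)
    return m.group(1).lower() if m else None

def meta_charset(html_bytes):
    m = re.search(rb"""charset=["']?([\w-]+)""", html_bytes[:1024], re.IGNORECASE)
    return m.group(1).decode("ascii").lower() if m else None

def pick_charset(content_type, html_bytes):
    """Return the single declared charset, or raise if missing or contradictory."""
    declared = {c for c in (header_charset(content_type), meta_charset(html_bytes)) if c}
    if len(declared) != 1:
        raise ValueError(f"uncertain charset: {sorted(declared)}")
    return declared.pop()

header = "text/html; charset=ISO-8859-1"
page = b'<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
try:
    pick_charset(header, page)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

# The same two bytes read differently under each declared charset.
ambiguous = b"\x82\xa0"
as_sjis = ambiguous.decode("shift_jis")      # one character, U+3042
as_latin1 = ambiguous.decode("iso-8859-1")   # two characters, U+0082 U+00A0
print(mismatch_detected, len(as_sjis), len(as_latin1))
```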

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Tools

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Root Causes: The state of International Domain Names

• Registrars still allow

– Confusables

– Combining marks

– Single, Whole, and Mixed-script

• Registrars can’t control

– Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks

• Confusables

• Invisibles

• Problematic font-rendering

• Manipulating Combining Marks

• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

m: Latin U+006D

rn: Latin U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables

– Single script

– Mixed script

– Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com

User thinks ‘a’

Really it’s Latin small letter alpha ‘ɑ’

www.looĸout.net

User thinks ‘k’

Really it’s Latin small letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com

User thinks ‘o’

Really it’s Thai digit zero ‘๐’

www.faϲebook.com

User thinks ‘c’

Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com

Really it’s Cherokee letter nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables

www.аЬс.com

User thinks ‘abc’

Really it’s Cyrillic script

www.ігѕ.gov

User thinks ‘irs’

Really it’s Cyrillic script
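Mixed-script labels are the easiest of the three classes to flag mechanically. Python's standard library does not expose the Unicode Script property, so this sketch falls back on a rough heuristic, an assumption of this example: the first word of each character's Unicode name. `scripts_used` is a hypothetical helper.

```python
import unicodedata

def scripts_used(label):
    """Rough heuristic: first word of each letter's or digit's Unicode name."""
    scripts = set()
    for ch in label:
        if not ch.isalnum():
            continue
        name = unicodedata.name(ch, "")
        first = name.split(" ")[0] if name else ""
        if first and first != "DIGIT":   # ASCII digits are named "DIGIT ..."
            scripts.add(first)
    return scripts

print(scripts_used("fa\u03f2ebook"))      # Latin plus Greek lunate sigma
print(scripts_used("g\u0e50\u0e50gle"))   # Latin plus Thai digit zero
print(scripts_used("\u0430\u042c\u0441")) # whole-script: Cyrillic only
print(scripts_used("\u0251pple"))         # single-script: Latin only
```

The last two results show the limit of the approach: whole-script and single-script confusables use one script throughout, so they need a confusables table instead.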

Attack Vectors: IDN homograph attacks

Browsers whitelist ORG

Others don’t necessarily, but…

• ORG is whitelisted

– Limited characters available

• To unscrutinizing eyes

í looks like i

www.mozilla.org is not www.mozílla.org

i: Latin U+0069

í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(This case doesn’t work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with and lookalikes

╱ U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

(However punctuation not required…)

http://www･google･com

Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes

(However punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89
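A denylist of the syntax lookalikes named on these slides gives a simple first check for host names. The table is a subset chosen for this sketch (only the code points whose role as a slash or dot lookalike is clear from the slides), and `find_syntax_lookalikes` is a hypothetical helper; it is not the Visual Spoofing Detection API described in the talk.

```python
# Subset of the code points listed on the slides, with the ASCII
# character each one is assumed to imitate.
SYNTAX_LOOKALIKES = {
    "\uff0f": "/",   # FULLWIDTH SOLIDUS
    "\u2571": "/",   # box drawings diagonal
    "\u3033": "/",   # kana repeat mark
    "\uff89": "/",   # Katakana No
    "\uff65": ".",   # Katakana middle dot
}

def find_syntax_lookalikes(host):
    """Return (index, 'U+XXXX') for each character that imitates URL syntax."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(host)
            if ch in SYNTAX_LOOKALIKES]

spoofed = "www.google.com\uff89path\uff89file.nottrusted.org"
print(find_syntax_lookalikes(spoofed))  # [(14, 'U+FF89'), (19, 'U+FF89')]
```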


Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
"rn" can look like "m" in certain fonts.
www.mullets.com is not www.rnullets.com
  m: Latin U+006D
  rn: Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
  0 and O
  1 and l
  5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
  User thinks 'a'. Really it's Latin small letter alpha 'ɑ'.
www.looĸout.net
  User thinks 'k'. Really it's Latin letter kra 'ĸ'.

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
  User thinks 'o'. Really it's Thai digit zero '๐'.
www.faϲebook.com
  User thinks 'c'. Really it's Greek lunate sigma symbol 'ϲ'.
www.Ꮐoogle.com
  Really it's Cherokee letter Nah 'Ꮐ'.
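
A rough sketch of how a mixed-script label could be flagged. Python's standard library does not expose the Unicode Script property, so this approximates a character's script from the first word of its character name; that shortcut and the helper names `scripts_in` and `is_mixed_script` are assumptions made for illustration, not the detection API described in this talk.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        # ASCII digits, hyphens and dots are treated as script-neutral.
        if ch.isascii() and not ch.isalpha():
            continue
        # e.g. "GREEK LUNATE SIGMA SYMBOL" -> "GREEK" (heuristic, not the real Script property)
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """True when more than one approximate script appears in the label."""
    return len(scripts_in(label)) > 1

print(scripts_in("fa\u03f2ebook"))        # Latin letters plus Greek lunate sigma
print(is_mixed_script("g\u0e50\u0e50gle"))  # Latin letters plus Thai digit zero
print(is_mixed_script("\u0251pple"))        # single-script: Latin alpha is still Latin
```

A check like this catches the mixed-script cases only; single-script and whole-script confusables pass it and need a confusables table instead.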

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
  User thinks 'abc'. Really it's Cyrillic script.
www.ігѕ.gov
  User thinks 'irs'. Really it's Cyrillic script.

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
  i: Latin U+0069
  í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
  FULLWIDTH SOLIDUS U+FF0F
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
  SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
  ╱ U+2571 Box Drawings
  〳 U+3033 Kana Repeat Mark
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
httpwwwgooglecom
  Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
  Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN.
DNS sees Punycode.
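
To make the display-versus-wire split concrete, here is a small sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The host name is the lookalike from the earlier slide; the exact Punycode output is deliberately not asserted here.

```python
# What the user sees: a host name containing U+00ED (i with acute).
display_host = "www.moz\u00edlla.org"

# What goes to DNS: each non-ASCII label becomes an ASCII "xn--" label.
wire_host = display_host.encode("idna")
labels = wire_host.split(b".")

print(wire_host)                 # pure ASCII bytes
print(wire_host.decode("idna"))  # round-trips back to the displayed form
```

Showing the `xn--` form to the user whenever a label fails a spoofing check is one common mitigation.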

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
  U+200B (ZERO WIDTH SPACE)
  U+180E (MONGOLIAN VOWEL SEPARATOR)
  U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō (U+006F U+0304) looks like ō̄ (U+006F U+0304 U+0304)

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
  <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  <U+006F U+0311 U+0308> → <U+020F U+0308>
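
The two orderings above can be checked with Python's `unicodedata` module. This is only a verification sketch; the variable names are arbitrary.

```python
import unicodedata

# Same base letter and same two marks, in opposite order.
diaeresis_first = "o\u0308\u0311"
breve_first = "o\u0311\u0308"

nfkc_a = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_b = unicodedata.normalize("NFKC", breve_first)

# Both marks share the same combining class, so normalization does not
# reorder them, and the two strings stay distinct after NFKC.
print([f"U+{ord(c):04X}" for c in nfkc_a])
print([f"U+{ord(c):04X}" for c in nfkc_b])
```

Because the results differ as code point sequences while rendering almost identically, normalization alone does not remove this kind of lookalike.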

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible because of security concerns"
  – forbidden in IDNA
  U+202D (LEFT-TO-RIGHT OVERRIDE)
  U+202E (RIGHT-TO-LEFT OVERRIDE)
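
A minimal sketch of flagging the two explicit overrides named above before text is displayed. The function name `find_bidi_overrides` and the sample string are hypothetical; a fuller check would also consider the other bidi control characters.

```python
import unicodedata

# The two explicit directional overrides from the slide.
BIDI_OVERRIDES = {"\u202d", "\u202e"}

def find_bidi_overrides(text):
    """Return (index, character name) for every override character found."""
    return [(i, unicodedata.name(ch))
            for i, ch in enumerate(text)
            if ch in BIDI_OVERRIDES]

# Hypothetical sample: the override makes the tail of the string render reversed.
sample = "evil\u202etxt.exe"
print(find_bidi_overrides(sample))
```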

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Best-fit mappings: Windows best-fit, P/Invoke
The .NET runtime will marshal a string as LPStr to a P/Invoke function.
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify a LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
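
The guidance above is Windows and .NET specific. As a loose analogue in Python, the sketch below refuses any lossy conversion to a legacy charset by encoding with `errors="strict"`, which plays a role similar to an exception fallback. The function name `to_legacy` and the default charset are assumptions for illustration; Python's own codecs are not the Windows best-fit tables.

```python
def to_legacy(text, charset="cp1252"):
    """Encode to a legacy charset, raising instead of substituting lookalikes."""
    return text.encode(charset, errors="strict")

# Plain ASCII converts without loss.
print(to_legacy("-moz-binding"))

# U+FF4D (fullwidth m) has no cp1252 mapping, so the conversion is rejected
# rather than silently becoming an ASCII "m".
try:
    to_legacy("-\uff4doz-binding")
    rejected = False
except UnicodeEncodeError:
    rejected = True
print(rejected)
```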

Case Study: Social Networking, Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD  - Decompose (canonical)
NFC  - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
  U+0130 → U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
  U+FE64 → U+003C
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against Visual spoofing
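
The ordering problem can be sketched in a few lines of Python. The validator `looks_dangerous` is a toy stand-in for a real filter and is purely hypothetical; only `unicodedata.normalize` is a real library call.

```python
import unicodedata

def looks_dangerous(text):
    """Toy validator (hypothetical): flags an opening script tag."""
    return "<script" in text.lower()

# U+FE64 SMALL LESS-THAN SIGN in place of "<".
payload = "\ufe64script>"

# Unsafe order: validate first, normalize afterwards.
passed_filter = not looks_dangerous(payload)
after_nfkc = unicodedata.normalize("NFKC", payload)

# Safe order: normalize first, then validate the normalized string.
caught = looks_dangerous(unicodedata.normalize("NFKC", payload))

print(passed_filter, after_nfkc, caught)
```

Whatever form is validated should be the exact form that is stored and later emitted.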

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
  Application gets C0 A7
  OS/Framework sees 27
  Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
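
A sketch contrasting a deliberately lax two-byte decoder with a validating one. `lax_decode_2byte` is a hypothetical helper written only to show the failure mode; the strict path is Python's built-in UTF-8 decoder, which rejects non-shortest forms.

```python
def lax_decode_2byte(pair):
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (illustration only)."""
    return chr(((pair[0] & 0x1F) << 6) | (pair[1] & 0x3F))

overlong = b"\xc0\xa7"

# The lax decoder turns C0 A7 into an apostrophe (U+0027).
lax_result = lax_decode_2byte(overlong)

# A validating decoder throws on the same bytes.
try:
    overlong.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True

print(repr(lax_result), strict_rejected)
```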

Attack Vectors: Directory traversal
How many ways can you say ../

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 of U+002E: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
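
The byte sequence from the slide can be run through two decoders to show the difference. `over_consuming_decode` is a hypothetical, intentionally buggy decoder written for this illustration; the comparison uses Python's built-in decoder with `errors="replace"`.

```python
def over_consuming_decode(data):
    """Buggy decoder (illustration only): a lead byte always swallows the next byte."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            nxt = data[i + 1]
            if 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG: the following byte is consumed even when it is not a continuation byte.
            i += 2
        else:
            i += 1  # anything else is dropped in this sketch
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])

buggy = over_consuming_decode(data)               # the ">" (0x3E) disappears
careful = data.decode("utf-8", errors="replace")  # only the bad lead byte is replaced

print(repr(buggy), repr(careful))
```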

Root Causes: Handling the Unexpected, Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe
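
A small sketch of why deletion is risky and what "fail or error" might look like. Both helper names are hypothetical, and using the general category `Cf` (format characters) as the test is an assumption made for brevity.

```python
import unicodedata

def strip_format_chars(text):
    """What a lax component might do: silently delete format characters."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def has_format_chars(text):
    """Safer alternative: detect them so the caller can reject the input."""
    return any(unicodedata.category(ch) == "Cf" for ch in text)

payload = "<scr\ufeffipt>"

missed_by_filter = "<script" not in payload  # a naive filter sees nothing wrong
after_deletion = strip_format_chars(payload)  # deletion reassembles the tag
should_reject = has_format_chars(payload)

print(missed_by_filter, after_deletion, should_reject)
```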

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
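
Casing results vary by implementation, so the sketch below shows what Python's built-in `str.lower()` and `str.upper()` do. Note that Python keeps a combining dot when lowercasing İ, unlike the toLower() behavior shown on the slide. The blocked word `onclick` is just a hypothetical filter target.

```python
# Length can change: U+0130 lowercases to "i" plus U+0307 in Python.
dotted_i_lower = "\u0130".lower()

# Uppercasing U+0390 expands one code point into three.
expanded = "\u0390".upper()

# Casing can produce ASCII from non-ASCII: U+212A KELVIN SIGN lowercases to "k".
payload = "onclic\u212a"
missed_before_casing = "onclick" not in payload
found_after_casing = "onclick" in payload.lower()

print(len(dotted_i_lower), len(expanded), missed_before_casing, found_after_casing)
```

Validating the string after the casing operation, as the guidance says, closes the gap in the last example.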

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
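
The worst-case normalization row can be reproduced with a short Python check. The helper name `expansion` is hypothetical; `utf-16-le` is used so that no byte order mark is counted.

```python
import unicodedata

def expansion(text, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before

sample = "\ufdfa"  # single code point with a long compatibility decomposition

code_points_after = len(unicodedata.normalize("NFKC", sample))
utf8_factor = expansion(sample, "NFKC", "utf-8")
utf16_factor = expansion(sample, "NFKC", "utf-16-le")

print(code_points_after, utf8_factor, utf16_factor)
```

Sizing an output buffer from the input length, in either characters or bytes, is therefore unsafe for these operations.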

Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>
  MVS: U+180E

DEMO: Opera White Space Characters

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher, Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com

Page 38: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

March 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
- ICU, .NET
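A minimal Python sketch of why the order of casing and validation matters. Python's own `str.lower()` maps İ to "i" plus U+0307 rather than to a bare "i", so this example uses U+017F (LATIN SMALL LETTER LONG S), whose uppercase is "S", to show the same class of problem. The function names and the blocked word are illustrative assumptions.

```python
BLOCKED = "script"

def validate_then_case(value: str) -> str:
    """Risky order: the check runs before the case mapping."""
    if BLOCKED in value.lower():
        raise ValueError("blocked keyword")
    return value.upper()

def case_then_validate(value: str) -> str:
    """Safer order: map case first, then check what will actually be used."""
    mapped = value.upper()
    if BLOCKED.upper() in mapped:
        raise ValueError("blocked keyword")
    return mapped

payload = "<\u017fcript>"           # long s instead of "s"
print(validate_then_case(payload))  # the filter is bypassed

# Case mapping can also change the length of a string.
print(len("\u0130"), len("\u0130".lower()))  # İ grows when lowercased
print(len("\u0390"), len("\u0390".upper()))  # ΐ grows when uppercased
```

The length change is the reason a buffer sized from the input cannot be assumed to hold the case-mapped output.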

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
- ICU, .NET
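The expansion factors can be checked with Python's standard `unicodedata` module. This is a sketch for measuring growth, assuming code points stand in for UTF-32 units; the `expansion` helper is a hypothetical name.

```python
import unicodedata

def expansion(sample: str, transform, encoding=None) -> float:
    """Ratio of output size to input size, in code points or encoded bytes."""
    result = transform(sample)
    if encoding is None:
        return len(result) / len(sample)
    return len(result.encode(encoding)) / len(sample.encode(encoding))

def nfkc(text: str) -> str:
    return unicodedata.normalize("NFKC", text)

def nfd(text: str) -> str:
    return unicodedata.normalize("NFD", text)

# U+FDFA under NFKC: growth in code points and in UTF-8 bytes
print(expansion("\ufdfa", nfkc), expansion("\ufdfa", nfkc, "utf-8"))
# U+1F82 under NFD, U+0390 uppercased, U+023A lowercased (UTF-8)
print(expansion("\u1f82", nfd))
print(expansion("\u0390", str.upper))
print(expansion("\u023a", str.lower, "utf-8"))
```

A buffer for transformed text should be sized from the worst-case factor for the operation and encoding in use, not from the input length.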

Root Causes: Controlling Syntax

• White space and line breaks
- E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
- Damage: Filter evasion, controlling syntax
- Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
- Root Cause: Interpreting “white space”
- A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
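One way to act on this guidance is to flag any separator, format, or control character that is not on an explicit allowlist before markup reaches a parser. The sketch below is an assumption-laden illustration: the allowlist and the `suspicious_chars` name are hypothetical, and the general category of U+180E differs between Unicode versions, which is why both Zs and Cf are checked.

```python
import unicodedata

# Separators this sketch accepts as ordinary white space (an assumption).
ALLOWED = {" ", "\t", "\n", "\r", "\f"}

# Zs/Zl/Zp = separators, Cf = format characters, Cc = control characters
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_chars(text: str):
    """Return (index, code point) for characters a parser might treat as syntax."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch not in ALLOWED and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(suspicious_chars("<a href=\u180eonclick=alert()>"))
```

Input that produces a non-empty list can be rejected outright instead of guessing how a given browser will tokenize it.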

Root Causes: Specifications

1) Character stability
- IDNA/Nameprep based on Unicode 3.2

2) Designs
- Specs are carefully designed but not always perfect

• This could have been a problem
- “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
- HTML 4.01
  • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
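A short Python sketch of why a charset mismatch matters: the same two bytes mean different things under Shift_JIS and ISO-8859-1, so a filter and a renderer that disagree on the charset also disagree on where a backslash is. The byte pair and the `decode_strict_utf8` helper are illustrative choices, not taken from the slides.

```python
raw = b"\x81\x5c"  # a valid two-byte Shift_JIS sequence whose trail byte is 0x5C

as_shift_jis = raw.decode("shift_jis")  # one character, no backslash
as_latin1 = raw.decode("iso-8859-1")    # two characters, the second is a backslash

print(len(as_shift_jis), "\\" in as_shift_jis)
print(len(as_latin1), "\\" in as_latin1)

def decode_strict_utf8(data: bytes) -> str:
    """Force UTF-8 and fail instead of guessing another charset."""
    return data.decode("utf-8", errors="strict")

try:
    decode_strict_utf8(raw)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Forcing one declared encoding and raising on anything that does not decode removes the ambiguity that sniffing introduces.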

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
- Web-app security testing and auditing
• Visual Spoofing Detection API
- Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
- Unicode enables visual-spoofing-maximus
• Solution
- Confusable detection
- Invisibles detection
- Syntax spoof detection
- more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
- Browsers
- Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security


Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs

The Confusables
- Single script
- Mixed script
- Whole script

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’

www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables

www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’

www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’

www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script

www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
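Mixed-script labels can be flagged with a rough heuristic. Python's standard library does not expose the Unicode Script property, so this sketch approximates it from the first word of each character name; that approximation, and the function names, are assumptions for illustration only.

```python
import unicodedata

def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label: str) -> bool:
    return len(scripts_used(label)) > 1

print(scripts_used("fa\u03f2ebook"))        # Latin letters plus a Greek lunate sigma
print(scripts_used("\u0430\u042c\u0441"))   # Cyrillic letters that resemble "abc"
print(scripts_used("\u0251pple"))           # Latin alpha in an otherwise Latin label
```

The check only catches the mixed-script case. Whole-script and single-script confusables use one script throughout, so they pass and need a confusables table instead.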

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

• .ORG is whitelisted
- Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org

Latin i: U+0069
Latin í: U+00ED

Attack Vectors: IDN Syntax Spoofing with “/” lookalikes

(This case doesn’t work anymore)

http://www.google.com[U+FF0F]path[U+FF0F]file.nottrusted.org

FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

Attack Vectors: IDN Syntax Spoofing with “/” and “.” lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with “.” (full stop) lookalikes

(However, punctuation not required…)

http://www･google･com

Katakana Dot: U+FF65

Attack Vectors: IDN Syntax Spoofing with “/” lookalikes

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No: U+FF89

Browser sees and displays a valid IDN

DNS sees Punycode
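The gap between what the user sees and what DNS receives can be shown with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The `to_dns_form` helper is a hypothetical name used only for this sketch.

```python
def to_dns_form(hostname: str) -> str:
    """Convert a displayed hostname to the ASCII form sent to DNS."""
    return hostname.encode("idna").decode("ascii")

displayed = "www.moz\u00edlla.org"   # contains U+00ED, not U+0069
print(displayed)
print(to_dns_form(displayed))        # the spoofed label becomes an xn-- label
print(to_dns_form("www.mozilla.org"))
```

Showing the ASCII form next to the displayed name, or comparing the two, makes a lookalike label visible because its DNS form starts with "xn--".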

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes there is one)

• Visual Spoofing Detection API
- Detects confusables
- Detects invisibles
- Detects syntax and punctuation lookalikes
- Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

What looks like ō (U+006F U+0304) is actually U+006F U+0304 U+0304

• Order of combining marks
- ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
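Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The `has_repeated_marks` helper is a hypothetical name, and treating any immediately repeated combining mark as suspicious is an assumption of this sketch.

```python
import unicodedata

# Two marks of the same combining class are not reordered by normalization,
# so the two sequences stay distinct even after NFKC.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print([f"U+{ord(ch):04X}" for ch in first])
print([f"U+{ord(ch):04X}" for ch in second])

def has_repeated_marks(text: str) -> bool:
    """True when the same combining mark appears twice in a row."""
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(text, text[1:])
    )

print(has_repeated_marks("o\u0304\u0304"))  # doubled macron
print(has_repeated_marks("o\u0304"))        # single macron
```

Normalization alone therefore does not make visually identical strings compare equal; a separate check for stacked or repeated marks is still needed.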

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
- “These characters are to be avoided wherever possible, because of security concerns”
- Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
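Explicit directional controls can be located through the bidirectional class that Python's `unicodedata` module reports. The `find_bidi_controls` name and the sample file name are illustrative assumptions.

```python
import unicodedata

# Bidi classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def find_bidi_controls(text: str):
    """Return (index, code point) for every explicit bidi control in text."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES
    ]

# U+202E makes the characters after it render right to left.
print(find_bidi_controls("photo\u202egpj.exe"))
print(find_bidi_controls("photo.jpg"))
```

Identifiers such as host names and file names have no legitimate need for these controls, so rejecting any string with a non-empty result is a reasonable default.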

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?

• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:

[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
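The difference between a best-fit conversion and a fail-closed one can be sketched in Python. Python's own codecs do not apply best-fit mappings, so `simulate_best_fit` below is a hypothetical stand-in that uses only the four mappings listed on the slide; `to_legacy_bytes` shows the strict behaviour that the guidance asks for.

```python
# Mappings quoted from the slide; a real Windows best-fit table is much larger.
BEST_FIT = {"\u2329": "<", "\u3008": "<", "\u015b": "s", "\u015d": "s"}

def simulate_best_fit(text: str) -> str:
    """Hypothetical lossy conversion that swaps in lookalike ASCII characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def to_legacy_bytes(text: str, charset: str = "cp1252") -> bytes:
    """Fail-closed conversion: raise instead of substituting a lookalike."""
    return text.encode(charset, errors="strict")

payload = "\u3008\u015bcript>"   # passes a filter that looks for "<script"
print("<script" in payload)
print(simulate_best_fit(payload))

try:
    to_legacy_bytes(payload)
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

Raising on unmappable characters is the Python counterpart of using an exception fallback or disabling best-fit in the platform API.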

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
- Attack: Filter evasion, code execution
- Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
- Root Cause: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
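A minimal Python sketch of the ordering problem, using the standard `unicodedata.normalize` function. The two function names and the simple substring check are assumptions standing in for a real validation routine.

```python
import unicodedata

def validate_then_normalize(value: str) -> str:
    """Risky order: the filter never sees the characters NFKC will produce."""
    if "<script" in value:
        raise ValueError("blocked markup")
    return unicodedata.normalize("NFKC", value)

def normalize_then_validate(value: str) -> str:
    """Safer order: validate the string in the form that will be used."""
    normalized = unicodedata.normalize("NFKC", value)
    if "<script" in normalized:
        raise ValueError("blocked markup")
    return normalized

payload = "\ufe64script\ufe65"   # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print(validate_then_normalize(payload))

try:
    normalize_then_validate(payload)
except ValueError as exc:
    print("rejected:", exc)
```

The same input is accepted or rejected depending only on whether normalization happens before or after the check.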

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
- Generation of non-shortest form
- Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
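The C0 A7 case can be worked through in Python. `naive_decode_two_bytes` is a hypothetical decoder that skips the shortest-form check, written only to show how the overlong pair collapses to 0x27; Python's built-in UTF-8 decoder rejects the same bytes.

```python
def naive_decode_two_bytes(lead: int, trail: int) -> str:
    """Decode a two-byte UTF-8 sequence WITHOUT the shortest-form check."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

overlong = bytes([0xC0, 0xA7])

# A lenient decoder turns the overlong pair into an apostrophe.
print(repr(naive_decode_two_bytes(overlong[0], overlong[1])))

# A conforming decoder throws, which is the behaviour the guidance asks for.
try:
    overlong.decode("utf-8")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Any component that checks the raw bytes for 0x27 sees nothing wrong, while a lenient decoder further down the chain hands a quote character to the database.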

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
- httpsiterootsystem
- Overlong UTF-8, U+002F: httpsiterootC0AEsystem
- Full-width solidus, U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
- Division slash, U+2215, best-fit: httpsiteroot E28895system
- Two dot leader, U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected

• Unassigned code points
- U+2073
• Illegal code points
- Half a surrogate pair
• Code points with special meaning
- U+FEFF is the BOM
• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
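Over-consumption can be reproduced with a deliberately flawed decoder. `over_consuming_decode` below is a hypothetical sketch, not any real library's behaviour: it lets a two-byte lead byte swallow whatever follows, while Python's built-in decoder leaves the following byte intact.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed sketch: a two-byte lead always consumes the next byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # Ill-formed pair: both bytes are dropped, including the innocent one.
            i += 2
        else:
            out.append(chr(byte))  # single bytes passed through for the sketch
            i += 1
    return "".join(out)

data = bytes.fromhex("41C23E41")   # "A", stray lead byte, ">", "A"
print(over_consuming_decode(data))                    # the ">" disappears
print(ascii(data.decode("utf-8", errors="replace")))  # the ">" survives
```

Replacing only the ill-formed byte with U+FFFD keeps the delimiter that follows it, so an attacker cannot use a stray lead byte to eat a quote or an angle bracket.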


Page 40: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

March 2009 copy 2009 Chris Weber

The Confusables

ndash Single script

ndash Mixed script

ndash Whole script

wwwcasabasecuritycom

Attack VectorsDefining Homographs

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings
-moz-binding() was not allowed, but…
-[U+FF4D]oz-binding() would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD: Decompose (canonical)
NFC: Decompose (canonical), Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), Recompose

Root Causes: Normalization
İ (U+0130) becomes I + combining dot above (U+0049 U+0307)
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
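A small sketch of why the order matters, using Python's standard `unicodedata.normalize`. The filter `is_safe` is a deliberately naive stand-in and both wrapper functions are hypothetical names introduced for illustration.

```python
import unicodedata

def is_safe(text):
    """Toy filter (assumption): reject anything containing angle brackets."""
    return "<" not in text and ">" not in text

def validate_then_normalize(text):
    """Unsafe order: the filter sees the pre-normalization form."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text):
    """Safe order: the filter sees exactly what is used afterwards."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized

payload = "\uFE64script\uFE65"  # SMALL LESS-THAN SIGN, SMALL GREATER-THAN SIGN
print(validate_then_normalize(payload))  # the filter is bypassed
try:
    normalize_then_validate(payload)
except ValueError:
    print("rejected after normalization")
```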

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
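To illustrate the C0 A7 case, the sketch below contrasts a naive two-byte decoder (written here only to show the flaw, and not taken from any real library) with Python's strict UTF-8 decoder, which rejects the overlong form.

```python
def naive_two_byte_decode(raw):
    """Flawed decoder (illustration only): no shortest-form check."""
    lead, trail = raw[0], raw[1]
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

def decode_strict(raw):
    """Validate UTF-8 and throw on error, as the guidance recommends."""
    return raw.decode("utf-8", errors="strict")

overlong = b"\xc0\xa7"
print(repr(naive_two_byte_decode(overlong)))  # an apostrophe slips through
try:
    decode_strict(overlong)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```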

Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8, U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
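The code points in these test cases can be checked directly. This sketch, with the hypothetical helper `describe`, prints each character's UTF-8 bytes and its NFKC result using only the standard library; it says nothing about how a particular server handles them.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

def describe(ch):
    """Return (UTF-8 bytes as hex, NFKC-normalized form) for one character."""
    return ch.encode("utf-8").hex(), unicodedata.normalize("NFKC", ch)

for name, ch in CANDIDATES.items():
    utf8_hex, nfkc = describe(ch)
    print(name, utf8_hex, repr(nfkc))
```

U+FF0F and U+2025 turn into ASCII syntax under NFKC, while U+2215 is unchanged by normalization, which matches the slide listing it under best-fit instead.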

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected (Character-substitution)
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (Character-deletion)
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘?’, which can be safe
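As a hedged illustration of the U+FFFD option, Python's UTF-8 decoder with `errors="replace"` substitutes only the ill-formed byte and leaves the following byte intact, so nothing is over-consumed. The helper `replace_bom` is a hypothetical example of substituting instead of deleting.

```python
def decode_with_replacement(raw):
    """Replace ill-formed sequences with U+FFFD instead of dropping bytes."""
    return raw.decode("utf-8", errors="replace")

def replace_bom(text):
    """Swap U+FEFF for U+FFFD so removal cannot join 'scr' and 'ipt'."""
    return text.replace("\ufeff", "\ufffd")

# <41 C2 3E 41>: the stray lead byte C2 must not swallow 3E ('>').
print(repr(decode_with_replacement(b"\x41\xc2\x3e\x41")))
print(repr(replace_bom("<scr\ufeffipt>")))
```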

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href=“java[U+FEFF]script:alert(„XSS‟)>
Can be nastier:
<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
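Casing results depend on the implementation and locale. As one data point, the sketch below uses Python's built-in full case mapping, where lowercasing İ yields i plus U+0307 (two code points) instead of a bare i. The helper `lengths` is an illustrative name.

```python
def lengths(text):
    """Return code point counts for (original, lowercased, uppercased)."""
    return len(text), len(text.lower()), len(text.upper())

# Casing can change the length of a string.
print(lengths("\u0130"))   # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(lengths("\u0390"))   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

# Casing can also produce ASCII letters a filter never saw.
print("\u017Fcript".upper())   # LATIN SMALL LETTER LONG S
print("\u212A".lower())        # KELVIN SIGN
```

Running the case conversion first and validating its output avoids both surprises.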

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing: maximum expansion factors

Operation   UTF        Factor   Sample
Lower       8          1.5X     Ⱥ U+023A
Lower       16, 32     1X       A U+0041
Upper       8, 16, 32  3X       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization: maximum expansion factors

Operation   UTF        Factor   Sample
NFC         8          3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32     3X       שּׁ U+FB2C
NFD         8          3X       ΐ U+0390
NFD         16, 32     4X       ᾂ U+1F82
NFKC/NFKD   8          11X      ﷺ U+FDFA
NFKC/NFKD   16, 32     18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
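The expansion factors quoted from Unicode Technical Report 36 can be reproduced with the standard `unicodedata` module. The function `expansion` below is a hypothetical helper that reports growth in code points and in UTF-8 bytes for a single character.

```python
import unicodedata

def expansion(ch, form):
    """Return (code point growth, UTF-8 byte growth) after normalization."""
    out = unicodedata.normalize(form, ch)
    code_points = len(out) / len(ch)
    utf8_bytes = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    return code_points, utf8_bytes

print(expansion("\uFDFA", "NFKC"))  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
print(expansion("\u1F82", "NFD"))
print(expansion("\u0390", "NFD"))
```

Sizing a destination buffer from the input length alone is therefore unsafe whenever normalization runs in between.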

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
MVS U+180E

DEMO: Opera White Space Characters

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
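A short sketch of why a mismatch matters: the same two bytes carry a backslash under ISO-8859-1 but form a single katakana character under Shift_JIS, so a filter and a renderer that disagree on the charset see different syntax. The byte pair and helper names are illustrative choices, and the last function shows the "force UTF-8, error if uncertain" policy with Python's strict decoder.

```python
RAW = b"\x83\x5c"  # attacker-controlled bytes (illustrative)

def decode_as(raw, charset):
    """Decode the same bytes under a declared charset."""
    return raw.decode(charset)

def decode_utf8_or_fail(raw):
    """Force UTF-8 and raise if the bytes are not well-formed."""
    return raw.decode("utf-8", errors="strict")

print(repr(decode_as(RAW, "iso-8859-1")))  # contains a backslash
print(repr(decode_as(RAW, "shift_jis")))   # one katakana letter, no backslash
try:
    decode_utf8_or_fail(RAW)
except UnicodeDecodeError:
    print("not UTF-8: rejected")
```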

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’. Really it’s Latin small letter Alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’. Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’. Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’. Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’. Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’. Really it’s Cyrillic script
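A rough sketch of mixed-script detection for the examples above. Python's standard library does not expose the Unicode Script property, so this approximation takes the first word of each character's Unicode name as its script. That heuristic, and the names `rough_scripts` and `is_mixed_script`, are assumptions for illustration and not how the Visual Spoofing Detection API works.

```python
import unicodedata

def rough_scripts(label):
    """Approximate the set of scripts in a label from character names."""
    scripts = set()
    for ch in label:
        # Skip separators and ASCII digits, which are common to all scripts.
        if ch in ".-" or (ch.isascii() and ch.isdigit()):
            continue
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    return len(rough_scripts(label)) > 1

for label in ("fa\u03f2ebook", "g\u0e50\u0e50gle", "\u0430\u042c\u0441", "\u0251pple"):
    print(label, sorted(rough_scripts(label)), is_mixed_script(label))
```

Note that the single-script (ɑpple) and whole-script (аЬс) cases are not flagged as mixed, which is why confusable tables are needed in addition to script checks.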

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don’t necessarily, but…
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes, í looks like i
www.mozilla.org is not www.mozílla.org
Latin U+0069
Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
http://www･google･com
Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
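A minimal sketch of an invisibles check built from an explicit list. The first three entries are the code points named above; the remaining ones are additions assumed here for illustration, and a production list would be longer.

```python
INVISIBLES = {
    "\u200B",  # ZERO WIDTH SPACE
    "\u180E",  # MONGOLIAN VOWEL SEPARATOR
    "\uFEFF",  # ZERO WIDTH NO-BREAK SPACE (BOM)
    "\u200C",  # ZERO WIDTH NON-JOINER (assumed addition)
    "\u200D",  # ZERO WIDTH JOINER (assumed addition)
    "\u2060",  # WORD JOINER (assumed addition)
}

def find_invisibles(text):
    """Return (index, code point) pairs for listed invisible characters."""
    return [(i, "U+%04X" % ord(ch))
            for i, ch in enumerate(text) if ch in INVISIBLES]

print(find_invisibles("pay\u200Bpal"))
```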

Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
– ō looks like U+006F U+0304
– ō̄ is U+006F U+0304 U+0304
• Order of combining marks
– ȏ and ö under NFKC
– <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
– <U+006F U+0311 U+0308> becomes <U+020F U+0308>
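Both combining mark tricks can be observed with the standard `unicodedata` module. In this sketch, `has_repeated_mark` is a hypothetical check for the same mark applied twice in a row, and the NFC calls show that the two mark orders stay distinct after normalization.

```python
import unicodedata

def has_repeated_mark(text):
    """True if the same combining mark appears twice in a row (after NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(a == b and unicodedata.combining(a) != 0
               for a, b in zip(decomposed, decomposed[1:]))

def nfc(text):
    return unicodedata.normalize("NFC", text)

print(has_repeated_mark("o\u0304\u0304"))  # doubled macron
print(has_repeated_mark("o\u0304"))        # single macron

# The two orders compose to different strings, although they may render alike.
print([hex(ord(c)) for c in nfc("o\u0308\u0311")])
print([hex(ord(c)) for c in nfc("o\u0311\u0308")])
```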

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)


Page 42: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

wwwɑpplecom User thinks lsquoarsquo

Really itrsquos Latin small letter Alpha lsquoɑrsquo

wwwlooĸoutnet

User thinks lsquokrsquo

Really itrsquos Latin letter kra lsquoĸrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

(However, punctuation not required...)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www･google･com

Katakana Dot U+FF65

(However, punctuation not required...)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
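To make the "DNS sees Punycode" point concrete, here is a minimal Python sketch of how a label containing the Katakana lookalike turns into an ASCII `xn--` label. The helper name `to_ace_label` is hypothetical, the label uses U+30CE (the glyph shown in the URL above), and the sketch deliberately skips the Nameprep step that a real IDNA implementation performs first.

```python
def to_ace_label(label: str) -> str:
    """Return the ASCII-compatible form of one hostname label.

    Simplified: real IDNA applies Nameprep (mapping, NFKC, prohibited
    character checks) before Punycode; this sketch only does Punycode.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


# The middle label of www.google.comノpathノfile.nottrusted.org
spoof_label = "com\u30cepath\u30cefile"
ace = to_ace_label(spoof_label)

# The browser may display the Unicode form; the resolver only sees ASCII.
print("displayed:", spoof_label)
print("resolved: ", ace)
```

The registrable domain in this host name is still nottrusted.org; the lookalike only changes what the user believes the path separator is.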

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API

– Detects Confusables

– Detects Invisibles

– Detects syntax and punctuation lookalikes

– Detects combining mark tricks

• Currently in testing

• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
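A rough way to flag these characters in Python is sketched below. The explicit set holds the three code points listed above; the extra check for the Unicode general category Cf (format characters) is an added assumption to catch similar invisibles, and `find_invisibles` is a hypothetical helper name, not part of any detection API mentioned in the talk.

```python
import unicodedata

# The three code points from the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for index, ch in enumerate(text):
        # Assumption: any other format character (category Cf) is also suspicious.
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(find_invisibles("pay\u200bpal"))   # a zero width space hidden mid-word
print(find_invisibles("paypal"))         # nothing to report
```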

Attack Vectors: Visual Spoofing Problematic Font-rendering

• Fonts render glyphs confusingly

• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks

– ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
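The ordering effect can be reproduced with Python's standard `unicodedata` module. This is only an illustration of the two sequences above; `code_points` is a hypothetical formatting helper.

```python
import unicodedata


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Same base letter, same two marks, different order.
diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")

print(code_points(diaeresis_first))
print(code_points(breve_first))
# Both marks have the same combining class, so normalization keeps their
# order and the two results stay distinct even though they may render alike.
print(diaeresis_first == breve_first)
```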

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– "These characters are to be avoided wherever possible, because of security concerns"

– forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
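One possible check for explicit directional controls, sketched in Python: it relies on the bidirectional class reported by the standard `unicodedata` module rather than a hard-coded list. The set of class names and the helper `find_bidi_controls` are choices made for this sketch; the talk itself only names the two override characters.

```python
import unicodedata

# Bidi classes for explicit embeddings, overrides, isolates and their terminators.
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(text):
    """Return (index, bidi class) for every explicit directional control."""
    hits = []
    for index, ch in enumerate(text):
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in EXPLICIT_BIDI:
            hits.append((index, bidi_class))
    return hits


# A file name that displays with its tail reversed because of U+202E.
print(find_bidi_controls("evil\u202etxt.exe"))
print(find_bidi_controls("plain.txt"))
```

A caller would typically reject or escape any string for which the result is non-empty, in line with the UAX #9 advice quoted above.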

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

.NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C: Left-Pointing Angle Bracket

• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?

• U+015B maps to U+0073: Latin Small Letter S With Acute

• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:

[MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit pInvoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root Cause: best-fit mappings

Case Study Social Networking: Best-fit mappings

-moz-binding()

was not allowed, but...

-[U+ff4d]oz-binding()

would best-fit map

Case Study Social Networking: Best-fit mappings
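The filter bypass can be imitated in a few lines of Python. Python's own codecs do not perform best-fit mapping, so the `BEST_FIT` table below is a hypothetical three-entry stand-in (using mappings named in this talk) for what a Windows code-page conversion might do; `naive_filter`, `best_fit`, and `strict_convert` are illustrative names, and cp1252 is an arbitrary choice of legacy charset.

```python
# Hypothetical stand-in for a platform best-fit table.
BEST_FIT = {"\uff4d": "m", "\u03c3": "s", "\u2032": "'"}


def naive_filter(css):
    """Return True when the input looks safe to a plain substring check."""
    return "-moz-binding" not in css


def best_fit(text):
    """Emulate a lossy conversion that swaps characters for lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strict_convert(text, codec="cp1252"):
    """Convert without any fallback: unmappable characters raise an error."""
    return text.encode(codec)  # errors="strict" is the default


payload = "-\uff4doz-binding()"
print(naive_filter(payload))            # the filter sees nothing wrong
print(best_fit(payload))                # the later conversion recreates the keyword
print(naive_filter(best_fit(payload)))  # too late: validation already happened

try:
    strict_convert(payload)
except UnicodeEncodeError as err:
    print("rejected:", err.reason)
```

Failing the conversion plays the same role as the EncoderFallback and WC_NO_BEST_FIT_CHARS guidance: the lookalike never silently becomes the forbidden character.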

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇

Root Causes: Normalization

U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet... with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

Root Causes: Normalization

U+FE64 becomes U+003C

﹤ becomes <

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization

U+FE64 becomes U+003C
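The same transformation can be observed with Python's `unicodedata.normalize`. The angle-bracket check `is_safe` is a deliberately simplistic, hypothetical validator used only to show why the order of the two steps matters.

```python
import unicodedata


def is_safe(text):
    """Toy validator: reject anything containing ASCII angle brackets."""
    return "<" not in text and ">" not in text


# U+FE64 SMALL LESS-THAN SIGN and U+FE65 SMALL GREATER-THAN SIGN
payload = "\ufe64script\ufe65"

# Wrong order: validate first, normalize afterwards.
passed_validation = is_safe(payload)
after = unicodedata.normalize("NFKC", payload)
print(passed_validation, after)

# Right order: normalize first, then validate the result.
print(is_safe(unicodedata.normalize("NFKC", payload)))
```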

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
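As an illustration of "validate and throw on error", Python's built-in UTF-8 decoder already refuses non-shortest forms. The wrapper name `is_well_formed_utf8` is hypothetical; the byte pair C0 A7 is the overlong apostrophe from the earlier slide.

```python
def is_well_formed_utf8(raw: bytes) -> bool:
    """Return True only if the bytes decode as strict, shortest-form UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(is_well_formed_utf8(b"\x27"))      # plain apostrophe
print(is_well_formed_utf8(b"\xc0\xa7"))  # overlong encoding of the same character
```

A lenient decoder that accepted the second input would hand an apostrophe to the database after the filter had already run.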

How many ways can you say

Attack Vectors: Directory traversal

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF8 U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
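The byte sequences in the test cases above can be examined with a short Python sketch. It decodes each sequence strictly and applies NFKC, which shows which ones fold into path syntax through normalization alone; the Division Slash does not change under NFKC and, as the slide notes, depends on a best-fit conversion instead. Helper names are hypothetical.

```python
import unicodedata

# Hex byte sequences taken from the test cases.
CASES = {
    "fullwidth solidus": "EFBC8F",
    "division slash": "E28895",
    "two dot leader": "E280A5",
}


def decode_and_fold(hex_bytes):
    """Strictly decode UTF-8, then return (character, NFKC form)."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return ch, unicodedata.normalize("NFKC", ch)


def is_rejected(hex_bytes):
    """Return True when a strict UTF-8 decoder refuses the sequence."""
    try:
        bytes.fromhex(hex_bytes).decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


for name, hex_bytes in CASES.items():
    ch, folded = decode_and_fold(hex_bytes)
    print(f"{name}: U+{ord(ch):04X} -> NFKC {folded!r}")

# The overlong sequence only works against decoders that tolerate it.
print("C0AE rejected by strict decoder:", is_rejected("C0AE"))
```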

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected Over-consumption
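For contrast, a decoder that does not over-consume replaces only the ill-formed lead byte and leaves the following byte in place. A minimal Python sketch with the same four bytes, using the standard decoder's `errors="replace"` mode:

```python
# 0xC2 is a UTF-8 lead byte that expects a continuation byte;
# 0x3E ('>') is not one, so the sequence is ill-formed.
raw = bytes([0x41, 0xC2, 0x3E, 0x41])

decoded = raw.decode("utf-8", errors="replace")

# Only the stray lead byte becomes U+FFFD; the '>' survives.
print([f"U+{ord(ch):04X}" for ch in decoded])
```

An over-consuming decoder would swallow the 0x3E along with the lead byte, which is what lets the tag delimiter in the markup example above disappear.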

Correcting insecurely rather than failing

– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected Character-substitution

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected Character-deletion
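A small Python sketch of why deletion is the risky choice and replacement the safer one. The keyword check `is_clean` is a hypothetical filter; the two handlers differ only in what they do with U+FEFF.

```python
BOM = "\ufeff"


def is_clean(text):
    """Toy filter: look for a literal script tag."""
    return "<script>" not in text.lower()


def delete_bom(text):
    """Risky: silently drop the character."""
    return text.replace(BOM, "")


def replace_bom(text):
    """Safer: substitute U+FFFD so the surrounding text cannot rejoin."""
    return text.replace(BOM, "\ufffd")


payload = "<scr\ufeffipt>"
print(is_clean(payload))               # passes the filter
print(delete_bom(payload))             # deletion rebuilds the tag
print(is_clean(replace_bom(payload)))  # replacement keeps it broken
```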

• Fail or error

• Use U+FFFD instead

– A common alternative is '?' which can be safe

Root Causes: Solutions for Handling the Unexpected

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
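The length change is easy to observe in Python, whose `str.lower()` and `str.upper()` apply the full Unicode case mappings. Note that this is Python's locale-independent behavior: it lowercases İ to "i" followed by U+0307 rather than to a bare "i", so results on other platforms or under a Turkish locale may differ from what is printed here.

```python
dotted_capital_i = "\u0130"   # İ
iota_with_marks = "\u0390"    # ΐ
a_with_stroke = "\u023a"      # Ⱥ

lowered = dotted_capital_i.lower()
uppered = iota_with_marks.upper()

# One code point in, more than one out.
print(len(dotted_capital_i), len(lowered))
print(len(iota_with_marks), len(uppered))

# Even a one-to-one mapping can grow once encoded: 2 bytes become 3 in UTF-8.
print(len(a_with_stroke.encode("utf-8")),
      len(a_with_stroke.lower().encode("utf-8")))
```

Any buffer sized from the length of the input before casing is therefore a potential overflow.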

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)

• Improper width calculations

• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
NFC        16, 32  3X      ש U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
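Some of the factors in the table can be recomputed with the standard `unicodedata` module. The helper `expansion` is a hypothetical name; it measures encoded size before and after normalization, so the result depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata


def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


# U+FDFA under compatibility normalization, in UTF-8 and UTF-16.
print(expansion("\ufdfa", "NFKC", "utf-8"))
print(expansion("\ufdfa", "NFKC", "utf-16-le"))

# U+0390 and U+1F82 under canonical decomposition.
print(expansion("\u0390", "NFD", "utf-8"))
print(expansion("\u1f82", "NFD", "utf-16-le"))
```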

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting "white space"

– A problem with HTML 4.0 spec

Case Study Opera

<a href=[U+180E]onclick=alert()>

Case Study Opera

DEMO

Opera White Space Characters

Case Study Opera

MVS U+180E

• Question specifications

• Be careful...

Root Causes: Guidance for Controlling Syntax

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform, case, and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8

• Error if uncertain

Root Causes: Guidance for Charset Mismatches
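To see why the header and the meta tag disagreeing matters, the sketch below decodes the same two bytes under both declared charsets using Python's standard codecs. The byte pair 0x83 0x5C is chosen for this illustration (it is not from the slides) because its second byte is an ASCII backslash.

```python
raw = b"\x83\x5c"

as_latin1 = raw.decode("iso-8859-1")
as_sjis = raw.decode("shift_jis")

# Under ISO-8859-1 the second byte is a backslash the filter can see.
print(repr(as_latin1), "\\" in as_latin1)

# Under Shift_JIS both bytes form one Katakana letter and the backslash is gone.
print(repr(as_sjis), "\\" in as_sjis)
```

A filter that reasons about one interpretation while the user-agent sniffs the other is validating text the browser never renders, which is the case for forcing a single encoding and failing when it is uncertain.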

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber

www.lookout.net

Casaba Security

Page 43: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

wwwg๐๐glecom User thinks lsquoorsquo

Really itrsquos Thai digit zero lsquo๐rsquo

wwwfaϲebookcom

User thinks lsquocrsquo

Really itrsquos Greek lunate sigma symbol lsquocrsquo

wwwᏀooglecom

Really itrsquos Cherokee letter Nah lsquoᏀrsquo

wwwcasabasecuritycom

Attack VectorsMixed-script and The Confusables

March 2009 copy 2009 Chris Weber

wwwаЬсcom

User thinks lsquoabcrsquo

Really itrsquos Cyrillic script

wwwігѕgov

User thinks lsquoirsrsquo

Really itrsquos Greek script

wwwcasabasecuritycom

Attack VectorsWhole-script and The Confusables

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 44: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Attack Vectors: Whole-script and The Confusables

www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script

www.ігѕ.gov
User thinks ‘irs’
Really it’s Greek script
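A rough sketch of how a mixed-script label could be flagged. It assumes the first word of a character's Unicode name (LATIN, CYRILLIC, GREEK) is an acceptable stand-in for the script property, which Python's standard library does not expose; `scripts_in` and `is_mixed_script` are hypothetical helper names. A whole-script spoof such as the all-Cyrillic label above passes this check, so a real detector would also need a confusables table.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name is used
    as a proxy for the script property.
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts_in(label)) > 1


# Latin "p" and "ypal" around a Cyrillic U+0430 that looks like "a".
print(scripts_in("p\u0430ypal"), is_mixed_script("p\u0430ypal"))
```

Reporting the script set rather than a yes/no answer lets the caller apply its own policy, for example allowing a single non-Latin script for a matching TLD.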

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org
  i is Latin U+0069
  í is Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Other lookalikes:
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Full stop lookalikes (however, punctuation is not required…)
http://www･google･com
Katakana Dot U+FF65

(However, punctuation is not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
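To make the gap between what is displayed and what is resolved concrete, here is a small sketch using Python's built-in `idna` codec (which implements IDNA 2003). The helper name `dns_form` is an assumption for illustration; the exact Punycode string is deliberately not shown.

```python
def dns_form(hostname: str) -> bytes:
    """Return the ASCII form of a hostname as it would be sent to DNS."""
    return hostname.encode("idna")


plain = dns_form("www.mozilla.org")
spoof = dns_form("www.moz\u00edlla.org")  # U+00ED in place of U+0069

# The two names look alike on screen but resolve as different ASCII labels.
print(plain)
print(spoof)
```

Showing the `xn--` form next to (or instead of) the Unicode form is one way a user-agent can make the difference visible.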

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
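A minimal sketch of invisibles detection, assuming that flagging every character in the Unicode "Cf" (format) general category is an acceptable first pass. `find_invisibles` is a hypothetical helper, and the category assigned to a given code point depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def find_invisibles(text: str) -> list:
    """Return (index, 'U+XXXX') pairs for format (Cf) characters in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# A ZERO WIDTH SPACE hidden inside an otherwise ordinary hostname.
print(find_invisibles("goo\u200bgle.com"))
```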

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
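The ordering effect above can be checked with the standard `unicodedata` module. This is only a sketch of the two sequences from the slide; the variable names are illustrative.

```python
import unicodedata

# o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
diaeresis_first = "o\u0308\u0311"
# o + COMBINING INVERTED BREVE + COMBINING DIAERESIS
breve_first = "o\u0311\u0308"

a = unicodedata.normalize("NFKC", diaeresis_first)
b = unicodedata.normalize("NFKC", breve_first)

# Both marks share a combining class, so their order survives normalization
# and the two inputs end up as different code point sequences.
print([f"U+{ord(c):04X}" for c in a])
print([f"U+{ord(c):04X}" for c in b])
```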

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
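One way to screen for these controls is to look at each character's bidirectional class, as in the sketch below. The set of classes treated as explicit controls and the helper name `has_explicit_bidi` are assumptions for illustration.

```python
import unicodedata

# Bidi classes of explicit embedding, override, and isolate controls.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF",
                         "LRI", "RLI", "FSI", "PDI"}


def has_explicit_bidi(text: str) -> bool:
    """True if text contains an explicit directional formatting character."""
    return any(unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES
               for ch in text)


# A RIGHT-TO-LEFT OVERRIDE placed before a file extension.
print(has_explicit_bidi("report\u202egpj.exe"))
```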

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit, P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
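The same "fail rather than substitute" idea can be sketched outside Windows and .NET. The example below is an analogy in Python, not the APIs named above: it assumes a legacy `cp1252` target and a hypothetical helper `to_legacy`, and relies on strict error handling so that unmappable characters raise instead of being replaced with a lookalike.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Encode text for a legacy charset, raising instead of substituting."""
    return text.encode(charset, errors="strict")


for sample in ("abc", "\u03c3", "-\uff4doz-binding"):
    try:
        print(to_legacy(sample))
    except UnicodeEncodeError as exc:
        # No silent mapping of sigma to "s" or fullwidth "m" to "m".
        print("rejected:", exc.reason)
```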

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
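A short sketch of "normalize, then validate" using the standard `unicodedata` module. The forbidden-character set and the function name `normalize_then_validate` are assumptions chosen for illustration; a real filter would use whatever policy the application already enforces.

```python
import unicodedata

# Illustrative policy: characters that must not survive into HTML output.
FORBIDDEN = {"<", ">", '"', "'"}


def normalize_then_validate(value: str) -> str:
    """Apply NFKC first, then validate the normalized result."""
    normalized = unicodedata.normalize("NFKC", value)
    if any(ch in FORBIDDEN for ch in normalized):
        raise ValueError("markup character present after NFKC normalization")
    return normalized


# U+FE64 SMALL LESS-THAN SIGN only becomes "<" after normalization,
# so a filter that ran before normalization would not have seen it.
print(unicodedata.normalize("NFKC", "\ufe64script>"))
```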

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
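"Validate and throw on error" can be illustrated with Python's strict UTF-8 decoder, which rejects overlong forms. The wrapper name `decode_utf8_strict` is hypothetical; the point is only that the bytes C0 A7 never turn into an apostrophe.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")


for raw in (b"\x27", b"\xc0\xa7"):
    try:
        print(repr(decode_utf8_strict(raw)))
    except UnicodeDecodeError as exc:
        # The overlong encoding of U+0027 is refused outright.
        print("rejected:", raw.hex(), exc.reason)
```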

Attack Vectors: Directory traversal

How many ways can you say ../

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
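Two of the test cases above depend on compatibility normalization, which can be checked directly. The sketch below uses a hypothetical helper, `has_traversal_after_nfkc`, that normalizes a path before looking for `..` segments; the sample path is made up for illustration.

```python
import unicodedata


def has_traversal_after_nfkc(path: str) -> bool:
    """True if a '..' segment appears once the path is NFKC-normalized."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


# FULLWIDTH SOLIDUS becomes "/" and TWO DOT LEADER becomes ".." under NFKC;
# DIVISION SLASH is unchanged by normalization (it relies on best-fit instead).
for name, ch in (("U+FF0F", "\uff0f"), ("U+2025", "\u2025"), ("U+2215", "\u2215")):
    print(name, repr(unicodedata.normalize("NFKC", ch)))

print(has_traversal_after_nfkc("root\uff0f\u2025\uff0fsystem"))
```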

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’, which can be safe
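As an illustration of substituting U+FFFD without over-consuming, the sketch below decodes the ill-formed sequence 41 C2 3E 41 with Python's `replace` error handler. Only the stray lead byte is replaced; the following `>` is preserved.

```python
raw = bytes([0x41, 0xC2, 0x3E, 0x41])

# errors="replace" inserts U+FFFD for the ill-formed byte only.
decoded = raw.decode("utf-8", errors="replace")

print([f"U+{ord(ch):04X}" for ch in decoded])
```

A decoder that swallowed 3E along with C2 would have produced two characters here instead of four.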

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))
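Case mapping results differ between implementations, so the sketch below only shows what Python's built-in full case mapping does; other runtimes may map U+0130 straight to a plain "i" as on the slide. It illustrates that lengths can change under casing.

```python
dotted_capital_i = "\u0130"           # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_capital_i.lower()    # Python yields "i" + COMBINING DOT ABOVE

iota_dialytika_tonos = "\u0390"       # the UTR #36 sample for upper-casing
uppered = iota_dialytika_tonos.upper()

print(len(dotted_capital_i), len(lowered))      # code point counts before/after
print(len(iota_dialytika_tonos), len(uppered))
```

Buffers sized from the input length are therefore unsafe for the output of a casing call.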

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF          Factor   Sample
NFC         8            3X       U+1D160
NFC         16, 32       3X       שּׁ U+FB2C
NFD         8            3X       ΐ U+0390
NFD         16, 32       4X       ᾂ U+1F82
NFKC/NFKD   8            11X      ﷺ U+FDFA
NFKC/NFKD   16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
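The factors in the table can be reproduced with a few lines of Python. This is a sketch with a hypothetical helper, `expansion`, that compares encoded sizes before and after normalization; the `-le` codecs are used so that no byte order mark is counted.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


print(expansion("\ufdfa", "NFKC", "utf-8"))      # UTF-8 row for U+FDFA
print(expansion("\ufdfa", "NFKC", "utf-32-le"))  # UTF-16/32 row for U+FDFA
print(expansion("\u1f82", "NFD", "utf-16-le"))   # UTF-16/32 row for U+1F82
```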

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…


Page 45: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: filter evasion, enables code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: filter evasion, enables code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8

• Error if uncertain

Root Causes: Guidance for Charset Mismatches
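One way to read "force UTF-8, error if uncertain" is to decode every input as strict UTF-8 and refuse anything else instead of sniffing. The sketch below assumes a hypothetical helper name, `decode_request_body`; it is an illustration of the policy, not code from the talk.

```python
def decode_request_body(raw):
    """Hypothetical helper: accept only well-formed UTF-8, never guess."""
    # errors="strict" raises UnicodeDecodeError instead of sniffing or repairing.
    return raw.decode("utf-8", errors="strict")

try:
    decode_request_body(b"\x82\xa0")   # Shift_JIS bytes, not valid UTF-8
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```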

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda


• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

• .ORG is whitelisted

– Limited characters available

• To unscrutinizing eyes

í looks like i

Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin U+0069

Latin U+00ED

(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

╱ U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom

Katakana Dot U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comﾉpathﾉfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
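The gap between what the browser displays and what DNS receives can be seen with Python's built-in `idna` codec. This is a sketch only: the codec implements IDNA 2003, which may differ from what a given browser does, and the hostname reuses the lookalike from the earlier homograph slide.

```python
display_name = "www.moz\u00edlla.org"      # contains U+00ED, not U+0069
wire_name = display_name.encode("idna")    # Python's built-in IDNA 2003 codec

print(display_name)   # what the user is shown
print(wire_name)      # what is looked up: the lookalike label becomes an xn-- label
```

Showing the `xn--` form to the user is one of the defenses browsers apply when a label looks suspicious.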

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API

– Detects confusables

– Detects invisibles

– Detects syntax and punctuation lookalikes

– Detects combining mark tricks

• Currently in testing

• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
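A rough way to flag invisible characters like these is to look at their Unicode general category. The sketch below is an assumption-laden illustration, not the detection API described in the talk: `find_invisibles` is a hypothetical helper, and the set of suspicious categories is a choice made for this example. The category reported for U+180E depends on the Unicode version the runtime ships with.

```python
import unicodedata

# Categories treated as suspicious here are an illustrative choice.
SUSPECT_CATEGORIES = {"Cf", "Zs", "Zl", "Zp", "Cc"}

def find_invisibles(text):
    """Return (index, code point, category) for each suspect character.

    The plain ASCII space is allowed; everything else in the suspect
    categories (including ASCII control characters) is reported.
    """
    hits = []
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        if cat in SUSPECT_CATEGORIES and ch != " ":
            hits.append((i, f"U+{ord(ch):04X}", cat))
    return hits

print(find_invisibles("pay\u200bpal.com"))
print(find_invisibles("<a href=x\u180eonclick=alert()>"))
```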


Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly

• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304

ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks

– ȏ and ö under NFKC

<U+006F U+0308 U+0311> → <U+00F6 U+0311>

<U+006F U+0311 U+0308> → <U+020F U+0308>
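The two sequences on this slide can be checked with the standard library. A minimal sketch: both marks share the same combining class, so normalization does not reorder them, and the two visually similar inputs stay distinct after NFKC.

```python
import unicodedata

a = unicodedata.normalize("NFKC", "o\u0308\u0311")  # diaeresis, then inverted breve
b = unicodedata.normalize("NFKC", "o\u0311\u0308")  # inverted breve, then diaeresis

print([f"U+{ord(c):04X}" for c in a])
print([f"U+{ord(c):04X}" for c in b])
print(a == b)   # the strings compare unequal even though they may render alike
```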

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– "These characters are to be avoided wherever possible, because of security concerns"

– Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
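Explicit directional controls can be found by their bidirectional class. The sketch below is illustrative: `has_bidi_controls` is a hypothetical helper, the file name is an invented example, and the class list also includes the embedding and isolate controls, which go beyond the two overrides named on the slide.

```python
import unicodedata

# Bidi classes of explicit embedding, override, and isolate controls (UAX #9).
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def has_bidi_controls(text):
    """True if any character is an explicit bidi formatting control."""
    return any(unicodedata.bidirectional(ch) in EXPLICIT_BIDI for ch in text)

spoofed = "invoice\u202Efdp.exe"   # U+202E makes the tail display reversed
print(has_bidi_controls(spoofed), has_bidi_controls("invoice.pdf"))
```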


Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enables code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C (Left-Pointing Angle Bracket)

• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?

• U+015B maps to U+0073 (Latin Small Letter S With Acute)

• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
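The guidance above names .NET and Win32 mechanisms. As a language-neutral illustration of the same policy (convert or fail, never substitute a lookalike), here is a Python sketch. It is only an analogue: Python's codecs do not perform best-fit mapping, `to_legacy` is a hypothetical helper, and the `cp1252` target is an assumption.

```python
def to_legacy(text, charset="cp1252"):
    """Hypothetical helper: convert or fail, never substitute a lookalike."""
    return text.encode(charset, errors="strict")

for candidate in ["-moz-binding", "-\uff4doz-binding"]:
    try:
        print(to_legacy(candidate))
    except UnicodeEncodeError as exc:
        # Report the first character that has no exact mapping.
        print("rejected U+%04X" % ord(exc.object[exc.start]))
```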

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: filter evasion, code execution

– Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking, Best-fit mappings

Normalizing strings after validation is dangerous

Impact: filter evasion, enables code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + ◌̇

Root Causes: Normalization

U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

Root Causes: Normalization

U+FE64 → U+003C

﹤ becomes <

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization

U+FE64 → U+003C

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
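The order of operations matters, as a short sketch shows. The validator below is a toy assumption made for illustration (it only forbids `<`); the payload is the U+FE64 example from the preceding slides.

```python
import unicodedata

def is_safe(text):
    """Toy validator (an assumption for illustration): forbid '<'."""
    return "<" not in text

payload = "\uFE64script>"                  # starts with U+FE64 SMALL LESS-THAN SIGN

print(is_safe(payload))                    # validating first: the check passes
normalized = unicodedata.normalize("NFKC", payload)
print(normalized, is_safe(normalized))     # normalizing first: the check fails
```

If normalization runs after validation, the string that passed the filter is not the string that reaches the parser.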

Non-shortest or overlong UTF-8

Impact: filter evasion, enables code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
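"Validate and throw on error" can be as simple as using a strict decoder. The sketch below uses Python's built-in UTF-8 codec, which rejects non-shortest forms; `decode_strict` is a hypothetical wrapper name, and the byte pair is the C0 A7 example from the previous slide.

```python
def decode_strict(raw):
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

overlong_apostrophe = b"\xC0\xA7"   # non-shortest form of U+0027
try:
    decode_strict(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)

print(decode_strict(b"\x27"))       # the shortest form decodes normally
```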

How many ways can you say

Attack Vectors: Directory traversal

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF-8 (U+002F): httpsiterootC0AEsystem

– Full-width Solidus U+FF0F (normalized KC or KD): httpsiteroot EFBC8Fsystem

– Division Slash U+2215 (best-fit): httpsiteroot E28895system

– Two Dot Leader U+2025 (normalized KC or KD): httpsiterootE280A5 E280A5 system

Attack Vectors
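The byte sequences in these test cases can be derived rather than memorized. The following Python sketch prints the UTF-8 bytes and the NFKC result for the three non-ASCII code points listed above; it covers normalization only, since best-fit behavior is platform-specific and is not modeled here.

```python
import unicodedata

for cp in (0xFF0F, 0x2215, 0x2025):
    ch = chr(cp)
    utf8_hex = ch.encode("utf-8").hex().upper()       # bytes to percent-encode
    nfkc = unicodedata.normalize("NFKC", ch)          # what normalization yields
    print(f"U+{cp:04X} UTF-8={utf8_hex} NFKC={nfkc!r} {unicodedata.name(ch)}")
```

A path check that runs before normalization sees none of the ASCII characters it is looking for in the first and third cases.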

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: filter evasion, enables code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Over-consumption

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-substitution

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected, Character-deletion

• Fail or error

• Use U+FFFD instead

– A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
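Both options above can be sketched with the earlier examples. This is an illustration using Python's built-in UTF-8 codec, not code from the talk: strict decoding fails outright, replacement marks the stray lead byte with U+FFFD without swallowing the byte that follows, and an unwanted U+FEFF is substituted rather than deleted.

```python
ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])   # 'A', stray lead byte, '>', 'A'

try:
    ill_formed.decode("utf-8")                 # strict: fail or error
except UnicodeDecodeError:
    print("rejected")

# Replacement marks the bad byte with U+FFFD and keeps the '>' that follows.
repaired = ill_formed.decode("utf-8", errors="replace")
print(repaired)

# For unwanted characters such as U+FEFF, substitute rather than delete,
# so the surrounding text cannot join up into a forbidden token.
substituted = "<scr\uFEFFipt>".replace("\uFEFF", "\uFFFD")
print(substituted)
```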

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g., cross-site scripting (the buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption

– Attack: filter evasion, code execution

– Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: filter evasion, enables code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Casing
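Casing results vary by runtime, which is part of the risk. In Python, for instance, lowercasing U+0130 yields two code points (`i` followed by U+0307) rather than the plain `i` shown on the earlier slide. The sketch below shows that length change and a toy validator, an assumption for illustration, that case-folds before checking.

```python
dotted_i = "\u0130"            # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_i.lower()     # in Python: 'i' followed by U+0307

print(len(dotted_i), len(lowered))
print([f"U+{ord(c):04X}" for c in lowered])

def validate(text):
    """Toy check (illustrative assumption): case-fold first, then test."""
    return "script" not in text.casefold()

print(validate("SCRIPT"), validate("plain"))
```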


Page 47: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull ORG is whitelisted

ndash Limited characters available

bull To unscrutinizing eyes

iacute looks like i

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)

• Improper width calculations

• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36
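The UTF-8 and UTF-16 factors for the two non-trivial samples in the table can be checked with a short Python sketch. It relies on Python's built-in full case mappings; the helper name `expansion` is an assumption made for illustration.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes after and before a case operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


lower_sample = "\u023a"  # LATIN CAPITAL LETTER A WITH STROKE
upper_sample = "\u0390"  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

utf8_lower = expansion(lower_sample, lower_sample.lower(), "utf-8")
utf8_upper = expansion(upper_sample, upper_sample.upper(), "utf-8")
utf16_upper = expansion(upper_sample, upper_sample.upper(), "utf-16-le")

print(utf8_lower, utf8_upper, utf16_upper)
```

A buffer sized from the input length alone is therefore too small for the output of a case operation.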

Normalization: maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160
NFC          16, 32   3X       U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
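Several of these factors can be reproduced with Python's standard `unicodedata` module. This is a sketch for checking the table, with a hypothetical helper name (`growth`); the exact Unicode data version depends on the Python build.

```python
import unicodedata


def growth(sample: str, form: str, encoding: str) -> float:
    """Encoded size of the normalized text divided by the original size."""
    normalized = unicodedata.normalize(form, sample)
    return len(normalized.encode(encoding)) / len(sample.encode(encoding))


nfkc_utf8 = growth("\ufdfa", "NFKC", "utf-8")       # 3 bytes in, 33 bytes out
nfkc_utf16 = growth("\ufdfa", "NFKC", "utf-16-le")  # 1 code unit in, 18 out
nfd_utf16 = growth("\u1f82", "NFD", "utf-16-le")    # 1 code unit in, 4 out
nfc_utf8 = growth("\U0001d160", "NFC", "utf-8")     # 4 bytes in, 12 bytes out

print(nfkc_utf8, nfkc_utf16, nfd_utf16, nfc_utf8)
```

Note that even NFC, which is usually thought of as "composing", can expand text, because some precomposed characters are excluded from recomposition.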

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root cause: Interpreting “white space”

– A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS: U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability

– IDNA/Nameprep is based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem:

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, enable code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
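To illustrate why a mismatch matters, this Python sketch decodes the same two bytes under both charsets declared above. Under ISO-8859-1 the second byte is a backslash, while under Shift_JIS both bytes form a single katakana letter, so a filter and a renderer that disagree on the charset see different characters. The byte pair is an assumed example, not taken from the talk.

```python
raw = b"\x83\x5c"  # assumed example bytes

as_latin1 = raw.decode("iso-8859-1")  # what a filter trusting the HTTP header might see
as_sjis = raw.decode("shift_jis")     # what a renderer trusting the meta tag might see

print(len(as_latin1), "\\" in as_latin1)  # two characters, one of them a backslash
print(len(as_sjis), "\\" in as_sjis)      # one character, no backslash
```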

• Force UTF-8

• Error if uncertain

Root Causes: Guidance for Charset Mismatches

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher – Web-app Security Testing and Auditing

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber, www.lookout.net, Casaba Security

www.casabasecurity.com

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin: U+0069

Latin: U+00ED

(This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

╱ U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

(However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom

Katakana Dot: U+FF65

(However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
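The split between what is displayed and what is resolved can be sketched with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The helper name `to_dns_form` is hypothetical; the host names are the pair from the homograph example.

```python
def to_dns_form(hostname: str) -> bytes:
    """Encode each label into the ASCII-compatible form that DNS receives."""
    return hostname.encode("idna")


plain = to_dns_form("www.mozilla.org")
spoof = to_dns_form("www.moz\u00edlla.org")  # U+00ED instead of U+0069

print(plain)  # unchanged ASCII host name
print(spoof)  # the middle label is now an "xn--" Punycode label
```

The user sees two nearly identical names, while DNS sees two unrelated ASCII strings.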

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API

– Detects confusables

– Detects invisibles

– Detects syntax and punctuation lookalikes

– Detects combining mark tricks

• Currently in testing

• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)
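A very small detector for the three characters listed above might look like the following Python sketch. The table and the function name `find_invisibles` are assumptions for illustration; a real detector would cover many more code points.

```python
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list:
    """Return (index, code point, name) for each listed invisible character."""
    return [
        (index, f"U+{ord(ch):04X}", INVISIBLES[ch])
        for index, ch in enumerate(text)
        if ch in INVISIBLES
    ]


suspicious = "pay\u200bpal.com"  # renders like an ordinary host name
print(find_invisibles(suspicious))
```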

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing: Problematic Font-rendering

• Fonts render glyphs confusingly

• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing: Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks

– ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
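The ordering effect can be checked with Python's `unicodedata` module. Both marks have the same combining class, so normalization does not reorder them, and the two visually similar inputs normalize to different strings. This is a sketch for verification only.

```python
import unicodedata

diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")

print([f"U+{ord(c):04X}" for c in diaeresis_first])
print([f"U+{ord(c):04X}" for c in breve_first])
print(diaeresis_first == breve_first)
```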

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– “These characters are to be avoided wherever possible, because of security concerns”

– Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even in innocuous APIs

Impact: Filter evasion, enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?

• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr:

[MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
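Python's codecs do not perform best-fit mapping, so the sketch below emulates it with a small hypothetical table built from mappings mentioned in this talk, and contrasts it with a strict conversion that raises instead of guessing. ASCII stands in for a legacy Windows code page; the function names are assumptions.

```python
# Hypothetical best-fit table, limited to mappings named in the slides.
BEST_FIT = {
    "\u2329": "<",  # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",  # LEFT ANGLE BRACKET
    "\u015b": "s",  # LATIN SMALL LETTER S WITH ACUTE
    "\u015d": "s",  # LATIN SMALL LETTER S WITH CIRCUMFLEX
    "\uff4d": "m",  # FULLWIDTH LATIN SMALL LETTER M
}


def best_fit_encode(text: str) -> bytes:
    """Emulates a lossy conversion that swaps in lookalike ASCII characters."""
    mapped = "".join(BEST_FIT.get(ch, ch) for ch in text)
    return mapped.encode("ascii", errors="replace")


def strict_encode(text: str) -> bytes:
    """Refuses to convert anything the target charset cannot represent."""
    return text.encode("ascii", errors="strict")


payload = "\u3008script>"
print(best_fit_encode(payload))  # syntax character appears after conversion

try:
    strict_encode(payload)
    rejected = False
except UnicodeEncodeError:
    rejected = True
print(rejected)
```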

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root cause: Best-fit mappings

Case Study: Social Networking: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking: Best-fit mappings

Normalizing strings after validation is dangerous

Impact: Filter evasion, enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + U+0307 (combining dot above)

Root Causes: Normalization

U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

Root Causes: Normalization

U+FE64 becomes U+003C

﹤ becomes <

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization

U+FE64 becomes U+003C
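The ordering problem can be shown with Python's `unicodedata.normalize`. The validation rule and function names here are hypothetical; the payload is the U+FE64 example from the slide.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Hypothetical validation rule: no '<' allowed."""
    return "<" not in text


def validate_then_normalize(text: str) -> str:
    """Flawed order: the check runs before compatibility normalization."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: normalize first, then validate what will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))  # passes the check, then becomes markup
```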

Normalize strings before validation

NFKC is a first defense against visual spoofing

Root Causes: Guidance for Normalization

Non-shortest or overlong UTF-8

Impact: Filter evasion, enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
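The C0 A7 example can be reproduced in Python. The hand-rolled decoder below is deliberately flawed (it applies the two-byte bit layout without rejecting overlong forms) and is a hypothetical illustration, while Python's own UTF-8 codec refuses the same bytes.

```python
def naive_decode_2byte(lead: int, trail: int) -> str:
    """Deliberately flawed: no check that the result needed two bytes."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong_quote = bytes([0xC0, 0xA7])

print(repr(naive_decode_2byte(*overlong_quote)))  # an apostrophe appears

try:
    overlong_quote.decode("utf-8")  # strict by default
    accepted = True
except UnicodeDecodeError:
    accepted = False
print(accepted)
```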

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF-8, U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Attack Vectors
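A path check that accounts for these variants might decode strictly, normalize, and only then look for `..` segments. The following Python sketch uses hypothetical paths and a hypothetical function name; it does not cover best-fit mappings such as U+2215, which NFKC leaves unchanged.

```python
import unicodedata
from urllib.parse import unquote_to_bytes


def is_traversal(raw_path: str) -> bool:
    """Percent-decode, strictly decode UTF-8, NFKC-normalize, then inspect."""
    try:
        decoded = unquote_to_bytes(raw_path).decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return True  # treat ill-formed or overlong input as hostile
    normalized = unicodedata.normalize("NFKC", decoded)
    return ".." in normalized.split("/")


print(is_traversal("/root/file/system"))          # ordinary path
print(is_traversal("/root/%E2%80%A5/system"))     # U+2025 TWO DOT LEADER
print(is_traversal("/root/..%EF%BC%8Fsystem"))    # U+FF0F FULLWIDTH SOLIDUS
print(is_traversal("/root/%C0%AE%C0%AE/system"))  # overlong full stops
```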

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected: Over-consumption
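The <41 C2 3E 41> case can be sketched in Python. The first decoder is a deliberately flawed, hypothetical one that swallows whatever byte follows a two-byte lead; the second line uses Python's standard codec, which replaces only the stray lead byte and keeps the `>`.

```python
def over_consuming_decode(raw: bytes) -> str:
    """Flawed decoder: a two-byte lead always eats the next byte."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF:
            if i + 1 < len(raw) and 0x80 <= raw[i + 1] <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            i += 2  # bug: skips the next byte even when it was not a trail byte
        else:
            i += 1
    return "".join(out)


sample = bytes.fromhex("41C23E41")

print(over_consuming_decode(sample))             # the '>' has vanished
print(sample.decode("utf-8", errors="replace"))  # the '>' is preserved
```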

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

Page 49: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

March 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF
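For reference, a short Python sketch of how U+FEFF appears on the wire in the common Unicode encodings, plus a hypothetical helper (`interior_bom_offsets`, a name invented here) that reports any U+FEFF found after the first character, where it no longer acts as a byte order mark.

```python
BOM = "\ufeff"

# Byte form of U+FEFF in each encoding scheme.
for name in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    print(f"{name:10} {BOM.encode(name).hex().upper()}")


def interior_bom_offsets(text: str) -> list:
    """Return the indexes of every U+FEFF that is not at position 0."""
    return [i for i, ch in enumerate(text) if ch == BOM and i > 0]


print(interior_bom_offsets("java\ufeffscript:alert(1)"))
print(interior_bom_offsets("\ufeffplain text with a leading BOM"))
```

A validator could treat a non-empty result as an error rather than stripping the character.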

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
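A small Python sketch of both casing hazards. The exact result of a casing call depends on the implementation and locale: Python's full case mapping lowercases İ (U+0130) to "i" plus U+0307 rather than to a bare "i", so the injection example below uses U+017F LATIN SMALL LETTER LONG S instead, which is an example chosen for this sketch and not one from the slides. `naive_filter` is a hypothetical blacklist.

```python
def naive_filter(value: str) -> bool:
    """Hypothetical blacklist: True means the value is accepted."""
    return "script" not in value.lower()


payload = "\u017fcript"  # LATIN SMALL LETTER LONG S followed by "cript"

# Validation happens first and passes; a later upper-casing step then
# turns the long s into a plain 'S'.
accepted_before_casing = naive_filter(payload)
after_casing = payload.upper()
print(accepted_before_casing, after_casing)

# Casing first, then validating, catches it.
accepted_after_casing = naive_filter(payload.upper())
print(accepted_after_casing)

# Casing can also change the length of a string.
for ch in ("\u0130", "\u00df", "\u0390"):
    print(f"U+{ord(ch):04X} lower={len(ch.lower())} upper={len(ch.upper())}")
```

The last loop is the len(x) != len(toLower(x)) point: buffer sizes computed before a casing operation may be too small afterwards.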

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report #36

Normalization: maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report #36
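The expansion factors in the tables can be checked with Python's `unicodedata` module. This is a sketch for verification only; the helper names (`utf8_factor`, `utf16_factor`, `nf`) are invented here, and the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def utf8_factor(before: str, after: str) -> float:
    """Growth in UTF-8 bytes from `before` to `after`."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


def utf16_factor(before: str, after: str) -> float:
    """Growth in UTF-16 code units from `before` to `after`."""
    def units(s: str) -> int:
        return len(s.encode("utf-16-le")) // 2
    return units(after) / units(before)


def nf(form: str, s: str) -> str:
    """Shorthand for unicodedata.normalize."""
    return unicodedata.normalize(form, s)


rows = [
    ("Lower", "\u023a", "\u023a".lower()),
    ("Upper", "\u0390", "\u0390".upper()),
    ("NFC", "\U0001d160", nf("NFC", "\U0001d160")),
    ("NFC", "\ufb2c", nf("NFC", "\ufb2c")),
    ("NFD", "\u0390", nf("NFD", "\u0390")),
    ("NFD", "\u1f82", nf("NFD", "\u1f82")),
    ("NFKC", "\ufdfa", nf("NFKC", "\ufdfa")),
]

for op, before, after in rows:
    print(
        f"{op:5} U+{ord(before):04X} "
        f"UTF-8 x{utf8_factor(before, after):g} "
        f"UTF-16 x{utf16_factor(before, after):g}"
    )
```

When sizing an output buffer, the worst-case factor for the operation and encoding in use is the number to multiply by, not the factor observed on typical input.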

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting “white space”
  - A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
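The U+180E case is a parsing differential: the filter and the browser disagree about what counts as white space. The sketch below models that with a hypothetical tokenizer (`attribute_names`, invented for this example) run twice, once with the white space characters named by HTML 4.01 plus CR and LF, and once with U+180E added to mimic the lenient behaviour described for Opera. It is not a real HTML parser.

```python
# Space, tab, form feed and zero-width space are the four white space
# characters named by HTML 4.01; CR and LF are added as line breaks.
HTML4_SPACE = "\u0020\u0009\u000c\u200b\r\n"


def attribute_names(tag_body: str, whitespace: str) -> list:
    """Split a tag body on `whitespace` and return the token names."""
    table = {ord(ch): " " for ch in whitespace}
    tokens = tag_body.translate(table).split(" ")
    return [tok.split("=", 1)[0] for tok in tokens if tok]


tag_body = "a href=\u180eonclick=alert()"

# What a spec-following filter sees: no onclick attribute at all.
strict_view = attribute_names(tag_body, HTML4_SPACE)
print(strict_view)

# What a lenient parser sees if it treats U+180E as white space.
lenient_view = attribute_names(tag_body, HTML4_SPACE + "\u180e")
print(lenient_view)
```

A filter is only safe if its notion of white space is at least as broad as that of every consumer downstream.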

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep is based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
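A short Python sketch of why a charset mismatch matters and what "force UTF-8, error if uncertain" can look like. The two bytes used as sample input and the `decode_body` helper are chosen for this example only.

```python
def decode_body(raw: bytes) -> str:
    """Decode strictly as UTF-8 and raise rather than guess another charset."""
    return raw.decode("utf-8", errors="strict")


# Header value to send with every response so user-agents need not sniff.
CONTENT_TYPE = "text/html; charset=utf-8"

ambiguous = b"\x82\xa0"

# The same two bytes mean different things under different labels.
as_shift_jis = ambiguous.decode("shift_jis")
as_latin1 = ambiguous.decode("iso-8859-1")
print(repr(as_shift_jis), repr(as_latin1))

# Under the forced-UTF-8 policy the input is rejected outright.
try:
    decode_body(ambiguous)
    rejected = False
except UnicodeDecodeError:
    rejected = True
print("rejected:", rejected)
```

Rejecting here is the "error if uncertain" branch: the bytes are not valid UTF-8, so no attempt is made to reinterpret them.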

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security

Page 50: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directional properties canonical mappings

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

http://www.google.com

Katakana Dot: U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org

Katakana No: U+FF89

(However, punctuation not required…)
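A minimal detection sketch for the lookalikes listed on these slides. The table `SYNTAX_LOOKALIKES` holds only the code points named above plus U+30CE, which is added here as an assumption because it is the NFKC form of U+FF89; a real detector would need a far larger data set. The function name is invented for this example.

```python
# Code points named on the slides, mapped to the ASCII character they imitate.
# U+30CE is an addition for this sketch (NFKC form of U+FF89).
SYNTAX_LOOKALIKES = {
    "\u2571": "/",
    "\u3033": "/",
    "\ua738": "/",
    "\ua739": "/",
    "\uff89": "/",
    "\u30ce": "/",
    "\uff65": ".",
}


def find_lookalikes(host: str) -> list:
    """Return (index, code point) pairs for each syntax lookalike in `host`."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(host)
        if ch in SYNTAX_LOOKALIKES
    ]


spoofed = "www.google.com\uff89path\uff89file.nottrusted.org"
print(find_lookalikes(spoofed))
print(find_lookalikes("www.google.com"))
```

Any hit inside a host name is a reason to display the Punycode form or to warn the user, since a legitimate label rarely needs a character that imitates URL syntax.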

Attack Vectors: IDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode
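What DNS sees can be illustrated with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. This is a sketch: the "registrable domain" below is naively taken as the last two labels, which is an assumption that only holds for simple cases like this one (real code would consult the Public Suffix List).

```python
spoofed = "www.google.com\uff89path\uff89file.nottrusted.org"

# ToASCII conversion: each non-ASCII label becomes an xn-- (Punycode) label.
ascii_host = spoofed.encode("idna").decode("ascii")
print(ascii_host)

labels = ascii_host.split(".")

# Naive guess at the domain that is actually being contacted.
registrable = ".".join(labels[-2:])
print(registrable)
```

The user reads "www.google.com" at the start of the address bar, while the name actually resolved belongs to nottrusted.org.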

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
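A small sketch that reports the three invisible characters listed above. The set is limited to those three code points on purpose; the function name `find_invisibles` is invented here, and a fuller check would cover more format and space characters.

```python
import unicodedata

# Only the three code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list:
    """Return (index, code point, character name) for each invisible found."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]


# Two strings that render identically in most fonts.
print(find_invisibles("pay\u200bpal"))
print(find_invisibles("paypal"))
```

Because the two strings compare as different but display the same, a check like this belongs wherever user-visible identifiers are registered or compared.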

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

The rendered glyph looks like <U+006F U+0304> but is <U+006F U+0304 U+0304>

• Order of combining marks
  - ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
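Both combining-mark tricks can be reproduced with `unicodedata`. The detector `has_repeated_marks` is a hypothetical helper written for this sketch; it only catches the same mark appearing twice in a row, which is one narrow heuristic among many possible ones.

```python
import unicodedata


def has_repeated_marks(text: str) -> bool:
    """True if the same combining mark occurs twice in a row after NFD."""
    previous = None
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and ch == previous:
            return True
        previous = ch
    return False


# Multiple combining marks: two macrons usually render like one.
print(has_repeated_marks("o\u0304"), has_repeated_marks("o\u0304\u0304"))

# Order of combining marks: the first mark after the base is the one
# that gets composed, so the two orders normalize to different strings.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print([f"U+{ord(c):04X}" for c in first])
print([f"U+{ord(c):04X}" for c in second])
```

The two normalized results stay distinct because both marks share the same combining class and are therefore never reordered.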

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  - “These characters are to be avoided wherever possible because of security concerns”
  - Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
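A minimal check for the two override characters, using the bidirectional class reported by `unicodedata`. The file-name example is an illustration added here, not one from the slides.

```python
import unicodedata

# Bidi classes of U+202D and U+202E respectively.
OVERRIDE_CLASSES = {"LRO", "RLO"}


def contains_bidi_override(text: str) -> bool:
    """True if `text` contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


# Stored order is "invoice" + RLO + "txt.exe"; a bidi-aware display may
# render the tail reversed, so the name appears to end in "txt".
suspicious_name = "invoice\u202etxt.exe"

print(contains_bidi_override(suspicious_name))
print(contains_bidi_override("invoice.txt"))
```

Rejecting such input is usually simpler than trying to render it safely.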

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
  U+2032 PRIME

Best-fit mappings: Windows best-fit and P/Invoke

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
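The best-fit problem can be modelled in a few lines of Python. Python's own codecs do not perform best-fit mapping, so `simulated_best_fit` below is a hypothetical stand-in whose table contains only the three examples from these slides; `css_filter_allows` is likewise an invented blacklist. The strict encoder at the end shows the safer alternative of failing on unmappable characters.

```python
# Hypothetical best-fit table holding only the examples from the slides.
BEST_FIT = {0x03C3: "s", 0x2032: "'", 0xFF4D: "m"}


def simulated_best_fit(text: str) -> str:
    """Stand-in for a lossy charset conversion that best-fits characters."""
    return text.translate(BEST_FIT)


def css_filter_allows(value: str) -> bool:
    """Hypothetical blacklist applied before the conversion."""
    return "-moz-binding" not in value.lower()


def to_legacy_strict(text: str, charset: str = "cp1252") -> bytes:
    """Encode without any fallback: unmappable characters raise an error."""
    return text.encode(charset, errors="strict")


payload = "-\uff4doz-binding()"

allowed = css_filter_allows(payload)
converted = simulated_best_fit(payload)
print(allowed, converted)

try:
    to_legacy_strict(payload)
    strict_rejected = False
except UnicodeEncodeError:
    strict_rejected = True
print("strict encoder rejected payload:", strict_rejected)
```

Failing in the encoder plays the same role as an exception-throwing encoder fallback or the WC_NO_BEST_FIT_CHARS flag: the lossy substitution never happens.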

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enable code execution

NFD  - Decompose (canonical)
NFC  - Decompose (canonical), recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), recompose

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
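The ordering rule can be demonstrated directly. In this sketch `contains_markup` is a hypothetical validator and the two pipeline functions are invented names; the only difference between them is whether NFKC runs before or after validation.

```python
import unicodedata


def contains_markup(value: str) -> bool:
    """Hypothetical validator: rejects any '<'."""
    return "<" in value


def unsafe_pipeline(value: str) -> str:
    """Validates first, normalizes afterwards (the dangerous order)."""
    if contains_markup(value):
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", value)


def safe_pipeline(value: str) -> str:
    """Normalizes first, then validates the normalized string."""
    normalized = unicodedata.normalize("NFKC", value)
    if contains_markup(normalized):
        raise ValueError("markup rejected")
    return normalized


payload = "\ufe64script>"  # SMALL LESS-THAN SIGN + "script>"

leaked = unsafe_pipeline(payload)
print(leaked)

try:
    safe_pipeline(payload)
    caught = False
except ValueError:
    caught = True
print("safe pipeline rejected payload:", caught)

# Canonical decomposition example from the slides.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "\u0130")])
```

The safe variant also returns the normalized string, so later stages work on exactly the text that was validated.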

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
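A sketch of how C0 A7 turns into an apostrophe. `naive_two_byte_decode` is a hypothetical lenient decoder that applies the two-byte bit layout without the shortest-form check; Python's built-in decoder is shown as the validating counterpart that throws on error.

```python
def naive_two_byte_decode(lead: int, trail: int) -> str:
    """Hypothetical lenient decoder: no shortest-form check."""
    # 110xxxxx 10yyyyyy -> xxxxxyyyyyy
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = bytes([0xC0, 0xA7])

# The lenient decoder produces U+0027, the apostrophe.
lenient_result = naive_two_byte_decode(overlong[0], overlong[1])
print(repr(lenient_result))

# A validating decoder refuses the sequence.
try:
    overlong.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print("strict decoder rejected overlong form:", rejected)
```

If the application layer checks the raw bytes for 0x27 and a later layer decodes leniently, the apostrophe reaches the database unchecked.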

Attack Vectors: Directory traversal

How many ways can you say "../"?

Attack Vectors

• Directory traversal test cases
  - http://site/root/../system
  - Overlong UTF-8 of U+002E: %C0%AE in place of a dot (the overlong form of U+002F would be %C0%AF)
  - Full-width solidus U+FF0F (normalized under NFKC or NFKD): %EF%BC%8F in place of the slash
  - Division slash U+2215 (best-fit): %E2%88%95 in place of the slash
  - Two dot leader U+2025 (normalized under NFKC or NFKD): %E2%80%A5 in place of the two dots
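The test cases above suggest canonicalizing a path before checking it. The sketch below percent-decodes, decodes strictly as UTF-8, applies NFKC and only then looks for ".." segments. The function names and the exact sample paths are assumptions made for this example. Note that the division slash is untouched by NFKC, so a system that best-fits later would need an additional check.

```python
import unicodedata
from urllib.parse import unquote_to_bytes


def canonical_path(raw_path: str) -> str:
    """Percent-decode, decode strictly as UTF-8, then apply NFKC."""
    decoded = unquote_to_bytes(raw_path).decode("utf-8", errors="strict")
    return unicodedata.normalize("NFKC", decoded)


def is_traversal(raw_path: str) -> bool:
    """True if the canonical path has a '..' segment or is ill-formed."""
    try:
        path = canonical_path(raw_path)
    except UnicodeDecodeError:
        return True  # ill-formed or overlong UTF-8: reject
    return ".." in path.split("/")


samples = [
    "/root/../system",             # plain traversal
    "/root/%C0%AE%C0%AE/system",   # overlong dots
    "/root/..%EF%BC%8Fsystem",     # full-width solidus
    "/root/%E2%80%A5/system",      # two dot leader
    "/root/..%E2%88%95system",     # division slash (not changed by NFKC)
    "/root/system",                # benign
]

results = [is_traversal(p) for p in samples]
for path, flagged in zip(samples, results):
    print(f"{path:32} {flagged}")
```

The division slash sample is reported as not a traversal, which is correct for a pure Unicode pipeline but shows why best-fit conversions must be disabled or checked separately.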

Page 51: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

U+2571 Box Drawings

〳 U+3033 Kana Repeat Mark

Ꜹ U+A738 LATIN CAPITAL AV

ꜹ U+A739 LATIN SMALL AV

U+FF65 KATAKANA MIDDLE DOT

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with and lookalikes

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes

httpwwwgooglecom

Katakana DotU+FF65

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing
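A sketch of the casing bypass, under one stated assumption: `simple_lower` models a lowercasing routine that maps U+0130 directly to `i` (the simple case mapping in the Unicode Character Database). Python's own `str.lower()` applies the full mapping and yields `i` plus U+0307 instead, so the model is written out explicitly. All names are hypothetical.

```python
def simple_lower(text: str) -> str:
    """Model (assumption) of a lowercasing routine that maps U+0130 to plain 'i'."""
    return text.replace("\u0130", "i").lower()


def contains_script(text: str) -> bool:
    """Toy validator: looks for the literal token 'script'."""
    return "script" in text


payload = "scr\u0130pt"

# Validate first, lowercase later: the validator misses the token.
bypassed = (not contains_script(payload)) and contains_script(simple_lower(payload))

# Lowercase first, validate later: the token is caught.
caught = contains_script(simple_lower(payload))

if __name__ == "__main__":
    print("validate-then-case bypassed:", bypassed)
    print("case-then-validate caught:", caught)
    print("Python full mapping:", [hex(ord(c)) for c in "\u0130".lower()])
```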

len(x) != len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36
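The casing factors in the table can be checked with a few lines of Python. `utf8_growth` is an illustrative helper name. The sketch measures how many more UTF-8 bytes a string needs after a case operation.

```python
def utf8_growth(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after and before a transformation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


lower_sample = "\u023a"   # Ⱥ, lowercases to U+2C65 (2 bytes become 3)
upper_sample = "\u0390"   # ΐ, uppercases to three code points

if __name__ == "__main__":
    print("lower, UTF-8:", utf8_growth(lower_sample, lower_sample.lower()))
    print("upper, UTF-8:", utf8_growth(upper_sample, upper_sample.upper()))
    print("upper, code points:", len(upper_sample.upper()))
```

A buffer sized from the input length alone is therefore too small once the text has been case-mapped.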

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
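The normalization factors can be measured the same way. This is a sketch using the standard `unicodedata` module, and `expansion` is a hypothetical helper. It returns the growth in UTF-8 bytes and in UTF-16 code units for a single character under a given normalization form.

```python
import unicodedata


def expansion(ch: str, form: str) -> tuple:
    """Return (UTF-8 byte growth, UTF-16 code-unit growth) for one normalization form."""
    out = unicodedata.normalize(form, ch)
    utf8 = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    # utf-16-le emits two bytes per code unit and no BOM, so the byte ratio
    # equals the code-unit ratio.
    utf16 = len(out.encode("utf-16-le")) / len(ch.encode("utf-16-le"))
    return utf8, utf16


if __name__ == "__main__":
    for label, ch, form in [
        ("U+0390", "\u0390", "NFD"),
        ("U+1F82", "\u1f82", "NFD"),
        ("U+FDFA", "\ufdfa", "NFKC"),
    ]:
        print(label, form, expansion(ch, form))
```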

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS: U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
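A sketch of what a charset mismatch means at the byte level, using Python's built-in codecs. The two-byte input is an illustrative choice, not taken from the talk. Decoded as ISO-8859-1 it contains a backslash, while decoded as Shift_JIS it is a single katakana letter, so a filter and a browser that disagree on the charset see different text.

```python
raw = b"\x83\x5c"

# A component trusting the HTTP header decodes as ISO-8859-1.
as_latin1 = raw.decode("iso-8859-1")

# A component trusting the meta tag decodes as Shift_JIS.
as_sjis = raw.decode("shift_jis")

if __name__ == "__main__":
    print("ISO-8859-1:", [hex(ord(c)) for c in as_latin1])
    print("Shift_JIS: ", [hex(ord(c)) for c in as_sjis])
```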

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom

Katakana Dot: U+FF65

(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

DEMO

IDN Visual Spoofing

• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
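A rough heuristic for flagging invisibles such as the three listed above can be built on Unicode general categories. The function name and the choice of categories (`Cf` format characters and `Zs` space separators other than U+0020) are assumptions for illustration, and they do not describe how the Visual Spoofing Detection API works.

```python
import unicodedata

# Assumption: format characters (Cf) and space separators (Zs) other than the
# ordinary space are worth flagging. Depending on the Unicode version,
# U+180E is classified as Zs or Cf, so both categories are included.
INVISIBLE_CATEGORIES = {"Cf", "Zs"}


def find_invisibles(text: str) -> list:
    """Return (index, 'U+XXXX') pairs for characters that render as nothing or as blank space."""
    hits = []
    for index, ch in enumerate(text):
        if ch != " " and unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


if __name__ == "__main__":
    print(find_invisibles("goo\u200bgle"))
    print(find_invisibles("a\ufeffb\u180ec"))
```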

Attack Vectors: Visual Spoofing: Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing: Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304

ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks

• Order of combining marks
  – ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
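The ordering example can be reproduced with Python's `unicodedata` module. Both marks have the same combining class, so normalization keeps their order and composes only the first one with the base letter. The variable names are illustrative.

```python
import unicodedata

diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")

if __name__ == "__main__":
    for label, value in [("0308 then 0311", diaeresis_first),
                         ("0311 then 0308", breve_first)]:
        print(label, "->", " ".join(f"U+{ord(c):04X}" for c in value))
```

The two results can render alike yet never compare equal, which is why the order of combining marks is a spoofing concern.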

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible because of security concerns”
  – forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
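A minimal sketch of screening input for the two override characters named above. The helper name and the sample string are hypothetical.

```python
BIDI_OVERRIDES = {
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
}


def find_bidi_overrides(text: str) -> list:
    """Return (index, name) for every explicit directional override in the text."""
    return [(i, BIDI_OVERRIDES[ch]) for i, ch in enumerate(text) if ch in BIDI_OVERRIDES]


if __name__ == "__main__":
    # The override makes the characters after it display in reverse order.
    print(find_bidi_overrides("abc\u202etxt.exe"))
```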

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

.NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?

• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
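The guidance above names .NET and Win32 mechanisms. As a language-neutral illustration of the same idea (refuse the conversion rather than substitute a lookalike), here is a sketch using Python's codecs, which raise on unmappable characters when `errors="strict"`. The function `to_legacy` and the `cp1252` default are assumptions, not part of the talk.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, failing instead of substituting lookalikes."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(f"U+{ord(bad):04X} has no exact mapping in {charset}") from exc


if __name__ == "__main__":
    print(to_legacy("caf\u00e9"))
    try:
        to_legacy("\u2329script")  # U+2329 would best-fit to '<' on some platforms
    except ValueError as err:
        print("rejected:", err)
```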

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking: Best-fit mappings

-moz-binding()

was not allowed but…

-[U+ff4d]oz-binding()

would best-fit map

Case Study: Social Networking: Best-fit mappings

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

Root Causes: Normalization

toNFKC(“﹤script>”) = “<script>”

Root Causes: Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
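"Normalize before validation" can be sketched as follows with the standard `unicodedata` module. The validator `is_safe` is a toy example with an assumed policy (no angle brackets), not a complete XSS filter.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy validator: normalize to NFKC first, then look for angle brackets."""
    normalized = unicodedata.normalize("NFKC", text)
    return "<" not in normalized and ">" not in normalized


payload = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN

if __name__ == "__main__":
    print(repr(unicodedata.normalize("NFKC", payload)))
    print("safe:", is_safe(payload))
```

Checking the raw string for `<` and normalizing afterwards would accept this payload, since U+FE64 is not U+003C until NFKC is applied.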

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8
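The arithmetic behind C0 A7 turning into 27 can be shown with a deliberately lenient two-byte decoder. Both functions below are simplified, hypothetical sketches. A full decoder would also check the lead-byte range and the continuation-byte pattern.

```python
def lenient_two_byte_decode(b0: int, b1: int) -> int:
    """Combine the payload bits of a 2-byte sequence with no shortest-form check."""
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


def strict_two_byte_decode(b0: int, b1: int) -> int:
    """Same arithmetic, but reject values that fit in a single byte."""
    code_point = lenient_two_byte_decode(b0, b1)
    if code_point < 0x80:
        raise ValueError("non-shortest form UTF-8")
    return code_point


if __name__ == "__main__":
    print(hex(lenient_two_byte_decode(0xC0, 0xA7)))  # the apostrophe reappears
    try:
        strict_two_byte_decode(0xC0, 0xA7)
    except ValueError as err:
        print("rejected:", err)
```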

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8


Page 53: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

Browser sees and displays a valid IDN

DNS sees Punycode

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g., cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Safari BOM injection for XSS
DEMO

A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
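
A hedged Python sketch of why casing belongs before validation. Python applies full case mappings, so its `lower()` turns İ (U+0130) into "i" plus U+0307 rather than a bare "i"; the example therefore substitutes U+017F LATIN SMALL LETTER LONG S, which upper-cases to ASCII "S", as a stand-in for the slide's İ case. `is_blocked` is a hypothetical filter meant to run on upper-cased text.

```python
def is_blocked(text: str) -> bool:
    """Hypothetical filter: reject the keyword once text is upper-cased."""
    return "SCRIPT" in text

payload = "<\u017fcript>"  # long s instead of ASCII 's'

# Validating first and casing later lets the payload through.
passes_before_casing = not is_blocked(payload)
upper_cased = payload.upper()

# Casing first and validating the result catches it.
caught_after_casing = is_blocked(upper_cased)

# Casing can also change the length of a string.
dotted_capital_i = "\u0130"
lowered = dotted_capital_i.lower()
```

The last two lines illustrate len(x) != len(toLower(x)): one code point in, two code points out.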

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
           16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
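
The expansion factors quoted from UTR-36 can be reproduced with Python's standard library. This is a sketch for checking buffer-size assumptions; the helper names `code_units`, `expansion`, and `nf` are illustrative, and the results depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

def code_units(text: str, encoding: str) -> int:
    """Count code units of text in the given encoding form."""
    width = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}[encoding]
    return len(text.encode(encoding)) // width

def expansion(sample: str, transform, encoding: str) -> float:
    """Ratio of code units after and before applying transform."""
    return code_units(transform(sample), encoding) / code_units(sample, encoding)

def nf(form: str):
    """Build a normalization function for the given form."""
    return lambda s: unicodedata.normalize(form, s)

lower_utf8 = expansion("\u023a", str.lower, "utf-8")       # Ⱥ
upper_utf16 = expansion("\u0390", str.upper, "utf-16-le")  # ΐ
nfd_utf16 = expansion("\u1f82", nf("NFD"), "utf-16-le")    # ᾂ
nfkc_utf8 = expansion("\ufdfa", nf("NFKC"), "utf-8")       # ﷺ
nfkc_utf16 = expansion("\ufdfa", nf("NFKC"), "utf-16-le")
```

A buffer sized from the input length alone is too small for any of these cases; size it from the transformed string, or from the worst-case factor for the operation.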

Root Causes: Controlling Syntax
• White space and line breaks
  – E.g., when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

Opera White Space Characters
DEMO

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
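
One way to "be careful" is to allow only the white space a specification names and flag everything else that could act as a separator. The sketch below assumes the HTML 4.01 white space set (space, tab, form feed, zero-width space, with CR and LF handled as line breaks); the function name and the choice of Unicode categories to flag are assumptions made for illustration.

```python
import unicodedata

# White space characters named by HTML 4.01; CR and LF are line breaks.
HTML401_WHITE_SPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}
LINE_BREAKS = {"\r", "\n"}

# Separator, format, and control categories that a lenient parser
# might treat as white space.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def unexpected_separators(markup: str):
    """Return (index, code point) pairs for separator-like characters
    outside the explicit allowlist."""
    hits = []
    for index, ch in enumerate(markup):
        if ch in HTML401_WHITE_SPACE or ch in LINE_BREAKS:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits

opera_case = "<a href=\u180eonclick=alert()>"
```

For the Opera test string this reports U+180E at index 8, the position where a lenient parser would start a new attribute.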

Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
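
A short Python sketch of why a mismatch between the declared and the assumed charset matters, and of the "force UTF-8, error if uncertain" guidance. The byte pair is a well-known Shift_JIS example chosen for illustration; `content_type_header` and `decode_utf8_or_fail` are hypothetical helpers.

```python
# The same two bytes read under the two charsets named on the slide.
raw = b"\x83\x5c"

as_shift_jis = raw.decode("shift_jis")   # one katakana character
as_latin1 = raw.decode("iso-8859-1")     # two characters, ending in a backslash

def content_type_header(media_type: str = "text/html") -> str:
    """Always declare UTF-8 explicitly so the user-agent need not sniff."""
    return f"Content-Type: {media_type}; charset=utf-8"

def decode_utf8_or_fail(body: bytes) -> str:
    """Refuse input that is not well-formed UTF-8 instead of guessing."""
    return body.decode("utf-8", errors="strict")
```

A filter that reads the bytes as ISO-8859-1 sees a trailing backslash; a consumer that reads them as Shift_JIS sees none, so the two layers disagree about where escapes and quotes fall.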

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com

Page 54: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
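
The split between what the user sees and what DNS sees can be shown with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The spoofed name below is an illustrative assumption: it swaps the Latin "a" in example.com for U+0430 CYRILLIC SMALL LETTER A.

```python
genuine = "example.com"
spoof = "ex\u0430mple.com"  # Cyrillic 'а' in place of Latin 'a'

# What goes on the wire: ASCII-compatible encoding (Punycode labels).
wire_genuine = genuine.encode("idna")
wire_spoof = spoof.encode("idna")

# What a browser may display after converting back to Unicode.
displayed_spoof = wire_spoof.decode("idna")
```

The two names can render identically, yet the wire forms differ: the genuine name stays plain ASCII while the spoof becomes an `xn--` label.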

IDN Visual Spoofing
DEMO

IDN Visual Spoofing: Solutions and Defenses (yes there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
o looks like <U+006F U+0304>
o is <U+006F U+0304 U+0304>

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
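
Both combining-mark tricks can be checked with Python's `unicodedata` module. This is a minimal sketch; `repeated_marks` is a hypothetical helper that decomposes the text first so that precomposed letters do not hide a repeated mark.

```python
import unicodedata

def repeated_marks(text: str):
    """Return indexes (in the NFD form) where a combining mark
    immediately repeats the mark before it."""
    decomposed = unicodedata.normalize("NFD", text)
    return [
        i for i in range(1, len(decomposed))
        if unicodedata.combining(decomposed[i])
        and decomposed[i] == decomposed[i - 1]
    ]

# Two macrons can render like one.
double_macron = "o\u0304\u0304"

# The same two marks in a different order normalize to different strings.
diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")
```

Because U+0308 and U+0311 share a combining class, normalization keeps their order, so the two visually similar inputs never compare equal.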

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
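
Explicit directional controls are easy to detect because the Unicode Character Database gives each one its own bidirectional class. The sketch below uses Python's `unicodedata.bidirectional`; the function name and the decision to flag embeddings along with overrides are assumptions.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls,
# plus the pop character that terminates them.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}

def bidi_controls(text: str):
    """Return (index, code point) pairs for explicit bidi controls."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES
    ]

# A right-to-left override hidden inside a file name.
suspicious = "abc\u202etxt.exe"
```

A non-empty result for a host name, file name, or identifier is a reasonable ground to reject or escape the string before display.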

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enable code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
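
The guidance above targets Windows and .NET APIs. As a language-neutral analogue, the Python sketch below shows the same policy: when text must be converted to a legacy code page, fail on characters that have no exact mapping rather than substituting a lookalike. Python's codecs do not perform best-fit substitution, so this illustrates the desired behaviour only; `to_legacy` is a hypothetical helper.

```python
def to_legacy(text: str, codec: str = "cp1252") -> bytes:
    """Encode for a legacy code page, raising instead of substituting."""
    return text.encode(codec, errors="strict")

allowed = "-moz-binding"
fullwidth = "-\uff4doz-binding"   # U+FF4D FULLWIDTH LATIN SMALL LETTER M
prime = "\u2032"                  # U+2032 PRIME
```

With a strict conversion the full-width "m" and the prime raise an error instead of quietly turning into ASCII "m" and an apostrophe after the filter has already run.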

Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enable code execution

Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)

Root Causes: Normalization
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
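
A minimal Python sketch of "normalize before validation", using the U+FE64 example. The filter `is_blocked` and the wrapper `validate` are hypothetical names; the normalization calls are from the standard `unicodedata` module.

```python
import unicodedata

def is_blocked(text: str) -> bool:
    """Hypothetical filter that rejects a literal <script tag."""
    return "<script" in text.lower()

def validate(text: str) -> str:
    """Normalize first, then validate the normalized form and return it."""
    normalized = unicodedata.normalize("NFKC", text)
    if is_blocked(normalized):
        raise ValueError("blocked markup after normalization")
    return normalized

payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN

# Canonical decomposition of U+0130 for comparison with the slide.
dotted_i_nfd = unicodedata.normalize("NFD", "\u0130")
```

Returning the normalized string matters: later stages should work on the same form that was validated, not on the raw input.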

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enable code execution
Application gets C0A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
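
The arithmetic behind an overlong sequence, and the "throw on error" guidance, can be sketched in Python. `lax_two_byte` is a hypothetical model of what a non-validating decoder computes; `decode_strict` relies on Python's built-in decoder, which rejects non-shortest forms.

```python
def lax_two_byte(b1: int, b2: int) -> int:
    """Code point a non-validating decoder derives from a 2-byte
    sequence: 5 payload bits from the lead byte, 6 from the trail byte."""
    return ((b1 & 0x1F) << 6) | (b2 & 0x3F)

def decode_strict(data: bytes) -> str:
    """Validate UTF-8 and raise on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

apostrophe = lax_two_byte(0xC0, 0xA7)  # 0x27
full_stop = lax_two_byte(0xC0, 0xAE)   # 0x2E
solidus = lax_two_byte(0xC0, 0xAF)     # 0x2F
```

Under the lax rule C0 A7 yields the apostrophe (0x27), C0 AE yields the full stop (0x2E), and C0 AF yields the solidus (0x2F). A strict decoder refuses all three, so nothing downstream ever sees the hidden ASCII character.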

Attack Vectors: Directory traversal
How many ways can you say


Page 55: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

IDN Visual Spoofing

March 2009 copy 2009 Chris Weber

bull Visual Spoofing Detection API

ndash Detects Confusables

ndash Detects Invisibles

ndash Detections syntax and punctuation lookalikes

ndash Detects combining mark tricks

bull Currently in testing

bull Release planned for Fall 2009

wwwcasabasecuritycom

IDN Visual SpoofingSolutions and Defenses (yes there is one)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
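A small Python sketch of why the order of casing and validation matters. The blocklist and function names are hypothetical. Note that Python's default lowercase mapping turns U+0130 into "i" plus U+0307, so the sketch uses U+212A (KELVIN SIGN), which lowercases to a plain "k", to play the role the dotted capital I plays on the slide; other frameworks and locales may map these characters differently.

```python
BLOCKLIST = ("onclick", "script")  # hypothetical filter terms


def validate_then_lower(value: str) -> str:
    """Unsafe order: the check runs before the casing operation."""
    if any(term in value for term in BLOCKLIST):
        raise ValueError("blocked term")
    return value.lower()


def lower_then_validate(value: str) -> str:
    """Order recommended above: case first, then validate the result."""
    lowered = value.lower()
    if any(term in lowered for term in BLOCKLIST):
        raise ValueError("blocked term")
    return lowered


tricky = "onclic\u212a=x"            # ends in KELVIN SIGN, not ASCII 'k'
print(validate_then_lower(tricky))   # the blocked term appears after casing
print(len("\u0130"), len("\u0130".lower()))  # casing can change string length
```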

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enable code execution

Casing: maximum expansion factors

Operation   UTF          Factor   Sample
Lower       8            1.5      U+023A Ⱥ
            16, 32       1        U+0041 A
Upper       8, 16, 32    3        U+0390 ΐ

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF          Factor   Sample
NFC         8            3X       U+1D160
            16, 32       3X       U+FB2C
NFD         8            3X       U+0390 ΐ
            16, 32       4X       U+1F82 ᾂ
NFKC/NFKD   8            11X      U+FDFA ﷺ
            16, 32       18X      U+FDFA ﷺ

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
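The expansion factors in the tables can be reproduced with Python's standard `unicodedata` module. This is a sketch for checking buffer-size assumptions; the helper name `expansion` is hypothetical, and the little-endian codec names are used only so that no byte order mark is counted.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


# U+FDFA is the worst case listed for NFKC/NFKD.
for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    print(encoding, expansion("\ufdfa", "NFKC", encoding))

# Casing expands too: U+0390 uppercases to three code points.
upper = "\u0390".upper()
print(len(upper), len(upper.encode("utf-8")) / len("\u0390".encode("utf-8")))
```

A buffer sized from the input length alone is too small in each of these cases; sizing from the length of the transformed string avoids the assumption.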

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
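One way to act on "be careful" is to flag any character that a parser might treat as a separator even though a filter does not. The Python sketch below is an assumption about how such a check could look, not the method used in the talk: it reports every space-separator or format character other than the ASCII space, using the general categories from the standard `unicodedata` module.

```python
import unicodedata

# Categories that commonly hide separators: space, line, and paragraph
# separators plus format characters such as U+FEFF and U+200B.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def suspicious_separators(text: str) -> list:
    """Return (index, code point) pairs for non-ASCII separator-like characters."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch != " " and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

The category of U+180E has changed between Unicode versions (space separator in older data, format character in newer data), which is why the sketch checks both.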

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
    • This could have been a problem
      – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
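"Force UTF-8" and "error if uncertain" can be combined in a small consistency check. The following Python sketch is hypothetical: the function names and the simple regular expression are illustrative assumptions, not a full HTTP or HTML parser.

```python
import re

# Simplified pattern: finds charset=<label> in a header value or a meta tag.
_CHARSET_RE = re.compile(r"""charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE)


def declared_charset(value: str):
    """Return the lowercased charset label found in value, or None."""
    match = _CHARSET_RE.search(value)
    return match.group(1).lower() if match else None


def check_charsets(header_value: str, meta_tag: str) -> str:
    """Fail unless the header and the meta tag both say UTF-8."""
    header = declared_charset(header_value)
    meta = declared_charset(meta_tag)
    if header is None or meta is None:
        raise ValueError("charset not declared: refusing to guess")
    if header != meta:
        raise ValueError(f"charset mismatch: {header} vs {meta}")
    if header != "utf-8":
        raise ValueError(f"unexpected charset: {header}")
    return header


print(check_charsets("text/html; charset=UTF-8", '<meta charset="utf-8">'))
```

The mismatched pair from the slide (ISO-8859-1 in the header, shift_jis in the meta tag) is rejected rather than left for the user agent to sniff.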

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles

U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  ō looks like U+006F U+0304
  ō̄ is U+006F U+0304 U+0304

• Order of combining marks
  – ȏ and ö under NFKC
  <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  <U+006F U+0311 U+0308> → <U+020F U+0308>
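The two NFKC results listed for the reordered combining marks can be checked with Python's standard `unicodedata` module. The helpers `code_points` and `has_repeated_mark` are hypothetical names used only for this sketch.

```python
import unicodedata


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def has_repeated_mark(text: str) -> bool:
    """True when the same combining mark appears twice in a row."""
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(text, text[1:])
    )


first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print(code_points(first))
print(code_points(second))
print(has_repeated_mark("o\u0304\u0304"))
```

Both marks sit in the same combining class, so normalization keeps their order and the two sequences stay distinct even though they can render alike.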

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
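Detecting the two explicit override characters is straightforward with the bidirectional class reported by Python's `unicodedata` module. This is a minimal sketch; the function name `bidi_overrides` is hypothetical, and a real check would probably also consider the other directional formatting characters.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}  # bidi classes of U+202D and U+202E


def bidi_overrides(text: str) -> list:
    """Return (index, character name) for each explicit directional override."""
    return [
        (index, unicodedata.name(ch))
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in OVERRIDE_CLASSES
    ]


print(bidi_overrides("abc\u202edef"))
```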

Root Causes: Best-fit mappings

• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
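The guidance amounts to failing when a character has no exact counterpart in the target charset instead of substituting a lookalike. As a rough analogue in Python (whose codecs do not apply best-fit mappings), a strict encode raises an error for such characters. The function name `to_legacy` and the choice of `cp1252` are assumptions made for this sketch.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Encode for a legacy charset, raising instead of substituting."""
    return text.encode(charset, errors="strict")


print(to_legacy("caf\u00e9"))  # representable text encodes normally

for sample in ("\u03c3", "\u2032", "\uff4d"):  # sigma, prime, fullwidth m
    try:
        to_legacy(sample)
    except UnicodeEncodeError as error:
        print(f"rejected U+{ord(sample):04X}: {error.reason}")
```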

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization

• Normalizing strings after validation is dangerous
• Impact: filter evasion, enable code execution

NFD  - Decompose (canonical)
NFC  - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + ◌̇
U+0130 → U+0049 U+0307

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing.

﹤ becomes <
U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC: first defense against visual spoofing
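The mappings above, and the "normalize before validation" rule, can be tried directly with Python's standard `unicodedata` module. The validator below is a deliberately tiny, hypothetical check (it only rejects angle brackets) meant to show the ordering, not a real HTML filter.

```python
import unicodedata


def normalize_then_validate(value: str) -> str:
    """Normalize to NFKC first, then run the (toy) validation on the result."""
    normalized = unicodedata.normalize("NFKC", value)
    if "<" in normalized or ">" in normalized:
        raise ValueError("markup characters present after normalization")
    return normalized


print(unicodedata.normalize("NFKC", "\ufe64script>"))       # compatibility mapping
print(ascii(unicodedata.normalize("NFD", "\u0130")))        # canonical decomposition
print(unicodedata.normalize("NFKC", "-\uff4doz-binding"))   # fullwidth m folds to m
```

Running the check on the raw input instead would pass U+FE64 through untouched, and a later normalization step would then produce the angle bracket.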

Root Causes: Non-shortest form UTF-8

• Non-shortest or overlong UTF-8
• Impact: filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
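A short Python sketch of the C0 A7 case. The first helper shows the arithmetic a lax decoder would perform if it skipped the shortest-form check; the second relies on Python's built-in strict UTF-8 decoder, which rejects overlong sequences. Both function names are hypothetical.

```python
def lax_two_byte_decode(lead: int, trail: int) -> int:
    """What a decoder without a shortest-form check would compute."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def decode_utf8_strict(raw: bytes) -> str:
    """Validate while decoding; raises UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8", errors="strict")


print(hex(lax_two_byte_decode(0xC0, 0xA7)))  # the apostrophe's code point

try:
    decode_utf8_strict(b"\xc0\xa7")
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```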

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enable code execution

Root Causes: Handling the Unexpected (Over-consumption)

• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
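The byte sequence 41 C2 3E 41 makes the problem easy to reproduce. In the Python sketch below, `overconsuming_decode` is a deliberately buggy, hypothetical decoder written only to mimic the behaviour described above, while `decode_with_replacement` uses Python's built-in decoder, which replaces the stray lead byte with U+FFFD and leaves the following byte alone.

```python
def overconsuming_decode(raw: bytes) -> str:
    """Deliberately flawed: a lead byte swallows the next byte even if invalid."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF and i + 1 < len(raw):
            trail = raw[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            i += 2  # BUG: the trail byte is consumed whether or not it was valid
        else:
            i += 1  # anything else is silently dropped
    return "".join(out)


def decode_with_replacement(raw: bytes) -> str:
    """Built-in decoder: only the ill-formed byte is replaced."""
    return raw.decode("utf-8", errors="replace")


data = bytes([0x41, 0xC2, 0x3E, 0x41])
print(ascii(overconsuming_decode(data)))     # the '>' has disappeared
print(ascii(decode_with_replacement(data)))  # the '>' survives
```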

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

U+200B (ZERO WIDTH SPACE)

U+180E (MONGOLIAN VOWEL SEPARATOR)

U+FEFF (ZERO WIDTH NO-BREAK SPACE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Case Study: Opera

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting “white space”

– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications

• Be careful…
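A sketch of why "be careful" matters here: a filter that only knows ASCII white space does not see the Opera payload. The regular expression, the category set and the function names are assumptions made for illustration, not the logic of any real filter or browser.

```python
import re
import unicodedata

# Hypothetical filter rule: an event-handler attribute preceded by ASCII white space.
ASCII_WS_EVENT_HANDLER = re.compile(r"[ \t\r\n\f]on[a-z]+=", re.IGNORECASE)

# Format characters and Unicode separators that a lenient parser might treat as white space.
SUSPICIOUS_CATEGORIES = {"Cf", "Zs", "Zl", "Zp"}


def naive_filter_allows(markup: str) -> bool:
    """Return True when the ASCII-only rule finds nothing to block."""
    return ASCII_WS_EVENT_HANDLER.search(markup) is None


def has_non_ascii_separator(markup: str) -> bool:
    """Return True when the markup contains a non-ASCII format or space character."""
    return any(
        ord(ch) > 0x7F and unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
        for ch in markup
    )


payload = "<a href=\u180Eonclick=alert()>"  # U+180E MONGOLIAN VOWEL SEPARATOR
print(naive_filter_allows(payload), has_non_ascii_separator(payload))
```

Rejecting such characters inside markup outright is one conservative option; whether it fits depends on the text the application has to accept.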

Root Causes: Specifications

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv=Content-Type content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8

• Error if uncertain
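One way to read "force UTF-8, error if uncertain" is sketched below: decode input strictly as UTF-8 and declare the same charset on output. The function name and the header constant are hypothetical; a real application would wire this into its own request and response handling.

```python
# Declared on every response so the user-agent has no reason to sniff.
CONTENT_TYPE = "text/html; charset=utf-8"


def decode_utf8_or_fail(raw: bytes) -> str:
    """Decode strictly as UTF-8; raise UnicodeDecodeError instead of guessing."""
    return raw.decode("utf-8", errors="strict")


print(decode_utf8_or_fail(b"caf\xc3\xa9"))

try:
    decode_utf8_or_fail(b"\x82\xa0")  # valid Shift_JIS, not valid UTF-8
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```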

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools


Tools

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security


Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing – Problematic Font-rendering

• Fonts render glyphs confusingly

• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks

ō looks like U+006F U+0304

ō̄ is U+006F U+0304 U+0304

• Order of combining marks

– ȏ and ö under NFKC

<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>

<U+006F U+0311 U+0308> becomes <U+020F U+0308>
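The two orderings above can be reproduced with Python's unicodedata module. This is a sketch; the variable names are hypothetical.

```python
import unicodedata

diaeresis_first = "o\u0308\u0311"  # o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
breve_first = "o\u0311\u0308"      # same marks in the opposite order

nfkc_diaeresis_first = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_breve_first = unicodedata.normalize("NFKC", breve_first)

for result in (nfkc_diaeresis_first, nfkc_breve_first):
    print(" ".join(f"U+{ord(ch):04X}" for ch in result))
```

Both marks sit in the same combining class, so normalization does not reorder them and the two strings stay distinct even though they can render alike.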

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/

– “These characters are to be avoided wherever possible because of security concerns”

– Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)
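A minimal detection sketch for the two override characters named above. The function name and the sample string are hypothetical, and a fuller check would probably cover the other bidi control characters as well.

```python
BIDI_OVERRIDES = {
    "\u202D": "LEFT-TO-RIGHT OVERRIDE",
    "\u202E": "RIGHT-TO-LEFT OVERRIDE",
}


def find_bidi_overrides(text: str) -> list:
    """Return (index, name) pairs for every explicit override in the text."""
    return [(i, BIDI_OVERRIDES[ch]) for i, ch in enumerate(text) if ch in BIDI_OVERRIDES]


sample = "invoice\u202Efdp.exe"  # hypothetical file name containing an override
print(find_bidi_overrides(sample))
```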

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 (Left-Pointing Angle Bracket) maps to U+003C

• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?

• U+015B (Latin Small Letter S With Acute) maps to U+0073

• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr:

[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end
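The same idea, failing instead of approximating, can be sketched in Python. Python's built-in codecs do not apply best-fit tables, so a strict encode simply raises. This illustrates the principle only and is not the .NET or Win32 API named above; the function name is hypothetical.

```python
def encode_without_best_fit(text: str, codec: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising UnicodeEncodeError on unmappable characters."""
    return text.encode(codec, errors="strict")


print(encode_without_best_fit("plain text"))

for risky in ("\u3008", "\u015B"):  # LEFT ANGLE BRACKET, s WITH ACUTE
    try:
        encode_without_best_fit(risky)
    except UnicodeEncodeError:
        print(f"U+{ord(risky):04X} rejected rather than mapped to a look-alike")
```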

Case Study: Social Networking – Best-fit mappings

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root Cause: best-fit mappings

Case Study: Social Networking – Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
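The bypass can be simulated in a few lines. The mapping table below is a hypothetical stand-in limited to characters named on these slides; real best-fit tables are platform-specific and much larger, and the filter shown is not the site's actual logic.

```python
# Hypothetical best-fit table, restricted to mappings mentioned in the slides.
BEST_FIT = {
    "\uFF4D": "m",  # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",
    "\u3008": "<",
    "\u015B": "s",
    "\u015D": "s",
}


def best_fit(text: str) -> str:
    """Simulate a lossy charset conversion that substitutes look-alike characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def filter_allows(css: str) -> bool:
    """Simplified filter: block any style that contains -moz-binding."""
    return "-moz-binding" not in css.lower()


payload = "-\uFF4Doz-binding()"
print(filter_allows(payload))            # the filter sees no match
print(best_fit(payload))                 # the later conversion recreates the blocked token
print(filter_allows(best_fit(payload)))  # filtering after conversion would catch it
```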

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

NFD - Decompose (canonical)

NFC - Decompose (canonical), Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization

İ becomes I + U+0307 (combining dot above)

U+0130 becomes U+0049 U+0307
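A short sketch that applies the four normalization forms to U+0130, using Python's unicodedata module. The decomposing forms split the character; the composing forms put it back together.

```python
import unicodedata

dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

results = {
    form: unicodedata.normalize(form, dotted_capital_i)
    for form in ("NFD", "NFC", "NFKD", "NFKC")
}

for form, value in results.items():
    print(form, " ".join(f"U+{ord(ch):04X}" for ch in value))
```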

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against Visual spoofing
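The ordering guidance above, sketched with the U+FE64 example from the previous slide. The validator is a deliberately simple stand-in, and the function names are hypothetical.

```python
import unicodedata

BLOCKED = "<script"  # simplified rule, for illustration only


def validate_then_normalize(text: str) -> str:
    """Unsafe order: normalization runs after the check."""
    if BLOCKED in text.lower():
        raise ValueError("blocked markup")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: the check sees the normalized string."""
    normalized = unicodedata.normalize("NFKC", text)
    if BLOCKED in normalized.lower():
        raise ValueError("blocked markup")
    return normalized


payload = "\uFE64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))
```

With normalize_then_validate the same payload raises ValueError.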

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)
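A sketch of the C0 A7 case: a decoder that applies the two-byte bit arithmetic without checking for the shortest form yields an apostrophe, while Python's strict UTF-8 decoder refuses the sequence. The naive function is hypothetical and intentionally wrong.

```python
def naive_two_byte_decode(lead: int, trail: int) -> str:
    """Flawed decoder: combines the payload bits without a shortest-form check."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = bytes([0xC0, 0xA7])

print(repr(naive_two_byte_decode(overlong[0], overlong[1])))

try:
    overlong.decode("utf-8")  # strict by default: throws on error
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```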

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF8 U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
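The byte sequences and normalization results behind three of these test cases can be checked in Python. This is a sketch; the dictionary is just a convenient container.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

for name, ch in CANDIDATES.items():
    utf8_hex = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    print(f"{name}: UTF-8 {utf8_hex}, NFKC {nfkc!r}")
```

U+FF0F and U+2025 turn into "/" and ".." under NFKC. U+2215 has no decomposition, so it only becomes a slash through a best-fit style conversion.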

Root Causes: Handling the Unexpected

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected – Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
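The <41 C2 3E 41> case can be reproduced with a deliberately flawed decoder and compared with Python's own UTF-8 decoder, which replaces only the stray lead byte. The flawed function is hypothetical and handles just ASCII and two-byte sequences.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed decoder: a two-byte lead byte always swallows the byte after it."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # Flaw: the next byte is skipped even when it was not a continuation byte.
            i += 2
        else:
            out.append(chr(byte) if byte < 0x80 else "\uFFFD")
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])

print(repr(over_consuming_decode(data)))
print(repr(data.decode("utf-8", errors="replace")))
```

In the flawed output the ">" (0x3E) has vanished, which is how a closing delimiter disappears from markup.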

Root Causes: Handling the Unexpected – Character-substitution

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error

• Use U+FFFD instead

– A common alternative is ‘’ which can be safe
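A sketch contrasting deletion with U+FFFD substitution for the U+FEFF example. The function names are hypothetical, and the third option, rejecting the input outright, is not shown.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"  # REPLACEMENT CHARACTER


def delete_bom(text: str) -> str:
    """Risky: silently removing a character can join the text around it."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Safer: the substituted character keeps the surrounding text apart."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\uFEFFipt>"
print(delete_bom(payload))
print("<script>" in replace_bom(payload))
```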

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: Filter evasion, Enable code execution


Page 59: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

bull Fonts render glyphs confusingly

bull Fonts render glyphs as empty white space

httpwwwgooglecom phreedomorg

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing Problematic Font-rendering

middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Multiple combining marks

o looks like U+006F U+0304

o is U+006F U+0304 U+0304

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches


Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Attack Vectors: Visual Spoofing, Problematic Font-rendering

··· is ··· (Arial Gothic)

A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  – ō looks like U+006F U+0304
  – ō is actually U+006F U+0304 U+0304

• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
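The ordering effect on the combining-marks slide can be reproduced with Python's standard `unicodedata` module. This is a small sketch, not the author's tooling; the helper `code_points` is a hypothetical name used only for display.

```python
import unicodedata

# Same base letter and the same two marks, in a different order.
a = "o\u0308\u0311"  # o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
b = "o\u0311\u0308"  # o + COMBINING INVERTED BREVE + COMBINING DIAERESIS


def code_points(s: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


nfkc_a = unicodedata.normalize("NFKC", a)
nfkc_b = unicodedata.normalize("NFKC", b)

print(code_points(nfkc_a))
print(code_points(nfkc_b))
print("equal after NFKC:", nfkc_a == nfkc_b)
```

Both marks have the same combining class, so normalization does not reorder them, and the two visually similar sequences stay distinct strings after NFKC.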

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
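One way to act on this warning is to scan untrusted strings for explicit directional controls before displaying them. The sketch below uses the bidi class reported by Python's `unicodedata.bidirectional`; the function name `find_bidi_controls` and the sample file name are assumptions made for illustration.

```python
import unicodedata

# Bidi classes for explicit embeddings and overrides (LRE, RLE, LRO, RLO, PDF)
# plus the isolate controls that later Unicode versions added.
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(text):
    """Return (index, code point) pairs for explicit bidi control characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI
    ]


# A file name that hides its real extension behind a right-to-left override.
print(find_bidi_controls("photo\u202Egpj.exe"))
```

A caller could reject the string, strip the controls, or show the raw code points, depending on where the text is displayed.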


Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
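The same "fail instead of best-fit" policy can be illustrated outside Windows. Python's built-in codecs do not perform best-fit mapping, so the sketch below is only an analogue of the WC_NO_BEST_FIT_CHARS and EncoderFallback guidance, not a demonstration of those APIs; `to_legacy` is a hypothetical helper name.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising on anything that has no exact mapping."""
    # errors="strict" refuses to substitute a look-alike character, which is
    # the behaviour the best-fit guidance asks for.
    return text.encode(charset, errors="strict")


if __name__ == "__main__":
    for sample in ("caf\u00e9", "\u03c3", "\u2032", "-\uff4doz-binding"):
        try:
            print(repr(sample), "->", to_legacy(sample))
        except UnicodeEncodeError as exc:
            print(repr(sample), "-> rejected:", exc.reason)
```

Rejecting unmappable input keeps a character such as U+FF4D from silently turning into an ASCII `m` after a filter has already inspected the string.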

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

-moz-binding() was not allowed, but…

-[U+FF4D]oz-binding() would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + combining dot above (U+0130 becomes U+0049 U+0307)

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes < (U+FE64 becomes U+003C)

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC: first defense against visual spoofing
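A minimal sketch of "normalize before validation", using Python's `unicodedata`. The filter `is_safe` is deliberately naive and hypothetical; it stands in for whatever validation an application really performs.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Hypothetical, deliberately naive filter: reject ASCII angle brackets."""
    return "<" not in text and ">" not in text


def validate(text: str) -> str:
    """Normalize to NFKC first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("markup characters present after normalization")
    return normalized


# U+FE64 and U+FE65 are the small less-than and greater-than signs.
payload = "\ufe64script\ufe65"

print(is_safe(payload))                          # raw string passes the filter
print(unicodedata.normalize("NFKC", payload))    # compatibility forms fold to ASCII
```

Calling `validate(payload)` raises, because the check runs on the same normalized string that later stages would see. Running the filter first and normalizing afterwards would let the payload through.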

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
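"Validate UTF-8 encoding (throw on error)" looks like this with a strict decoder. The sketch relies on Python 3's built-in UTF-8 codec, which rejects non-shortest forms; `decode_utf8_strict` is a hypothetical wrapper name.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8 and raise on ill-formed input, including overlong forms."""
    return raw.decode("utf-8", errors="strict")


if __name__ == "__main__":
    samples = [
        b"\x27",          # shortest form of U+0027 APOSTROPHE
        b"\xc0\xa7",      # two-byte overlong form of U+0027
        b"\xe0\x80\xa7",  # three-byte overlong form of U+0027
    ]
    for raw in samples:
        try:
            print(raw.hex(), "->", repr(decode_utf8_strict(raw)))
        except UnicodeDecodeError as exc:
            print(raw.hex(), "-> rejected:", exc.reason)
```

Only the one-byte form decodes. The overlong forms fail at the decoder, so no later layer gets the chance to reinterpret them as an apostrophe.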

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus, U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash, U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader, U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
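The normalization-based test cases above can be checked with `unicodedata`. This is a sketch under assumptions: `has_dot_dot` is a hypothetical helper, the paths are made up, and the best-fit case (U+2215) is shown only to confirm that NFKC leaves it alone, since best-fit conversion is a separate mechanism.

```python
import unicodedata


def has_dot_dot(path: str) -> bool:
    """Check for a '..' segment after NFKC normalization of the path."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.replace("\\", "/").split("/")


variants = {
    "FULLWIDTH SOLIDUS U+FF0F": "\uff0f",
    "TWO DOT LEADER U+2025": "\u2025",
    "DIVISION SLASH U+2215": "\u2215",
}

for name, ch in variants.items():
    # Show what each character turns into under NFKC.
    print(name, "->", repr(unicodedata.normalize("NFKC", ch)))

print(has_dot_dot("root/\u2025/system"))
print(has_dot_dot("root\uff0f..\uff0fsystem"))
```

A check that looks for `..` in the raw string misses both sample paths. The same check applied after normalization catches them.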

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’ which can be safe
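The difference between over-consuming, deleting, and replacing can be seen with the byte sequence from the over-consumption slide. The sketch uses Python 3's UTF-8 decoder, which substitutes U+FFFD for the bad lead byte without swallowing the byte after it; `reject_interior_bom` is a hypothetical helper name.

```python
raw = b"\x41\xc2\x3e\x41"  # 'A', a dangling lead byte, '>', 'A'

# Replacement keeps the '>' that an over-consuming decoder would swallow.
replaced = raw.decode("utf-8", errors="replace")
print([f"U+{ord(c):04X}" for c in replaced])

# Deleting a "harmless" character after validation can assemble a keyword.
filtered = "<scr\ufeffipt>"
print(filtered.replace("\ufeff", ""))


def reject_interior_bom(text: str) -> str:
    """Fail instead of deleting when U+FEFF appears anywhere but position 0."""
    if "\ufeff" in text[1:]:
        raise ValueError("U+FEFF found in the middle of the text")
    return text
```

Failing is the first option on the slide. Replacing with U+FFFD is the fallback, because it leaves a visible marker in place of the bad input, where deletion would join the surrounding characters.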

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: filter evasion, code execution
– Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
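A short sketch of why casing belongs before validation and why lengths cannot be assumed stable. It uses Python's built-in full case mappings, where U+0130 lowercases to `i` plus U+0307; other runtimes and locales may return a plain `i`, as on the slide. The helper `lower_then_validate` and its banned-word list are hypothetical.

```python
def lower_then_validate(text: str, banned=("script",)) -> str:
    """Apply the case mapping first, then check the result that will be used."""
    lowered = text.lower()
    for word in banned:
        if word in lowered:
            raise ValueError(f"banned token after casing: {word}")
    return lowered


# Case mappings can change the number of code points.
for ch in ("\u0130", "\u00df", "\u0390"):
    print(
        f"U+{ord(ch):04X}",
        "lower length:", len(ch.lower()),
        "upper length:", len(ch.upper()),
    )
```

Validating the lowered string, not the original, means the filter sees exactly what downstream code will see, whatever the runtime's mapping for characters such as İ turns out to be.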

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Casing - maximum expansion factors

Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
            16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation   UTF          Factor   Sample
NFC         8            3X       𝅘𝅥𝅮 U+1D160
            16, 32       3X       שּׁ U+FB2C
NFD         8            3X       ΐ U+0390
            16, 32       4X       ᾂ U+1F82
NFKC/NFKD   8            11X      ﷺ U+FDFA
            16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
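Two rows of the normalization table can be checked directly with `unicodedata`. The sketch measures growth in code points and in UTF-8 bytes; the helper name `expansion` is an assumption for illustration.

```python
import unicodedata


def expansion(ch: str, form: str):
    """Return (code points before, after, UTF-8 bytes before, after) for one form."""
    out = unicodedata.normalize(form, ch)
    return len(ch), len(out), len(ch.encode("utf-8")), len(out.encode("utf-8"))


# U+FDFA under NFKC: the 18X (code points) and 11X (UTF-8 bytes) rows.
print("U+FDFA NFKC:", expansion("\ufdfa", "NFKC"))

# U+1F82 under NFD: the 4X (code points) row.
print("U+1F82 NFD: ", expansion("\u1f82", "NFD"))
```

A buffer sized from the input length in characters or bytes can be too small by these factors once the string has been normalized, so the size has to be computed from the output.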

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
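One defensive reading of "be careful" is to allow only a short, explicit list of separators and flag everything else that a parser might treat as white space. The allowlist below is an assumption chosen for illustration, not a list taken from the slides or from a specification, and `find_suspicious_separators` is a hypothetical name.

```python
import unicodedata

# Assumed allowlist: the ASCII separators this application chooses to accept.
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# Separator categories plus format characters, which some parsers have
# treated as white space (U+180E has been classified as both over time).
SUSPICIOUS_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def find_suspicious_separators(text):
    """Return (index, code point) pairs for separator-like characters not on the allowlist."""
    hits = []
    for i, ch in enumerate(text):
        if ch in ALLOWED_WHITESPACE:
            continue
        if ch.isspace() or unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


print(find_suspicious_separators("href=#\u180eonclick=alert()"))
```

Checking by category instead of by a fixed code point list means the filter does not depend on which Unicode version the runtime ships, which matters for characters whose classification has changed.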



Page 62: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Combining Marks

bull Order of combining marksndash ȏ and ouml under NFKC

ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt

ltU+006F U+0311U+0308gt ltU+020F U+0308gt

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Windows best-fit pInvoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?

• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)

How can we best-fit the s character?

• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
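The guidance above targets .NET and Win32. As a rough Python analogue (an assumption of this sketch, not something shown in the talk), the `strict` error handler makes a conversion to a legacy charset fail instead of substituting a look-alike. The helper name `to_legacy_bytes` is hypothetical:

```python
def to_legacy_bytes(text: str, charset: str = "cp1252") -> bytes:
    """Encode text for a legacy charset, failing instead of guessing.

    errors="strict" raises UnicodeEncodeError for any character the
    charset cannot represent, so nothing is silently substituted.
    """
    return text.encode(charset, errors="strict")

print(to_legacy_bytes("sigma"))      # b'sigma'

for sample in ("\u03C3", "\u2032"):  # GREEK SMALL LETTER SIGMA, PRIME
    try:
        to_legacy_bytes(sample)
    except UnicodeEncodeError as exc:
        print(f"rejected U+{ord(sample):04X}: {exc.reason}")
```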

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
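The bypass can be simulated in a few lines of Python. The `BEST_FIT` table below is a hypothetical stand-in for a platform's best-fit mapping and covers only the fullwidth Latin small letters. The filter and its blocklist are likewise illustrative, not the site's actual logic:

```python
# Hypothetical best-fit table: a tiny stand-in for a platform's mapping,
# covering only the fullwidth Latin small letters U+FF41..U+FF5A.
BEST_FIT = {0xFF41 + i: ord("a") + i for i in range(26)}

def naive_filter(css: str) -> bool:
    """Return True if the input passes a blocklist check (unsafe design)."""
    return "-moz-binding" not in css.lower()

def best_fit(css: str) -> str:
    """Simulate a later best-fit conversion to a single-byte charset."""
    return css.translate(BEST_FIT)

payload = "-\uFF4Doz-binding()"
print(naive_filter(payload))            # True: the filter sees no match
print(best_fit(payload))                # -moz-binding()
print(naive_filter(best_fit(payload)))  # False: but the input was already accepted
```

The point is the ordering: the check runs on one form of the string and the consumer sees another.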

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + U+0307 (COMBINING DOT ABOVE)
U+0130 → U+0049 U+0307
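The four forms can be compared directly with Python's `unicodedata.normalize`. This is a small sketch, and the helper name `forms` is illustrative:

```python
import unicodedata

def forms(text: str) -> dict[str, list[str]]:
    """Map each normalization form to the resulting code points."""
    return {
        form: [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]
        for form in ("NFD", "NFC", "NFKD", "NFKC")
    }

for form, cps in forms("\u0130").items():
    print(form, cps)
# NFD ['U+0049', 'U+0307']
# NFC ['U+0130']
# NFKD ['U+0049', 'U+0307']
# NFKC ['U+0130']
```

Only the decomposing forms split İ. The composing forms put it back together, so which form an application applies determines what a later check sees.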

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC: first defense against visual spoofing
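A sketch of why the order matters, in Python. The toy validator `is_safe` and both wrapper functions are assumptions made for illustration, and a real filter would be more involved:

```python
import unicodedata

def is_safe(text: str) -> bool:
    """Toy validator: reject anything containing an angle bracket."""
    return "<" not in text and ">" not in text

def validate_then_normalize(text: str) -> str:
    """Unsafe order: the check runs on text that later changes."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text: str) -> str:
    """Safer order: the check runs on the final form of the text."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized

# SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
payload = "\uFE64script\uFE65"
print(validate_then_normalize(payload))  # <script>

try:
    normalize_then_validate(payload)
except ValueError as exc:
    print(exc)                           # rejected
```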

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
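To make the C0 A7 case concrete, here is a Python sketch. `lenient_decode_2byte` is a deliberately buggy, hypothetical decoder that omits the shortest-form check, and `strict_decode` shows the "throw on error" behavior using Python's built-in UTF-8 codec:

```python
def lenient_decode_2byte(lead: int, trail: int) -> str:
    """Decode a two-byte UTF-8 sequence WITHOUT the shortest-form check.

    Illustrates the bug only; a conforming decoder must reject lead
    bytes C0 and C1 because they can only encode values below U+0080.
    """
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

print(repr(lenient_decode_2byte(0xC0, 0xA7)))  # "'"

def strict_decode(data: bytes) -> str:
    """Validate UTF-8 and throw on error, as the guidance recommends."""
    return data.decode("utf-8", errors="strict")

try:
    strict_decode(b"\xC0\xA7")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```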

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
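The byte sequences in these test cases can be checked with a short Python sketch (the helper name `utf8_hex` is illustrative). It prints the UTF-8 bytes of each character and what NFKC turns it into. As a side note, under a lenient decoder the bytes C0 AE yield U+002E (full stop), while the overlong form of U+002F would be C0 AF.

```python
import unicodedata

def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as uppercase hex."""
    return ch.encode("utf-8").hex().upper()

for name, ch in [
    ("FULLWIDTH SOLIDUS", "\uFF0F"),
    ("DIVISION SLASH", "\u2215"),
    ("TWO DOT LEADER", "\u2025"),
]:
    nfkc = unicodedata.normalize("NFKC", ch)
    print(f"U+{ord(ch):04X} {name}: UTF-8 {utf8_hex(ch)}, NFKC {nfkc!r}")
# U+FF0F FULLWIDTH SOLIDUS: UTF-8 EFBC8F, NFKC '/'
# U+2215 DIVISION SLASH: UTF-8 E28895, NFKC '∕'
# U+2025 TWO DOT LEADER: UTF-8 E280A5, NFKC '..'
```

DIVISION SLASH is unchanged by NFKC, which is consistent with the slide listing it under best-fit rather than normalization.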

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
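A sketch of the failure mode in Python. `over_consuming_decode` is a hypothetical buggy decoder written only to reproduce the <41 C2 3E 41> example. The last line contrasts it with Python's built-in codec, which replaces only the offending byte:

```python
def over_consuming_decode(data: bytes) -> str:
    """Buggy decoder sketch: trusts a two-byte lead and swallows the next
    byte without checking that it is a valid trail byte (0x80..0xBF).
    Other bytes are passed through one-to-one for brevity."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:  # two-byte lead
            i += 2             # BUG: skips the lead AND the unchecked next byte
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)

raw = bytes([0x41, 0xC2, 0x3E, 0x41])
print(over_consuming_decode(raw))             # AA
print(raw.decode("utf-8", errors="replace"))  # A, U+FFFD, then >A: the '>' survives
```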

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
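Deletion versus replacement, sketched in Python. The toy filter `blocklist_ok` and the two handlers are illustrative names, not code from the talk:

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"

def blocklist_ok(markup: str) -> bool:
    """Toy filter: pass anything that does not contain '<script'."""
    return "<script" not in markup.lower()

def delete_bom(markup: str) -> str:
    """Unsafe handling: silently drop U+FEFF."""
    return markup.replace(BOM, "")

def replace_bom(markup: str) -> str:
    """Safer handling: keep a visible marker where U+FEFF was."""
    return markup.replace(BOM, REPLACEMENT)

payload = "<scr\uFEFFipt>"
print(blocklist_ok(payload))              # True: the filter is satisfied
print(delete_bom(payload))                # <script>
print("<script" in replace_bom(payload))  # False: the tag never re-forms
```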

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
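The length change is easy to observe in Python, whose `str.lower()` and `str.upper()` apply Unicode full case mappings. Exact results are implementation-dependent: Python lowercases İ to i followed by U+0307, whereas the slide shows a toLower that yields a plain i. The helper name `case_lengths` is illustrative:

```python
samples = {
    "LATIN CAPITAL LETTER I WITH DOT ABOVE": "\u0130",
    "LATIN SMALL LETTER SHARP S": "\u00DF",
    "GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS": "\u0390",
}

def case_lengths(ch: str) -> tuple[int, int, int]:
    """Return (len(ch), len(ch.lower()), len(ch.upper())) in code points."""
    return len(ch), len(ch.lower()), len(ch.upper())

for name, ch in samples.items():
    print(f"U+{ord(ch):04X} {name}: {case_lengths(ch)}")
# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE: (1, 2, 1)
# U+00DF LATIN SMALL LETTER SHARP S: (1, 1, 2)
# U+0390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS: (1, 1, 3)
```

Any buffer sized from the input length, or any validation done before the case change, is working from stale information.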

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing - maximum expansion factors

Operation    UTF          Factor   Sample
Lower        8            1.5X     Ⱥ U+023A
Lower        16, 32       1X       A U+0041
Upper        8, 16, 32    3X       ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       U+1D160 (musical symbol eighth note)
NFC          16, 32       3X       U+FB2C (Hebrew letter shin with dagesh and shin dot)
NFD          8            3X       ΐ U+0390
NFD          16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
NFKC/NFKD    16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
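Several of these factors can be reproduced with a short Python sketch. The helper names `utf8_factor` and `utf16_factor` are illustrative, and results assume the Unicode data shipped with a current Python 3:

```python
import unicodedata

def utf8_factor(before: str, after: str) -> float:
    """Growth in UTF-8 bytes from before to after."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))

def utf16_factor(before: str, after: str) -> float:
    """Growth in UTF-16 code units from before to after."""
    def units(s: str) -> int:
        return len(s.encode("utf-16-le")) // 2
    return units(after) / units(before)

lower = "\u023A"
print(utf8_factor(lower, lower.lower()))  # 1.5

upper = "\u0390"
print(utf8_factor(upper, upper.upper()))  # 3.0

ligature = "\uFDFA"
nfkd = unicodedata.normalize("NFKD", ligature)
print(utf8_factor(ligature, nfkd))        # 11.0
print(utf16_factor(ligature, nfkd))       # 18.0
```

Sizing an output buffer from the input length alone is therefore unsafe. Measure the transformed string, or allocate for the worst case.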

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS (Mongolian Vowel Separator) U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
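One defensive option is to flag characters that some parser might treat as a separator, sketched here in Python. The allowed set and the function name `suspicious_separators` are assumptions made for illustration. The check uses Unicode general categories, under which U+180E has been classified as either a space separator or a format character depending on the Unicode version:

```python
import unicodedata

# The whitespace characters this sketch accepts between attributes
# (an assumption for illustration: space, tab, LF, FF, CR).
ALLOWED_SPACE = set(" \t\n\f\r")

def suspicious_separators(markup: str) -> list[str]:
    """List characters that may act as separators in some parsers:
    Unicode space/line/paragraph separators and format characters
    outside the allowed ASCII set."""
    return [
        f"U+{ord(ch):04X}"
        for ch in markup
        if ch not in ALLOWED_SPACE
        and unicodedata.category(ch) in {"Zs", "Zl", "Zp", "Cf"}
    ]

print(suspicious_separators("<a href=\u180Eonclick=alert()>"))  # ['U+180E']
print(suspicious_separators("<a href=x onclick=alert()>"))      # []
```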

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
    • This could have been a problem
      – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
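A small Python sketch of why a mismatch matters: the same two bytes mean different things under the two charsets named above, and a strict UTF-8 decode refuses them outright. The byte pair and the helper name `decode_utf8_or_fail` are chosen for illustration:

```python
raw = b"\x83\x5C"

print(raw.decode("shift_jis"))         # ソ (U+30BD, one character)
print(repr(raw.decode("iso-8859-1")))  # '\x83\\' (a control byte plus a backslash)

def decode_utf8_or_fail(data: bytes) -> str:
    """Force UTF-8 and raise UnicodeDecodeError rather than guess."""
    return data.decode("utf-8", errors="strict")

try:
    decode_utf8_or_fail(raw)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A filter that reads the input as ISO-8859-1 sees a backslash that a Shift_JIS consumer never sees, and vice versa.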

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Page 64: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

bull httpunicodeorgreportstr9

ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo

ndash forbidden in IDNA

U+202D (LEFT-TO-RIGHT OVERRIDE)

U+202E (RIGHT-TO-LEFT OVERRIDE)

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected: Over-consumption

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected: Character-deletion
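The character-deletion case can be sketched in a few lines of Python. `blocklist_ok` is a hypothetical filter invented for this illustration; the point is only that a check run before a later component deletes U+FEFF sees a different string than the one finally interpreted.

```python
BOM = "\uFEFF"

def blocklist_ok(markup: str) -> bool:
    """Hypothetical filter: rejects input containing the literal '<script'."""
    return "<script" not in markup.lower()

payload = f"<scr{BOM}ipt>"

# The filter sees no '<script' because U+FEFF splits the tag name.
print(blocklist_ok(payload))             # True: passes the filter

# A later component that silently deletes U+FEFF rebuilds the tag.
after_deletion = payload.replace(BOM, "")
print(after_deletion)                    # <script>
print(blocklist_ok(after_deletion))      # False: would have been rejected
```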

• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
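Both options above map directly onto the error handlers of Python's built-in codecs. The sketch below is one possible way to apply them; the helper names `decode_strict` and `decode_replace` are assumptions made for this example.

```python
def decode_strict(data: bytes) -> str:
    """Fail on any ill-formed UTF-8 instead of repairing it."""
    return data.decode("utf-8", errors="strict")

def decode_replace(data: bytes) -> str:
    """Substitute U+FFFD for each ill-formed sequence; never delete."""
    return data.decode("utf-8", errors="replace")

bad = b"<scr\xC2ipt>"
try:
    decode_strict(bad)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)

# The replacement character keeps the tag name broken: '<scr\ufffdipt>'
print(decode_replace(bad))
```

Because U+FFFD stays in the string, the ill-formed input cannot collapse into `<script>` the way it does when the offending byte is deleted.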

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
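The length change is easy to observe in Python, whose `str.lower()` and `str.upper()` apply full Unicode case mappings. This is a small sketch; the sample characters are chosen for illustration, and other runtimes may map them differently.

```python
samples = ["\u0130", "\u00DF", "\u0390"]   # İ, ß, ΐ

for s in samples:
    lower, upper = s.lower(), s.upper()
    print(f"U+{ord(s):04X}: len={len(s)} "
          f"lower len={len(lower)} upper len={len(upper)}")
```

Note that in Python `"İ".lower()` yields `i` followed by U+0307 (two code points) rather than a bare `i`, so the exact result of the `toLower` examples above depends on the implementation and locale in use.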

• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Guidance for Casing
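A minimal sketch of why the order matters, in Python. The example uses U+212A KELVIN SIGN, which lowercases to ASCII `k`; this character and the `is_blocked` validator are illustrative choices and do not come from the talk.

```python
KELVIN = "\u212A"   # KELVIN SIGN, lowercases to ASCII 'k'

def is_blocked(attr_name: str) -> bool:
    """Hypothetical validator: blocks the 'onclick' attribute name."""
    return attr_name == "onclick"

raw = "onclic" + KELVIN

# Wrong order: validate, then lowercase. The check passes, yet the
# lowercased output is exactly the blocked name.
print(is_blocked(raw), raw.lower())

# Right order: lowercase first, then validate what will actually be used.
print(is_blocked(raw.lower()))
```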

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
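The normalization factors in the table can be reproduced with Python's `unicodedata` module. This is a sketch; the `expansion` helper is a name introduced here, and it measures size in bytes of the chosen encoding before and after normalization.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

print(expansion("\U0001D160", "NFC", "utf-8"))     # 3.0
print(expansion("\uFB2C", "NFC", "utf-16-le"))     # 3.0
print(expansion("\u0390", "NFD", "utf-8"))         # 3.0
print(expansion("\u1F82", "NFD", "utf-16-le"))     # 4.0
print(expansion("\uFDFA", "NFKC", "utf-8"))        # 11.0
print(expansion("\uFDFA", "NFKC", "utf-16-le"))    # 18.0
```

A buffer sized from the input length alone is therefore too small in the worst case; sizing from the normalized result avoids the guesswork.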

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Guidance for Buffer Overflows
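As a quick illustration of the bytes-versus-chars distinction, the Python sketch below counts the same string three ways. The sample string is arbitrary and chosen only because it mixes ASCII, a Latin-1 letter, and a character outside the BMP.

```python
s = "na\u00EFve \U0001D160"   # 'naïve' plus a musical symbol outside the BMP

code_points = len(s)                            # what Python's len() counts
utf8_bytes = len(s.encode("utf-8"))             # storage size in UTF-8
utf16_units = len(s.encode("utf-16-le")) // 2   # 16-bit code units

print(code_points, utf8_bytes, utf16_units)     # 7 11 8
```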

• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Case Study: Opera

MVS (Mongolian Vowel Separator): U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax
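One way to be careful is to flag characters that a parser might treat as white space or silently ignore. The sketch below is an assumption-laden starting point, not a complete defense: `suspicious_chars` and the category set are choices made for this example. U+180E has been classified as a space separator (Zs) in older Unicode versions and as a format character (Cf) in newer ones, so both categories are included.

```python
import unicodedata

SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}   # separators and format characters

def suspicious_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for non-ASCII separator or format characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(suspicious_chars("<a href=\u180Eonclick=alert()>"))
```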

1) Character stability
– IDNA/Nameprep based on Unicode 3.2

2) Designs
– Specs are carefully designed but not always perfect

• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
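The "force UTF-8, error if uncertain" guidance could be enforced with a check along these lines. This is a hedged sketch: `declared_charset` and `check_charsets` are hypothetical helpers, and the policy of requiring both declarations to say UTF-8 is an assumption made for the example.

```python
import re
from typing import Optional

def declared_charset(value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type style value."""
    m = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", value, re.I)
    return m.group(1).lower() if m else None

def check_charsets(http_header: str, meta_content: str) -> str:
    """Hypothetical policy: both declarations must be present and say UTF-8."""
    found = {declared_charset(http_header), declared_charset(meta_content)}
    if found != {"utf-8"}:
        raise ValueError(f"refusing ambiguous charset declarations: {found}")
    return "utf-8"

print(check_charsets("text/html; charset=UTF-8", "text/html; charset=utf-8"))
try:
    check_charsets("charset=ISO-8859-1", "text/html; charset=shift_jis")
except ValueError as exc:
    print(exc)
```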

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda


• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security

www.casabasecurity.com


Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function

How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket

How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
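The same "fail instead of best-fit" idea can be sketched in Python, whose built-in codecs raise under the `strict` error handler rather than substituting a lookalike. The `to_legacy` helper and the choice of `cp1252` are assumptions for this example; it is an analogue of the guidance above, not the .NET or Win32 API itself.

```python
def to_legacy(text: str, codec: str = "cp1252") -> bytes:
    """Encode for a legacy consumer, raising instead of substituting lookalikes."""
    return text.encode(codec, errors="strict")

for sample in ("caf\u00E9", "\u03C3", "\u2032", "-\uFF4Doz-binding()"):
    try:
        print(repr(sample), "->", to_legacy(sample))
    except UnicodeEncodeError as exc:
        print(repr(sample), "rejected:", exc.reason)
```

Only the first sample encodes; the sigma, the prime, and the fullwidth `m` are rejected instead of being folded to `s`, `'`, and `m`.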

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking (Best-fit mappings)

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
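The four forms can be compared side by side with Python's `unicodedata.normalize`. The sample string is an illustrative choice: one character with a canonical decomposition (U+0130) and one with only a compatibility decomposition (U+FE64).

```python
import unicodedata

sample = "\u0130\uFE64"   # İ followed by ﹤ (SMALL LESS-THAN SIGN)

for form in ("NFD", "NFC", "NFKD", "NFKC"):
    result = unicodedata.normalize(form, sample)
    print(form, [f"U+{ord(c):04X}" for c in result])
```

Only the compatibility forms (NFKD, NFKC) turn U+FE64 into U+003C; the canonical forms leave it untouched.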

İ (U+0130) becomes I (U+0049) + ◌̇ (U+0307)

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

Root Causes: Normalization

﹤ (U+FE64) becomes < (U+003C)

toNFKC(“﹤script>”) = “<script>”

Root Causes: Normalization

Normalize strings before validation

NFKC first defense against Visual spoofing

Root Causes: Guidance for Normalization
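A minimal sketch of normalize-then-validate in Python. The `validate` rule (reject any `<`) and the `accept` wrapper are hypothetical and stand in for whatever validation an application actually performs.

```python
import unicodedata

def validate(text: str) -> bool:
    """Hypothetical check: reject any '<' in user input."""
    return "<" not in text

def accept(text: str) -> str:
    """Normalize to NFKC first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    if not validate(normalized):
        raise ValueError("markup not allowed")
    return normalized

payload = "\uFE64script>"
print(validate(payload))     # True: the raw string contains no '<'
try:
    accept(payload)
except ValueError as exc:
    print("rejected:", exc)  # NFKC produced '<script>' before the check ran
```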

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
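The validation step can be sketched in Python, whose strict UTF-8 decoder rejects non-shortest forms. `lenient_two_byte` is a hypothetical helper showing the arithmetic a decoder performs if it skips the shortest-form check; `is_valid_utf8` is likewise a name introduced for this example.

```python
def lenient_two_byte(lead: int, trail: int) -> int:
    """What a decoder computes if it skips the shortest-form check (illustrative)."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)

def is_valid_utf8(data: bytes) -> bool:
    """Strict validation: any ill-formed or overlong sequence fails."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

print(hex(lenient_two_byte(0xC0, 0xA7)))   # 0x27, the apostrophe
print(is_valid_utf8(b"\x27"))              # True
print(is_valid_utf8(b"\xC0\xA7"))          # False
```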


Page 66: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

March 2009 copy 2009 Chris Weber

Net runtime will marshall a string as LPStr to a pinvoke function

How can we best-fit the lt character

bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket

How can we best-fit the s character

bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex

To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]

wwwcasabasecuritycom

Windows best-fit pInvokeBest-fit mappings

March 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera


Windows best-fit P/Invoke (best-fit mappings)

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?

• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?

• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit Mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
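The guidance above comes down to failing on characters that have no exact mapping instead of silently substituting a look-alike. The sketch below is a Python analogue of that idea, not the .NET or Win32 API named on the slide; the helper name `to_legacy_bytes` and the choice of the cp1252 code page are assumptions made for illustration.

```python
def to_legacy_bytes(text: str, codec: str = "cp1252") -> bytes:
    """Encode to a legacy code page, refusing anything without an exact mapping."""
    # errors="strict" raises UnicodeEncodeError instead of substituting a
    # similar-looking character, which is the behaviour the guidance asks for.
    return text.encode(codec, errors="strict")


# Plain ASCII input converts without loss.
safe = to_legacy_bytes("-moz-binding()")

# U+FF4D (fullwidth small m) has no exact cp1252 mapping, so it is rejected
# rather than being turned into an ASCII "m".
try:
    to_legacy_bytes("-\uff4doz-binding()")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

A conversion that rejects the fullwidth character cannot be used to rebuild a blocked keyword after the filter has already run.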

Case Study: Social Networking (best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map to it.

Root Causes: Normalization

Normalizing strings after validation is dangerous.

Impact: Filter evasion, enable code execution

• NFD: Decompose (canonical)
• NFC: Decompose (canonical), then recompose
• NFKD: Decompose (compatibility)
• NFKC: Decompose (compatibility), then recompose

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing.

﹤ (U+FE64) becomes < (U+003C)

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
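The ordering problem can be sketched in a few lines of Python using the standard `unicodedata` module. The filter `contains_markup` and the two pipeline functions are hypothetical names invented for this illustration; the payload uses U+FE64 and U+FE65, the small less-than and greater-than signs, which NFKC maps to the ASCII angle brackets.

```python
import unicodedata


def contains_markup(s: str) -> bool:
    """Toy validation step: reject ASCII angle brackets."""
    return "<" in s or ">" in s


def validate_then_normalize(s: str) -> str:
    """Unsafe order: the filter sees the text before NFKC rewrites it."""
    if contains_markup(s):
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", s)


def normalize_then_validate(s: str) -> str:
    """Safe order: the filter sees exactly what downstream code will see."""
    normalized = unicodedata.normalize("NFKC", s)
    if contains_markup(normalized):
        raise ValueError("markup rejected")
    return normalized


payload = "\ufe64script\ufe65"

# The unsafe order lets the payload through and then turns it into a tag.
leaked = validate_then_normalize(payload)

# The safe order rejects the same input.
try:
    normalize_then_validate(payload)
    blocked = False
except ValueError:
    blocked = True
```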

Root Causes: Non-shortest form UTF-8

Non-shortest (overlong) UTF-8

Impact: Filter evasion, enable code execution

Application gets: C0 A7

OS/Framework sees: 27

Database gets: '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
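To make the C0 A7 example concrete, the sketch below contrasts a deliberately flawed two-byte decoder, which skips the shortest-form check, with Python's built-in strict UTF-8 decoder. The function `naive_decode_2byte` is a hypothetical stand-in for a lenient component, written only to show the arithmetic.

```python
def naive_decode_2byte(seq: bytes) -> str:
    """Deliberately flawed: decodes a 2-byte sequence with no shortest-form check."""
    lead, trail = seq
    # Take 5 payload bits from the lead byte and 6 from the trail byte.
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = bytes([0xC0, 0xA7])

# The lenient decoder turns the overlong pair into U+0027, the apostrophe.
lenient_result = naive_decode_2byte(overlong)

# A conforming decoder treats 0xC0 as an invalid lead byte and raises.
try:
    overlong.decode("utf-8", errors="strict")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

If the filter sits in front of the lenient component, it inspects C0 A7 while the back end receives an apostrophe, which is why validating the encoding and throwing on error is the recommended fix.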

Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
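The byte sequences in these test cases can be checked with a short Python sketch. It decodes each sequence as UTF-8 and applies NFKC, which is one of the transformations the slide names; the dictionary keys and the helper `nfkc_of` are labels chosen for this illustration. Note that the division slash has no compatibility decomposition, so it only becomes a path separator where a best-fit code page conversion is involved.

```python
import unicodedata

# Byte sequences taken from the test cases above.
TEST_BYTES = {
    "fullwidth solidus U+FF0F": bytes.fromhex("EFBC8F"),
    "division slash U+2215": bytes.fromhex("E28895"),
    "two dot leader U+2025": bytes.fromhex("E280A5"),
}


def nfkc_of(raw: bytes) -> str:
    """Decode well-formed UTF-8, then apply compatibility normalization."""
    return unicodedata.normalize("NFKC", raw.decode("utf-8"))


results = {name: nfkc_of(raw) for name, raw in TEST_BYTES.items()}

for name, value in results.items():
    print(name, "->", [f"U+{ord(ch):04X}" for ch in value])
```

A path filter that runs before normalization would see none of the ASCII characters it is looking for in the first and third cases.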

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected (over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
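The byte example above can be reproduced with a deliberately flawed decoder. In this sketch, `overconsuming_decode` is a hypothetical function that handles only ASCII and two-byte sequences and contains the bug on purpose: a lead byte always swallows the byte after it. Python's built-in decoder with `errors="replace"` is shown for comparison, since it replaces only the ill-formed lead byte.

```python
def overconsuming_decode(data: bytes) -> str:
    """Deliberately flawed: a 2-byte lead byte always consumes the next byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF and i + 1 < len(data):
            nxt = data[i + 1]
            if 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # Bug: an ill-formed pair is dropped whole, including the
            # innocent byte that followed the lead byte.
            i += 2
        else:
            out.append(chr(b) if b < 0x80 else "\ufffd")
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])

# The flawed decoder eats 0x3E (">") along with the stray lead byte.
flawed = overconsuming_decode(data)

# The built-in decoder substitutes U+FFFD for 0xC2 and keeps ">".
careful = data.decode("utf-8", errors="replace")
```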

Root Causes: Handling the Unexpected (character substitution)

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (character deletion)

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’, which can be safe
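The difference between deleting an unexpected character and replacing it with U+FFFD can be sketched as follows. The blocklist check `is_blocked` and the two clean-up helpers are hypothetical names for this illustration and do not represent any particular browser or filter.

```python
def is_blocked(s: str) -> bool:
    """Toy filter: reject anything containing a script tag opener."""
    return "<script" in s.lower()


def delete_bom(s: str) -> str:
    """Unsafe clean-up: silently removes U+FEFF, joining the text around it."""
    return s.replace("\ufeff", "")


def replace_bom(s: str) -> str:
    """Safer clean-up: substitutes U+FFFD so the surrounding text stays split."""
    return s.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"

passed_filter = not is_blocked(payload)   # the filter sees no script tag
after_delete = delete_bom(payload)        # deletion reassembles the tag
after_replace = replace_bom(payload)      # replacement keeps it broken
```

Failing outright remains the first recommendation; the replacement approach is the fallback for cases where processing has to continue.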

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption

– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM (U+FEFF)

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))
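The length claim is easy to check in Python, with one caveat: Python applies the full Unicode case mapping, so lowercasing U+0130 yields "i" followed by U+0307 rather than the bare "i" shown on the slide. Other frameworks and locales behave differently, so treat this as a sketch of the size change only.

```python
dotted_capital_i = "\u0130"          # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_capital_i.lower()   # "i" followed by U+0307 in Python

# Code point counts before and after the casing operation.
count_before = len(dotted_capital_i)
count_after = len(lowered)

# UTF-8 byte counts, which are what a fixed-size buffer actually has to hold.
bytes_before = len(dotted_capital_i.encode("utf-8"))
bytes_after = len(lowered.encode("utf-8"))

# Uppercasing can grow a string too: sharp s becomes two letters.
sharp_s_upper = "\u00df".upper()
```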

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation    UTF          Factor   Sample
Lower        8            1.5      U+023A (Ⱥ)
             16, 32       1        U+0041 (A)
Upper        8, 16, 32    3        U+0390 (ΐ)

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       U+1D160 (𝅘𝅥𝅮)
             16, 32       3X       U+FB2C (שּׁ)
NFD          8            3X       U+0390 (ΐ)
             16, 32       4X       U+1F82 (ᾂ)
NFKC/NFKD    8            11X      U+FDFA (ﷺ)
             16, 32       18X      U+FDFA (ﷺ)

Source: Unicode Technical Report 36
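Several of the factors in these tables can be reproduced with Python's standard library. The helper `expansion` below is a name made up for this sketch; it compares encoded byte lengths before and after an operation, using UTF-16-LE so that no byte order mark is counted.

```python
import unicodedata


def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes after and before a casing or normalization step."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


# Lowercasing U+023A in UTF-8: 2 bytes become 3.
lower_utf8 = expansion("\u023a", "\u023a".lower(), "utf-8")

# Uppercasing U+0390 produces three code points.
upper_utf16 = expansion("\u0390", "\u0390".upper(), "utf-16-le")

# NFD of U+1F82 produces four code points.
nfd_utf16 = expansion("\u1f82", unicodedata.normalize("NFD", "\u1f82"), "utf-16-le")

# NFKC of U+FDFA produces eighteen code points.
ligature = "\ufdfa"
expanded = unicodedata.normalize("NFKC", ligature)
nfkc_utf8 = expansion(ligature, expanded, "utf-8")
nfkc_utf16 = expansion(ligature, expanded, "utf-16-le")
```

Sizing an output buffer from the input length alone is therefore unsafe for any of these operations; the worst-case factor for the chosen encoding has to be applied, or the buffer has to grow on demand.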

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
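One way to act on "be careful" is to flag any separator, format, or control character that is not on an explicit allowlist before the text reaches a parser. The sketch below is an assumption-laden illustration: the allowlist, the category set, and the function name `suspicious_characters` are choices made here, not something the slides prescribe. Checking by general category keeps the result stable for U+180E, which has been classified both as a space separator and as a format character in different Unicode versions.

```python
import unicodedata

# White space this sketch is willing to accept in markup (an assumption).
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# General categories for separators (Z*), format (Cf) and control (Cc) characters.
SUSPICIOUS_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def suspicious_characters(s: str) -> list:
    """Return the code points that might act as syntax without being visible."""
    return [
        f"U+{ord(ch):04X}"
        for ch in s
        if ch not in ALLOWED_WHITESPACE
        and unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
    ]


flagged_mvs = suspicious_characters("<a href=\u180eonclick=alert()>")
flagged_bom = suspicious_characters("java\ufeffscript")
flagged_plain = suspicious_characters("<a href=x onclick=y>")
```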

Root Causes: Specifications

1) Character stability
   – IDNA/Nameprep is based on Unicode 3.2
2) Designs
   – Specs are carefully designed, but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
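A minimal sketch of "force UTF-8, error if uncertain" is shown below. The function `decode_body` and its parameters are hypothetical; a real application would take the declared charset from its own request or document handling code.

```python
from typing import Optional


def decode_body(raw: bytes, declared_charset: Optional[str]) -> str:
    """Accept only input that is declared as UTF-8 and is well-formed UTF-8."""
    if declared_charset is None or declared_charset.strip().lower() not in ("utf-8", "utf8"):
        # No sniffing and no guessing: anything else is an error.
        raise ValueError("charset missing or not UTF-8; refusing to guess")
    # errors="strict" raises UnicodeDecodeError on ill-formed sequences.
    return raw.decode("utf-8", errors="strict")


accepted = decode_body("caf\u00e9".encode("utf-8"), "UTF-8")


def is_rejected(raw: bytes, charset: Optional[str]) -> bool:
    """Helper for the examples: True when decode_body refuses the input."""
    try:
        decode_body(raw, charset)
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        return True
    return False


mismatch_rejected = is_rejected(b"\x82\xa0", "shift_jis")
undeclared_rejected = is_rejected(b"abc", None)
illformed_rejected = is_rejected(b"\x82\xa0", "utf-8")
```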

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher (some of the passive checks included)

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (web-app security testing and auditing)

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Page 69: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

March 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

March 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

March 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM (U+FEFF)

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
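The guidance to case-map before validating can be illustrated in Python. This is a sketch under assumptions: the function names are hypothetical, the substring test stands in for a real filter, and Python's default (locale-independent) case mapping is used. Under that mapping U+0130 lowercases to "i" plus U+0307 rather than to a bare "i", so the example uses U+017F (long s), which uppercases to "S".

```python
LONG_S = "\u017f"            # LATIN SMALL LETTER LONG S, uppercases to "S"
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def validate_then_upper(value: str) -> str:
    """Hypothetical flawed order: the check runs before the case mapping."""
    if "script" in value.lower():
        raise ValueError("script rejected")
    return value.upper()


def upper_then_validate(value: str) -> str:
    """Order recommended on the slide: case-map first, then validate the result."""
    mapped = value.upper()
    if "SCRIPT" in mapped:
        raise ValueError("script rejected")
    return mapped


payload = "<" + LONG_S + "cript>"
print(validate_then_upper(payload))  # the prohibited token appears after the check

# Case mapping can also change the length of a string.
print(len(DOTTED_CAPITAL_I), len(DOTTED_CAPITAL_I.lower()))
```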

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     U+023A Ⱥ
            16, 32      1X       U+0041 A
Upper       8, 16, 32   3X       U+0390 ΐ

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160
            16, 32      3X       U+FB2C שּׁ
NFD         8           3X       U+0390 ΐ
            16, 32      4X       U+1F82 ᾂ
NFKC/NFKD   8           11X      U+FDFA ﷺ
            16, 32      18X      U+FDFA ﷺ

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
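Several of the expansion factors in the tables can be reproduced with Python's `unicodedata` module and built-in case mapping. The helper name `expansion_factor` is hypothetical. Sizes are measured in encoded bytes, and `utf-16-le` is used so that no byte order mark is counted.

```python
import unicodedata


def expansion_factor(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes (in bytes) after and before a transformation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


samples = [
    ("lower U+023A", "\u023a", "\u023a".lower()),
    ("upper U+0390", "\u0390", "\u0390".upper()),
    ("NFD   U+1F82", "\u1f82", unicodedata.normalize("NFD", "\u1f82")),
    ("NFKC  U+FDFA", "\ufdfa", unicodedata.normalize("NFKC", "\ufdfa")),
]

for label, before, after in samples:
    utf8 = expansion_factor(before, after, "utf-8")
    utf16 = expansion_factor(before, after, "utf-16-le")
    print(f"{label}: UTF-8 x{utf8:g}, UTF-16 x{utf16:g}")
```

A buffer sized from the input length alone is too small whenever one of these factors exceeds 1, which is why the size has to be computed from the transformed string.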

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting “white space”
  - A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
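One way to act on this guidance is to flag any separator-like or format character that the markup specification does not list as white space. The sketch below is an assumption, not the presenter's tool: the function name is hypothetical, the allowed set is the four HTML 4.01 white space characters plus CR and LF, and the choice of Unicode general categories to flag is a judgment call.

```python
import unicodedata

# White space characters named by HTML 4.01: space, tab, form feed, zero-width space.
HTML401_WHITE_SPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}
LINE_BREAKS = {"\r", "\n"}
# Separator, format, and control categories (an assumption about what is worth flagging).
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def find_unexpected_separators(markup: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for characters that may act as white space."""
    hits = []
    for index, ch in enumerate(markup):
        if ch in HTML401_WHITE_SPACE or ch in LINE_BREAKS:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(find_unexpected_separators("<a href=\u180eonclick=alert()>"))
```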

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
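When a conversion to a legacy charset cannot be avoided, failing on unmappable characters is safer than silently dropping or substituting them. The sketch below assumes Python's standard codecs and a hypothetical helper name `convert_strict`. Python does not implement platform best-fit tables, so it only contrasts strict conversion with the lossy `ignore` and `replace` modes.

```python
def convert_strict(text: str, charset: str) -> bytes:
    """Convert to a legacy charset, raising instead of guessing at a mapping."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(f"U+{ord(bad):04X} has no mapping in {charset}") from exc


print(convert_strict("caf\u00e9", "cp1252"))

# Lossy modes change the data without any signal to the caller.
print("a\u2215b".encode("cp1252", errors="ignore"))   # the division slash is deleted
print("a\u2215b".encode("cp1252", errors="replace"))  # the division slash becomes "?"
```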

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Example of a mismatch between the HTTP header and the document:

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
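The "error if uncertain" guidance can be sketched as a check that the charset in the HTTP header and the one declared in the markup agree. This is an illustration only: the function names and the regular expression are assumptions, and a production parser would need to handle far more syntax than this pattern does. `codecs.lookup` is used to map aliases such as "utf8" and "UTF-8" to one canonical name.

```python
import codecs
import re
from typing import Optional

_CHARSET = re.compile(r"charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)


def declared_charset(text: str) -> Optional[str]:
    """Return the canonical codec name of the first charset declaration, if any."""
    match = _CHARSET.search(text)
    if match is None:
        return None
    try:
        return codecs.lookup(match.group(1)).name
    except LookupError:
        raise ValueError(f"unknown charset {match.group(1)!r}") from None


def agreed_charset(content_type_header: str, markup: str) -> str:
    """Raise unless the header declares a charset and the markup does not contradict it."""
    header = declared_charset(content_type_header)
    meta = declared_charset(markup)
    if header is None:
        raise ValueError("no charset in Content-Type header")
    if meta is not None and meta != header:
        raise ValueError(f"charset mismatch: header {header}, markup {meta}")
    return header


print(agreed_charset("text/html; charset=UTF-8", '<meta charset="utf8">'))
```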


Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Case Study: Social Networking (Best-fit mappings)

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
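The effect of ordering can be shown with Python's `unicodedata.normalize`. The two function names are hypothetical, and the check for "<" is a stand-in for whatever validation an application performs.

```python
import unicodedata


def validate_then_normalize(value: str) -> str:
    """Hypothetical flawed order: the check sees the string before NFKC is applied."""
    if "<" in value:
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", value)


def normalize_then_validate(value: str) -> str:
    """Order recommended on the slide: normalize first, then validate the result."""
    normalized = unicodedata.normalize("NFKC", value)
    if "<" in normalized:
        raise ValueError("markup rejected")
    return normalized


payload = "\ufe64script>"                 # starts with U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))   # NFKC turns U+FE64 into "<" after the check

# Canonical decomposition of U+0130, as on the earlier slide.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "\u0130")])
```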

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets ' (U+0027)

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
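The C0 A7 case can be reproduced by comparing a lenient decoder with a strict one. The lenient decoder below is hypothetical and handles only two-byte sequences. Python's built-in UTF-8 codec stands in for "validate and throw on error", since it rejects the bytes C0 and C1 outright.

```python
def lenient_two_byte_decode(raw: bytes) -> str:
    """Hypothetical flawed decoder: accepts any 110xxxxx 10xxxxxx pair, overlong or not."""
    out = []
    i = 0
    while i < len(raw):
        lead = raw[i]
        if lead & 0xE0 == 0xC0 and i + 1 < len(raw) and raw[i + 1] & 0xC0 == 0x80:
            out.append(chr(((lead & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(lead))
            i += 1
    return "".join(out)


def strict_decode(raw: bytes) -> str:
    """Validate the encoding and raise on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")


overlong_apostrophe = bytes.fromhex("C0A7")
print(repr(lenient_two_byte_decode(overlong_apostrophe)))  # an apostrophe appears

try:
    strict_decode(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```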

Attack Vectors: Directory traversal

How many ways can you say

Page 72: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

NFD - Decompose (canonical)

NFC - Decompose (canonical) Recompose

NFKD - Decompose (compatibility)

NFKC - Decompose (compatibility) Recompose

wwwcasabasecuritycom

Root CausesNormalization

March 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

March 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

March 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)
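A short sketch of "validate and throw" follows. The function `lenient_two_byte_decode` is a deliberately flawed, hypothetical decoder written for this example; it shows how the overlong pair C0 A7 collapses to 0x27, while Python's built-in strict UTF-8 decoder rejects the same bytes.

```python
def lenient_two_byte_decode(lead: int, trail: int) -> str:
    """Deliberately flawed: combines payload bits with no shortest-form check."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Reject anything that is not well-formed UTF-8."""
    return data.decode("utf-8", errors="strict")


overlong_apostrophe = bytes([0xC0, 0xA7])

print(repr(lenient_two_byte_decode(0xC0, 0xA7)))  # "'"
try:
    strict_decode(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```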

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases

– httpsiterootsystem

– Overlong UTF8 U+002F: httpsiterootC0AEsystem

– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem

– Division Slash U+2215, best-fit: httpsiteroot E28895system

– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
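The byte sequences in these test cases can be checked with a quick sketch. It uses only Python's standard library; the dictionary name `candidates` is an assumption for illustration. Note that the division slash is left unchanged by NFKC; the slide attributes that case to best-fit mapping, which is a code-page conversion behavior and is not modeled here.

```python
import unicodedata

# Look-alike characters from the test cases above.
candidates = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

for name, ch in candidates.items():
    utf8_hex = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    print(f"{name}: UTF-8 {utf8_hex}, NFKC -> {nfkc!r}")
```

Under NFKC the full-width solidus becomes "/" and the two dot leader becomes "..", which is why a path check performed before normalization can be bypassed.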

Root Causes: Handling the Unexpected

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution
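The three kinds of code point listed above can be told apart by general category. This is a small sketch using Python's `unicodedata`; the results depend on the Unicode database bundled with the interpreter, which is assumed to be a reasonably current one.

```python
import unicodedata

print(unicodedata.category("\u2073"))  # Cn: unassigned
print(unicodedata.category("\uD800"))  # Cs: surrogate
print(unicodedata.category("\uFEFF"))  # Cf: format character (the BOM)

# Half a surrogate pair cannot be encoded as well-formed UTF-8.
try:
    "\uD800".encode("utf-8")
except UnicodeEncodeError as exc:
    print("lone surrogate rejected:", exc.reason)
```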

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
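The byte example above can be reproduced with a toy decoder. `over_consuming_decode` is a hypothetical, deliberately flawed function written for this sketch; it handles only ASCII and two-byte sequences, and it is not taken from any real product.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed on purpose: an ill-formed lead byte swallows the byte after it."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:  # lead byte of a two-byte sequence
            pair = data[i:i + 2]
            if len(pair) == 2 and 0x80 <= pair[1] <= 0xBF:
                out.append(pair.decode("utf-8"))
            # Flaw: two bytes are skipped even when the second one
            # is not a valid trail byte, so it silently disappears.
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)


sample = bytes.fromhex("41C23E41")  # 'A', lone lead byte 0xC2, '>', 'A'
print(over_consuming_decode(sample))  # AA
```

The '>' that closed the tag is gone, which is exactly how the markup in the second example ends up being re-parsed with the attacker's attribute inside the tag.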

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing

– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected: Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>
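A minimal sketch of the deletion problem follows. Both `naive_filter` and `downstream_delete` are hypothetical stand-ins: one for an input filter, the other for a later component that silently drops U+FEFF.

```python
BOM = "\uFEFF"


def naive_filter(text: str) -> bool:
    """Hypothetical filter: accept input unless it contains '<script'."""
    return "<script" not in text.lower()


def downstream_delete(text: str) -> str:
    """Models a later component that silently deletes U+FEFF."""
    return text.replace(BOM, "")


payload = f"<scr{BOM}ipt>"

print(naive_filter(payload))       # True: the filter sees no "<script"
print(downstream_delete(payload))  # <script>
```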

Root Causes: Solutions for Handling the Unexpected

• Fail or error

• Use U+FFFD instead

– A common alternative is '' which can be safe
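Both options can be sketched with Python's built-in decoder. The helper `neutralize_bom` is a hypothetical policy for this example: it replaces every U+FEFF with U+FFFD rather than deleting it. Whether a single leading BOM should be stripped first is left as an assumption for the reader's own design.

```python
REPLACEMENT = "\uFFFD"


def decode_or_fail(data: bytes) -> str:
    """Option 1: fail on ill-formed input."""
    return data.decode("utf-8", errors="strict")


def decode_with_replacement(data: bytes) -> str:
    """Option 2: substitute U+FFFD for each ill-formed sequence."""
    return data.decode("utf-8", errors="replace")


def neutralize_bom(text: str) -> str:
    """Replace U+FEFF with U+FFFD instead of deleting it."""
    return text.replace("\uFEFF", REPLACEMENT)


ill_formed = bytes.fromhex("41C23E41")

# The lone lead byte becomes U+FFFD and the '>' after it survives.
print(decode_with_replacement(ill_formed) == "A\uFFFD>A")  # True
# The tag name stays broken, so no <script> element appears.
print(neutralize_bom("<scr\uFEFFipt>") == "<scr\uFFFDipt>")  # True
```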

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Safari BOM injection for XSS

DEMO

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))
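Case mapping results vary by implementation and locale, so the exact output on the slide should be read as one possible behavior. As a sketch under that caveat: Python's `str.lower()` applies locale-independent full case mappings, so it turns İ into "i" followed by U+0307 rather than a bare "i". The length still changes, and other non-ASCII characters (the Kelvin sign below) do lower-case to plain ASCII.

```python
dotted_capital_i = "\u0130"
lowered = dotted_capital_i.lower()

# One code point in, two code points out.
print([f"U+{ord(ch):04X}" for ch in lowered])  # ['U+0069', 'U+0307']
print(len(dotted_capital_i), len(lowered))     # 1 2

# A non-ASCII character that lower-cases to an ASCII letter.
print("\u212A".lower() == "k")  # True (KELVIN SIGN)

# Upper-casing can expand too.
print("\u00DF".upper())  # SS
```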

Root Causes: Guidance for Casing

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)

• Improper width calculations

• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160 (MUSICAL SYMBOL EIGHTH NOTE)
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
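The factors in the two tables can be re-derived by measuring encoded sizes before and after each operation. This is a sketch with Python's standard library; the helper name `expansion` and the selection of rows are choices made for this example.

```python
import unicodedata


def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes; use a BOM-less codec such as 'utf-16-le'."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


def nf(form: str, text: str) -> str:
    return unicodedata.normalize(form, text)


rows = [
    ("lower", "utf-8", "\u023A", "\u023A".lower()),
    ("upper", "utf-8", "\u0390", "\u0390".upper()),
    ("NFC", "utf-8", "\U0001D160", nf("NFC", "\U0001D160")),
    ("NFD", "utf-16-le", "\u1F82", nf("NFD", "\u1F82")),
    ("NFKC", "utf-8", "\uFDFA", nf("NFKC", "\uFDFA")),
    ("NFKC", "utf-16-le", "\uFDFA", nf("NFKC", "\uFDFA")),
]

for operation, encoding, before, after in rows:
    factor = expansion(before, after, encoding)
    print(f"{operation:5} {encoding:9} {factor:g}X")
```

A buffer sized from the input length alone is therefore too small by up to these factors once casing or normalization has run.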

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting "white space"

– A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
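The mismatch between a filter's idea of white space and a parser's can be sketched as follows. Everything here is hypothetical: the regular expression stands in for a filter, `lenient_tokenize` stands in for a parser that treats U+180E as a separator, and the "#" in the payload is a placeholder URL. The sketch assumes Python 3.4 or later, where U+180E is not matched by `\s`.

```python
import re

MVS = "\u180E"  # MONGOLIAN VOWEL SEPARATOR

# Hypothetical filter: an event-handler attribute preceded by white space.
EVENT_HANDLER = re.compile(r"\s+on\w+\s*=", re.IGNORECASE)


def filter_allows(markup: str) -> bool:
    """Return True when the filter finds no event-handler attribute."""
    return EVENT_HANDLER.search(markup) is None


def lenient_tokenize(markup: str) -> list:
    """Models a parser that treats U+180E like U+0020."""
    return markup.replace(MVS, " ").split()


payload = f"<a href=#{MVS}onclick=alert()>"

print(filter_allows(payload))     # True: the filter sees one long attribute value
print(lenient_tokenize(payload))  # ['<a', 'href=#', 'onclick=alert()>']
```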

Opera White Space Characters

DEMO

Case Study: Opera

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications

• Be careful…

Root Causes: Specifications

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform, case, and normalize prior to validation and redisplay
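"Use Unicode as the broker" can be sketched as decoding to text first and only then encoding to the target charset, failing on anything that does not map. The function name `transcode` and the sample bytes are assumptions for illustration; the codecs are Python's standard ones.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert via Unicode text, raising on anything unmappable."""
    text = data.decode(source, errors="strict")
    return text.encode(target, errors="strict")


legacy = b"\x93quoted\x94"

# The same bytes mean different things under different mapping tables.
print(ascii(legacy.decode("cp1252")))   # '\u201cquoted\u201d'
print(ascii(legacy.decode("latin-1")))  # '\x93quoted\x94'

# Going through Unicode keeps the intended characters.
print(transcode(legacy, "cp1252", "utf-8"))

# A character the target cannot represent is an error, not a silent substitute.
try:
    transcode("\u201C".encode("utf-8"), "utf-8", "latin-1")
except UnicodeEncodeError as exc:
    print("refused:", exc.reason)
```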

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8

• Error if uncertain
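One way to read "force UTF-8, error if uncertain" is sketched below. The function `resolve_charset`, the regular expression, and the 1024-byte scan window are all assumptions made for this example; a real user-agent or proxy would follow its own detection rules.

```python
import re
from typing import Optional

# Hypothetical pattern for a charset declaration inside the markup.
META_CHARSET = re.compile(
    rb"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def resolve_charset(header_charset: Optional[str], body: bytes) -> str:
    """Accept only an unambiguous UTF-8 declaration; otherwise raise."""
    declared = set()
    if header_charset:
        declared.add(header_charset.strip().lower())
    match = META_CHARSET.search(body[:1024])
    if match:
        declared.add(match.group(1).decode("ascii").lower())
    if declared != {"utf-8"}:
        raise ValueError(f"ambiguous or non-UTF-8 charset: {sorted(declared)}")
    return "utf-8"


conflicting = b'<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
try:
    resolve_charset("ISO-8859-1", conflicting)
except ValueError as exc:
    print("refused:", exc)

print(resolve_charset("UTF-8", b'<meta charset="utf-8">'))  # utf-8
```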

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Tools

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber

www.lookout.net

Casaba Security

Page 75: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Root Causes: Normalization

﹤ (U+FE64) becomes < (U+003C)

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
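The "normalize before validation" guidance can be sketched in Python with the standard `unicodedata` module. The two filter functions are hypothetical stand-ins for an application's validation step; only `unicodedata.normalize` is a real library call.

```python
import unicodedata

def naive_filter(s):
    """Hypothetical filter: accept input only if it has no ASCII '<'."""
    return "<" not in s

def normalizing_filter(s):
    """Apply NFKC first, then run the same check on the normalized text."""
    return naive_filter(unicodedata.normalize("NFKC", s))

# U+FE64 SMALL LESS-THAN SIGN ... U+FE65 SMALL GREATER-THAN SIGN
payload = "\uFE64script\uFE65"

print(naive_filter(payload))                    # True: the filter is fooled
print(unicodedata.normalize("NFKC", payload))   # <script>
print(normalizing_filter(payload))              # False: rejected after NFKC
```

The check only helps if every later component sees the same normalized string that was validated.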

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
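A small sketch of why C0 A7 is dangerous and what "throw on error" looks like. `lenient_decode_2byte` is a hypothetical decoder written for this example that skips the shortest-form check; Python's built-in strict UTF-8 decoder is used as the validating counterpart.

```python
def lenient_decode_2byte(seq):
    """Hypothetical lenient decoder: combines the payload bits of a 2-byte
    sequence without checking that the value really needed two bytes."""
    return chr(((seq[0] & 0x1F) << 6) | (seq[1] & 0x3F))

def strict_decode(data):
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed input."""
    return data.decode("utf-8", errors="strict")

overlong = bytes([0xC0, 0xA7])   # non-shortest form of U+0027 (apostrophe)

print(repr(lenient_decode_2byte(overlong)))   # "'"

try:
    strict_decode(overlong)
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(rejected)                               # True
```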

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
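The normalization-based test cases above can be reproduced with Python's `unicodedata` module. The `is_traversal` helper and the sample path are assumptions made for this sketch, not part of the original test suite. Division Slash is left unchanged by NFKC, which is consistent with the slide attributing it to best-fit mapping rather than normalization.

```python
import unicodedata

def nfkc(s):
    return unicodedata.normalize("NFKC", s)

def is_traversal(path):
    """Hypothetical check: is there a '..' segment once the path is NFKC-normalized?"""
    return ".." in nfkc(path).split("/")

print(nfkc("\uFF0F") == "/")        # True: FULLWIDTH SOLIDUS folds to '/'
print(nfkc("\u2025") == "..")       # True: TWO DOT LEADER folds to '..'
print(nfkc("\u2215") == "\u2215")   # True: DIVISION SLASH is untouched by NFKC

path = "root\uFF0F\u2025\uFF0Fsystem"   # illustrative input only
print(".." in path)                 # False: a raw substring check sees nothing
print(is_traversal(path))           # True: the traversal appears after NFKC
```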

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected - Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected - Character-substitution

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected - Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>
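The character-deletion problem can be shown with a short sketch. All three functions are hypothetical: `blocklist_ok` stands in for an input filter, `strip_bom` for a downstream component that silently drops U+FEFF, and `reject_bom` for a stricter alternative that errors out instead.

```python
BOM = "\uFEFF"

def blocklist_ok(s):
    """Hypothetical filter: accept input that does not contain '<script'."""
    return "<script" not in s.lower()

def strip_bom(s):
    """Stand-in for a downstream component that silently deletes U+FEFF."""
    return s.replace(BOM, "")

def reject_bom(s):
    """Stricter alternative: refuse U+FEFF anywhere except the very start."""
    if BOM in s[1:]:
        raise ValueError("U+FEFF in the middle of input")
    return s

payload = "<scr\uFEFFipt>"

print(blocklist_ok(payload))   # True: the filter passes the input
print(strip_bom(payload))      # <script>: deletion rebuilds the tag afterwards
```

Calling `reject_bom(payload)` raises `ValueError`, so the input never reaches the component that would have repaired it into a tag.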

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
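The difference between over-consuming, failing, and substituting U+FFFD can be sketched as follows. `over_consuming_decode` is a deliberately buggy, hypothetical decoder that handles only ASCII and 2-byte sequences; the strict and `errors="replace"` behaviours are those of Python's built-in UTF-8 codec.

```python
def over_consuming_decode(data):
    """Hypothetical buggy decoder (ASCII and 2-byte sequences only): after a
    lead byte it always skips the next byte, even if it is not a trail byte."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            i += 2   # bug: the following byte is consumed unconditionally
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])   # 'A', stray lead byte, '>', 'A'

print(over_consuming_decode(data))                  # AA: the '>' was swallowed
print(repr(data.decode("utf-8", errors="replace"))) # 'A\ufffd>A': '>' survives

try:
    data.decode("utf-8")                            # strict mode: fail or error
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)                                       # True
```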

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM (U+FEFF)

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
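A sketch of both casing issues in Python. The slides use İ (U+0130); Python's default `str.lower()` maps that character to "i" plus a combining dot rather than a bare "i", so this example swaps in the Kelvin sign (U+212A) to show the injection and uses U+0130 only for the length change. `contains_blocked` is a hypothetical filter.

```python
def contains_blocked(s):
    """Hypothetical filter: flag input containing the keyword 'onclick'."""
    return "onclick" in s

payload = "onclic\u212A"   # ends with U+212A KELVIN SIGN, not ASCII 'k'

print(contains_blocked(payload))          # False: validated before casing
print(contains_blocked(payload.lower()))  # True: lowercasing creates 'onclick'

dotted_i = "\u0130"        # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len(dotted_i), len(dotted_i.lower()))   # 1 2: length changes under casing
```

Running the casing step first and validating its output closes the gap shown in the first two lines.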

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation    UTF          Factor   Sample
Lower        8            1.5      Ⱥ U+023A
Lower        16, 32       1        A U+0041
Upper        8, 16, 32    3        ΐ U+0390

Normalization - maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       𝅘𝅥 U+1D160
NFC          16, 32       3X       שּׁ U+FB2C
NFD          8            3X       ΐ U+0390
NFD          16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
NFKC/NFKD    16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
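Several of the factors in these tables can be checked directly. The sketch below measures growth in encoded bytes, which for a fixed encoding equals growth in code units; the `expansion` helper is an assumption of this example, not something taken from UTR 36.

```python
import unicodedata

def expansion(ch, transform, encoding):
    """Ratio of encoded size after the transform to encoded size before it."""
    before = len(ch.encode(encoding))
    after = len(transform(ch).encode(encoding))
    return after / before

def nfkc(s):
    return unicodedata.normalize("NFKC", s)

def nfd(s):
    return unicodedata.normalize("NFD", s)

print(expansion("\uFDFA", nfkc, "utf-8"))       # 11.0
print(expansion("\uFDFA", nfkc, "utf-16-le"))   # 18.0
print(expansion("\u1F82", nfd, "utf-16-le"))    # 4.0
print(expansion("\u0390", str.upper, "utf-8"))  # 3.0
print(expansion("\u023A", str.lower, "utf-8"))  # 1.5
```

A buffer sized from the input length alone would be too small in every one of these cases.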

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS: U+180E (Mongolian Vowel Separator)

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
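The Opera case comes down to a filter and a parser disagreeing about what counts as white space. The sketch below models that with one tokenizer run under two white-space sets. The strict set assumes the four characters HTML 4.01 names plus CR and LF; the lenient set, which also treats U+180E as a separator, is a hypothetical model of the browser behaviour and not Opera's actual code.

```python
# Space, tab, form feed, zero-width space, plus CR and LF line breaks.
STRICT_WS = {"\u0020", "\u0009", "\u000C", "\u200B", "\u000D", "\u000A"}
LENIENT_WS = STRICT_WS | {"\u180E"}   # hypothetical parser that adds MVS

def split_tokens(s, whitespace):
    """Split a string on any character contained in the given set."""
    tokens, current = [], ""
    for ch in s:
        if ch in whitespace:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

attrs = "href=\u180Eonclick=alert()"

print(len(split_tokens(attrs, STRICT_WS)))   # 1: the filter sees one attribute
print(split_tokens(attrs, LENIENT_WS))       # ['href=', 'onclick=alert()']
```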

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
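Why a header/meta disagreement matters can be seen by decoding the same bytes both ways. The byte pair below is chosen for illustration, and `decode_declared` is a hypothetical "error if uncertain" policy; only the built-in `iso-8859-1` and `shift_jis` codecs are real.

```python
raw = bytes([0x81, 0x5C])   # illustrative input: 0x5C is an ASCII backslash

as_latin1 = raw.decode("iso-8859-1")   # two characters; the backslash survives
as_sjis = raw.decode("shift_jis")      # 0x81 is a lead byte and absorbs 0x5C

print(len(as_latin1), "\\" in as_latin1)   # 2 True
print(len(as_sjis), "\\" in as_sjis)       # 1 False

def decode_declared(data, header_charset, meta_charset):
    """Hypothetical policy: refuse to guess when the declarations disagree."""
    if header_charset.lower() != meta_charset.lower():
        raise ValueError("conflicting charset declarations")
    return data.decode(header_charset, errors="strict")
```

A filter that escapes or counts backslashes under one interpretation will be wrong under the other, which is the gap an attacker-controlled declaration opens.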




Page 78: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
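The "validate and throw on error" advice can be sketched with Python's built-in strict UTF-8 decoder, which rejects non-shortest (overlong) forms. The helper names below are illustrative assumptions and are not part of the talk.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")


def is_well_formed_utf8(raw: bytes) -> bool:
    """Return True only if the bytes are well-formed, shortest-form UTF-8."""
    try:
        decode_utf8_strict(raw)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(is_well_formed_utf8(b"/"))             # shortest form of U+002F
    print(is_well_formed_utf8(b"\xc0\xaf"))      # overlong two-byte form of U+002F
    print(is_well_formed_utf8(b"\xe0\x80\xaf"))  # overlong three-byte form of U+002F
```

Rejecting the input outright, rather than repairing it, keeps a later component from seeing a slash that the validator never saw.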

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, labeled U+002F on the slide (the bytes C0 AE are the overlong form of U+002E; the overlong form of U+002F is C0 AF): httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
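To illustrate the normalization test cases above, here is a minimal sketch using Python's unicodedata module. It shows that NFKC folds U+FF0F to "/" and U+2025 to "..", while U+2215 has no compatibility mapping and only becomes a slash through a converter's best-fit table. The sample path and helper names are hypothetical.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply Unicode compatibility normalization (form KC)."""
    return unicodedata.normalize("NFKC", text)


def contains_traversal(path: str) -> bool:
    """Check for a '..' path segment AFTER normalizing, not before."""
    return ".." in nfkc(path).split("/")


if __name__ == "__main__":
    for name, ch in [("FULLWIDTH SOLIDUS", "\uff0f"),
                     ("TWO DOT LEADER", "\u2025"),
                     ("DIVISION SLASH", "\u2215")]:
        print(name, [hex(ord(c)) for c in nfkc(ch)])
    # Hypothetical input: no ASCII '/' or '.' appears before normalization.
    print(contains_traversal("root\uff0f\u2025\uff0fsystem"))
```

A check that looks for ASCII "../" before normalization would pass this input, which is why the order of operations matters.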

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
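The three categories above can be detected before any further processing. The following is a sketch only: the function names are assumptions, and Python's general category "Cn" covers both unassigned code points and noncharacters, so the label reflects that.

```python
import unicodedata


def classify(ch: str) -> str:
    """Label a single code point as surrogate, bom, unassigned, or ok."""
    cp = ord(ch)
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"            # half of a surrogate pair
    if cp == 0xFEFF:
        return "bom"                  # special meaning: byte order mark
    if unicodedata.category(ch) == "Cn":
        return "unassigned"           # unassigned or noncharacter
    return "ok"


def first_unexpected(text: str):
    """Return (index, label) of the first suspicious code point, or None."""
    for i, ch in enumerate(text):
        label = classify(ch)
        if label != "ok":
            return i, label
    return None


if __name__ == "__main__":
    print(first_unexpected("ab\ufeffc"))
    print(first_unexpected("abc"))
```

What to do on a hit (fail, or substitute U+FFFD) is a policy decision; the point is to decide explicitly rather than let a downstream component decide.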

Root Causes: Handling the Unexpected (Over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
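Over-consumption can be reproduced with a deliberately flawed decoder. The sketch below is not any real product's decoder: it trusts a two-byte lead byte and swallows whatever follows. The quotation marks in the sample markup are an assumption, since the transcript lost them.

```python
def naive_decode(raw: bytes) -> str:
    """Deliberately flawed decoder: a two-byte lead consumes the next byte
    even when that byte is not a valid continuation byte."""
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if 0xC2 <= b <= 0xDF and i + 1 < len(raw):
            nxt = raw[i + 1]
            if 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG (on purpose): an invalid trail byte is dropped with the lead.
            i += 2
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(naive_decode(bytes([0x41, 0xC2, 0x3E, 0x41]))))
    # Assumed quoting: the lone 0xC2 eats the closing quote of src.
    print(naive_decode(b'<img src="\xc2"> onerror=alert(1)<br />'))
```

In the second case the attribute value no longer closes where the filter thought it did, so the text that follows is parsed in a different context.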

Root Causes: Handling the Unexpected (Character-substitution)

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (Character-deletion)

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>
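The deletion problem is easy to reproduce. In this sketch a hypothetical filter inspects the markup first and a later stage silently deletes U+FEFF; both functions are illustrative stand-ins, not code from any browser.

```python
BOM = "\ufeff"


def naive_filter(markup: str) -> bool:
    """Hypothetical filter: accept the markup if no '<script' substring exists."""
    return "<script" not in markup.lower()


def delete_bom(markup: str) -> str:
    """Later processing stage that silently deletes U+FEFF."""
    return markup.replace(BOM, "")


if __name__ == "__main__":
    payload = "<scr\ufeffipt>alert(1)</scr\ufeffipt>"
    print(naive_filter(payload))              # the filter accepts the payload
    print("<script" in delete_bom(payload))   # deletion re-creates the tag
```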

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
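Both options above are available in Python's codec machinery. The sketch below shows a strict decode that fails, a decode that substitutes U+FFFD for each ill-formed sequence, and a replacement (rather than deletion) of an unwanted U+FEFF. The function names are assumptions.

```python
REPLACEMENT = "\ufffd"


def decode_or_fail(raw: bytes) -> str:
    """Option 1: raise UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8", errors="strict")


def decode_with_replacement(raw: bytes) -> str:
    """Option 2: substitute U+FFFD for ill-formed bytes instead of dropping them."""
    return raw.decode("utf-8", errors="replace")


def neutralize_bom(text: str) -> str:
    """Replace, rather than delete, an unexpected U+FEFF so tokens stay split."""
    return text.replace("\ufeff", REPLACEMENT)


if __name__ == "__main__":
    print(repr(decode_with_replacement(b"A\xc2>A")))
    print(repr(neutralize_bom("<scr\ufeffipt>")))
```

Because the replacement character stays in place, the bytes on either side of the bad input are never joined together.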

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM (U+FEFF)

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))
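The length inequality is easy to observe. In Python, whose str.lower() applies the default (locale-independent) Unicode case mappings, U+0130 lowercases to "i" followed by U+0307 rather than to a bare "i" as on the slide, so the exact result is implementation- and locale-dependent. The helper name is an assumption.

```python
def length_before_after(text: str, op) -> tuple:
    """Return the code point counts before and after a casing operation."""
    return len(text), len(op(text))


if __name__ == "__main__":
    print(length_before_after("\u0130", str.lower))   # LATIN CAPITAL I WITH DOT ABOVE
    print(length_before_after("\u00df", str.upper))   # LATIN SMALL LETTER SHARP S
    print(length_before_after("\u0390", str.upper))   # GREEK IOTA WITH DIALYTIKA AND TONOS
    print([hex(ord(c)) for c in "\u0130".lower()])
```

A buffer sized from the input length, or a filter run before the casing step, can therefore be wrong after the casing step.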

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36
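The factors in the casing table can be checked directly. This sketch measures the encoded size before and after a casing operation, using the little-endian codec variants so that no BOM is counted. The helper name is an assumption.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes (in bytes) after vs. before an operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


if __name__ == "__main__":
    a_stroke = "\u023a"   # Ⱥ
    iota = "\u0390"       # ΐ
    print(expansion(a_stroke, a_stroke.lower(), "utf-8"))   # Lower, UTF-8
    print(expansion("A", "A".lower(), "utf-16-le"))         # Lower, UTF-16
    print(expansion(iota, iota.upper(), "utf-8"))           # Upper, UTF-8
    print(expansion(iota, iota.upper(), "utf-16-le"))       # Upper, UTF-16
    print(expansion(iota, iota.upper(), "utf-32-le"))       # Upper, UTF-32
```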

Root Causes: Buffer Overflows

Normalization: maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
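The normalization factors can be reproduced the same way with unicodedata.normalize. This is a sketch under the assumption that the local Unicode database decomposes these samples as UTR-36 describes; the helper name is illustrative.

```python
import unicodedata


def norm_expansion(form: str, sample: str, encoding: str) -> float:
    """Ratio of encoded sizes after vs. before normalizing to the given form."""
    normalized = unicodedata.normalize(form, sample)
    return len(normalized.encode(encoding)) / len(sample.encode(encoding))


if __name__ == "__main__":
    rows = [
        ("NFC", "\U0001d160", "utf-8"),
        ("NFC", "\ufb2c", "utf-16-le"),
        ("NFD", "\u0390", "utf-8"),
        ("NFD", "\u1f82", "utf-16-le"),
        ("NFKC", "\ufdfa", "utf-8"),
        ("NFKC", "\ufdfa", "utf-16-le"),
    ]
    for form, sample, enc in rows:
        print(form, enc, norm_expansion(form, sample, enc))
```

Sizing an output buffer as input size times the worst-case factor for the chosen form and encoding avoids the overflow, at the cost of over-allocating for typical text.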

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
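As a small illustration of "bytes vs chars", the sketch below reports three different sizes for the same string. Which of them a given API expects has to be checked per API; the function name is an assumption.

```python
def sizes(text: str) -> dict:
    """Report the size of a string in code points, UTF-8 bytes, and UTF-16 units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    print(sizes("A"))
    print(sizes("\u00e9"))        # é
    print(sizes("\U0001d160"))    # outside the BMP: surrogate pair in UTF-16
```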

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
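The disagreement behind this case can be sketched with two tokenizers that differ only in what they treat as white space. Both whitespace sets and the splitting function are illustrative assumptions, not Opera's parser.

```python
ASCII_WS = " \t\n\r\f"
WIDER_WS = ASCII_WS + "\u180e"   # a parser that also treats U+180E as a separator


def split_on(text: str, whitespace: str) -> list:
    """Split text into tokens on any character in the given whitespace set."""
    tokens, current = [], []
    for ch in text:
        if ch in whitespace:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    tag_body = "a href=\u180eonclick=alert()"
    print(split_on(tag_body, ASCII_WS))   # filter's view: one href attribute
    print(split_on(tag_body, WIDER_WS))   # parser's view: a separate onclick
```

The filter sees a single harmless attribute value, while the more permissive parser sees an event handler.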

DEMO: Opera White Space Characters

Case Study: Opera

MVS: U+180E (Mongolian Vowel Separator)

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
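The last bullet can be sketched as "canonicalize first, validate second, then keep using the canonical form". The banned-word list and function names below are hypothetical, and NFKC followed by casefold() is one reasonable canonical form rather than the only one.

```python
import unicodedata


def canonicalize(text: str) -> str:
    """Normalize to NFKC, then apply Unicode case folding."""
    return unicodedata.normalize("NFKC", text).casefold()


def validate(text: str, banned=("script",)) -> str:
    """Return the canonical form if it passes validation, else raise ValueError."""
    canon = canonicalize(text)
    for word in banned:
        if word in canon:
            raise ValueError("banned token after canonicalization: " + word)
    return canon   # callers should use this value, not the raw input


if __name__ == "__main__":
    fullwidth = "\uff33\uff23\uff32\uff29\uff30\uff34"   # fullwidth "SCRIPT"
    print(canonicalize(fullwidth))
    print(validate("Hello World"))
```

Validating the raw input and then normalizing afterwards would let the fullwidth spelling through, which is the ordering mistake the guidance warns about.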

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
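The effect of such a mismatch can be seen by decoding the same bytes under both labels. The byte pair below is a hypothetical attacker input chosen because 0x5C is a backslash in ISO-8859-1 but the trail byte of a two-byte character in Shift_JIS.

```python
RAW = b"\x83\x5c"   # hypothetical attacker-controlled bytes


def as_seen_by(raw: bytes, charset: str) -> str:
    """Decode the same bytes the way a component trusting `charset` would."""
    return raw.decode(charset)


if __name__ == "__main__":
    header_view = as_seen_by(RAW, "iso-8859-1")   # what the HTTP header declares
    meta_view = as_seen_by(RAW, "shift_jis")      # what the meta tag declares
    print([hex(ord(c)) for c in header_view])
    print([hex(ord(c)) for c in meta_view])
    print("\\" in header_view, "\\" in meta_view)
```

A filter that escapes or inspects the backslash under one charset is working on a character that does not exist for a consumer using the other.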

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher (Some of the Passive Checks Included)

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (Web-app Security Testing and Auditing)

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
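To give a feel for what invisibles and confusable detection involve, here is a crude heuristic sketch. It is not the API described in the talk: it flags format characters (general category Cf) and uses the first word of each letter's Unicode name as a rough stand-in for its script. All names are assumptions.

```python
import unicodedata


def invisible_positions(text: str) -> list:
    """Indexes of format characters (category Cf), which render as nothing."""
    return [i for i, ch in enumerate(text) if unicodedata.category(ch) == "Cf"]


def scripts_used(text: str) -> set:
    """Rough script guess per letter: first word of the Unicode character name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def looks_mixed_script(text: str) -> bool:
    """Flag labels that mix letters from more than one script."""
    return len(scripts_used(text)) > 1


if __name__ == "__main__":
    print(looks_mixed_script("paypal"))
    print(looks_mixed_script("p\u0430ypal"))       # U+0430 CYRILLIC SMALL LETTER A
    print(invisible_positions("pay\u200bpal"))     # U+200B ZERO WIDTH SPACE
```

A production check would rely on the Unicode script property and confusables data rather than on character names.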

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security

www.casabasecurity.com

Page 79: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Watcher

ndash Web-app security testing and auditing

bull Visual Spoofing Detection API

ndash Providing guarantees against Visual Spoofing and Homograph attacks

wwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

March 2009 copy 2009 Chris Weber

bull Problemndash Unicode enables visual-spoofing-maximus

bull Solutionndash Confusable detection

ndash Invisibles detection

ndash Syntax spoof detection

ndash more

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

March 2009 copy 2009 Chris Weber

bull Cross-platform component library written in C

bull Can be applied in user-agents or any softwarendash Browsers

ndash Email clients

bull Planned for release Fall 2009

bull Email me with questions

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

ToolsVisual Spoofing Protection Demo

Thank you

Contact me with questions new test cases or ideas to share

Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API

Chris WeberwwwlookoutnetCasaba Security

wwwcasabasecuritycom

Page 80: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Watcher

ndash Web-app security testing and auditing

bull Visual Spoofing Detection API

ndash Providing guarantees against Visual Spoofing and Homograph attacks

wwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

March 2009 copy 2009 Chris Weber

bull Problemndash Unicode enables visual-spoofing-maximus

bull Solutionndash Confusable detection

ndash Invisibles detection

ndash Syntax spoof detection

ndash more

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

March 2009 copy 2009 Chris Weber

bull Cross-platform component library written in C

bull Can be applied in user-agents or any softwarendash Browsers

ndash Email clients

bull Planned for release Fall 2009

bull Email me with questions

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

ToolsVisual Spoofing Protection Demo

Thank you

Contact me with questions new test cases or ideas to share

Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API

Chris WeberwwwlookoutnetCasaba Security

wwwcasabasecuritycom

Page 81: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull Directory traversal test casesndash httpsiterootsystem

ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem

ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem

ndash Division Slash U+2215 best-fithttpsiteroot E28895system

ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system

wwwcasabasecuritycom

Attack Vectors

March 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

March 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

March 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

March 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla

DEMO

Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
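Both casing effects can be reproduced in a few lines. The sketch below makes one assumption explicit: `simple_lower` is a hypothetical stand-in for a runtime that applies the one-to-one (simple) lowercase mapping of U+0130 to `i`. Python's own `str.lower()` applies the full mapping instead, which is what changes the string length.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def simple_lower(text: str) -> str:
    """Hypothetical lowercasing that maps U+0130 straight to 'i'."""
    return "".join("i" if ch == DOTTED_CAPITAL_I else ch.lower() for ch in text)


def blocklist_allows(text: str) -> bool:
    """Simplistic filter that runs BEFORE any case conversion."""
    return "script" not in text


payload = "scr" + DOTTED_CAPITAL_I + "pt"

# The filter sees no "script", but the later casing step creates one.
bypassed = blocklist_allows(payload) and simple_lower(payload) == "script"

# Python's full case mapping yields 'i' followed by U+0307, so the
# lowercased string is one code point longer than the original.
length_before = len(payload)
length_after = len(payload.lower())
```

Here `bypassed` is true, and `length_after` exceeds `length_before` by one.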

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Casing
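One way to read the first bullet is: convert case once, validate the converted string, and hand only that converted string to later stages. The sketch below follows that reading. `validate_after_casing` and the `FORBIDDEN` list are hypothetical, and a real validator would normally use an allowlist rather than this toy blocklist.

```python
FORBIDDEN = ("script",)  # toy blocklist, for illustration only


def validate_after_casing(raw: str) -> str:
    """Case-fold first, validate second, and return the folded form.

    Returning the folded string matters: if the caller kept using `raw`,
    a later casing step could still produce a forbidden token.
    """
    folded = raw.casefold()
    for token in FORBIDDEN:
        if token in folded:
            raise ValueError(f"forbidden token after case folding: {token!r}")
    return folded
```

For example, `validate_after_casing("Hello")` returns `hello`, while `validate_after_casing("SCRIPT")` raises `ValueError`.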

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Root Causes: Buffer Overflows
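The chars-versus-bytes distinction is easy to see by measuring one string in several units. This is a small sketch; `sizes` is a hypothetical helper, and the little-endian codecs are used only so that no byte order mark is added to the counts.

```python
def sizes(text: str) -> dict:
    """Measure a string in code points, UTF-8 bytes, UTF-16 units, UTF-32 bytes."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        "utf32_bytes": len(text.encode("utf-32-le")),
    }
```

A single code point such as U+1D160 occupies four UTF-8 bytes and two UTF-16 code units, so a buffer sized by "character count" is too small in either encoding.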

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36
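The casing factors can be checked against a runtime that implements full case mapping. The sketch below uses Python's `str.lower()` and `str.upper()`; `utf8_factor` is a hypothetical helper, and the results depend on the Unicode data shipped with the interpreter.

```python
def utf8_factor(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after and before a transformation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


# Lowercasing U+023A gives U+2C65: 2 UTF-8 bytes become 3.
lower_sample = "\u023a"
lower_factor = utf8_factor(lower_sample, lower_sample.lower())

# Uppercasing U+0390 gives three code points (U+0399 U+0308 U+0301).
upper_sample = "\u0390"
upper_code_points = len(upper_sample.upper())
upper_factor = utf8_factor(upper_sample, upper_sample.upper())
```

Sizing an output buffer from the input length alone is therefore unsafe; size it from the worst-case factor or from the actual converted length.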

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
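The normalization factors in the table can be reproduced with the standard `unicodedata` module. This is a sketch for checking the numbers; `expansion` is a hypothetical helper, and it reports growth in code points (the UTF-32 view) and in UTF-8 bytes.

```python
import unicodedata


def expansion(form: str, sample: str) -> dict:
    """Growth of one sample under a normalization form."""
    normalized = unicodedata.normalize(form, sample)
    return {
        "code_points": len(normalized) / len(sample),
        "utf8": len(normalized.encode("utf-8")) / len(sample.encode("utf-8")),
    }


nfc_utf8 = expansion("NFC", "\U0001D160")["utf8"]         # table: 3X
nfc_points = expansion("NFC", "\ufb2c")["code_points"]    # table: 3X
nfd_utf8 = expansion("NFD", "\u0390")["utf8"]             # table: 3X
nfd_points = expansion("NFD", "\u1f82")["code_points"]    # table: 4X
nfkc = expansion("NFKC", "\ufdfa")                        # table: 11X and 18X
```

U+FDFA is the extreme case: one code point (3 UTF-8 bytes) becomes 18 code points (33 UTF-8 bytes) under NFKC or NFKD.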

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Buffer Overflows
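A common place where bytes and chars get confused is truncation to a fixed-size field. The following sketch, with the hypothetical helper `truncate_utf8`, cuts by byte budget and never leaves half of a multi-byte sequence behind. The `ignore` handler is applied only to bytes this function just produced from a valid string, so the only thing it can drop is the partial sequence at the cut point.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate so the UTF-8 encoding fits in max_bytes, on a code point boundary."""
    if max_bytes < 0:
        raise ValueError("max_bytes must be non-negative")
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # The cut may land inside a multi-byte sequence; decoding with 'ignore'
    # discards that incomplete tail and nothing else.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

For example, `truncate_utf8("aé", 2)` returns `a`, because the two-byte `é` does not fit in the remaining single byte.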

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting “white space”
  - A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera
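The underlying issue is a disagreement between two components about which characters separate attributes. The sketch below models that with two separator sets. `FILTER_SEPARATORS` and `PARSER_SEPARATORS` are assumptions made for the illustration, not the actual tables of any browser or filter, and the tag text is a simplified variant of the slide's example.

```python
FILTER_SEPARATORS = " \t\n\r\f"                      # what a filter might split on
PARSER_SEPARATORS = FILTER_SEPARATORS + "\u180e"     # assumed parser also accepts MVS


def split_on(text: str, separators: str) -> list:
    """Split text on any character in `separators`, dropping empty tokens."""
    tokens, current = [], []
    for ch in text:
        if ch in separators:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


tag_body = "a href=x\u180eonclick=alert()"
filter_view = split_on(tag_body, FILTER_SEPARATORS)
parser_view = split_on(tag_body, PARSER_SEPARATORS)
```

The filter sees one `href` attribute with an odd value, while the parser sees a separate `onclick` attribute. Any check that runs on `filter_view` misses the event handler.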

DEMO

Opera White Space Characters

Case Study: Opera

MVS (Mongolian Vowel Separator): U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2

2) Designs
  - Specs are carefully designed, but not always perfect

• This could have been a problem:
  - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations
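Two of these bullets translate directly into code: route every conversion through Unicode with strict error handling, and normalize plus case-fold before validating. The sketch below assumes NFKC is the appropriate normalization form for the application, which is a design choice rather than something the slide specifies; `transcode` and `canonical_for_validation` are hypothetical names.

```python
import unicodedata


def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Convert between charsets with Unicode as the broker.

    Strict handlers on both sides mean ill-formed input or an unmappable
    character raises an error instead of being guessed at or best-fit mapped.
    """
    text = data.decode(source, errors="strict")
    return text.encode(target, errors="strict")


def canonical_for_validation(text: str) -> str:
    """Normalize (NFKC) and case-fold so validation sees the final form."""
    return unicodedata.normalize("NFKC", text).casefold()
```

For example, the fullwidth string U+FF1C `SCRIPT` U+FF1E becomes `<script>` under `canonical_for_validation`, so a filter running on the canonical form catches it. A filter running on the raw input would not.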

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
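A mismatch matters because the same bytes mean different things under each declared charset. The sketch below decodes one two-byte sequence both ways, using Python's built-in codecs as stand-ins for the server-side filter and the browser. The byte pair was chosen for this example and does not come from the slide.

```python
# 0x83 0x5C: in Shift_JIS this pair is a single katakana character, while in
# ISO-8859-1 it is a C1 control character followed by a backslash.
payload = b"\x83\x5c"


def view_as(data: bytes, charset: str) -> str:
    """Decode the bytes the way a component using `charset` would see them."""
    return data.decode(charset, errors="strict")


header_view = view_as(payload, "iso-8859-1")  # what the HTTP header declares
meta_view = view_as(payload, "shift_jis")     # what the meta tag declares

backslash_seen_by_header_reader = "\\" in header_view
backslash_seen_by_meta_reader = "\\" in meta_view
```

A component trusting the header sees a backslash, and one trusting the meta tag sees none. An attacker who controls the input can use that gap to hide or introduce syntax characters.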

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security


Chris WeberwwwlookoutnetCasaba Security

wwwcasabasecuritycom

Page 85: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Root Causes: Handling the Unexpected (character substitution)

• Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (character deletion)

• “deletion of noncharacters” (UTR-36)
• <scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
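The deletion problem and the two safer alternatives can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the "filter" is deliberately simplistic rather than the logic of any real browser or WAF.

```python
BOM = "\ufeff"          # U+FEFF, byte order mark / zero width no-break space
REPLACEMENT = "\ufffd"  # U+FFFD, replacement character


def naive_filter(text: str) -> bool:
    """Hypothetical filter: accept input unless it contains '<script'."""
    return "<script" not in text.lower()


def delete_bom(text: str) -> str:
    """Unsafe handling: silently drop U+FEFF, as a downstream consumer might."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Safer handling: substitute U+FFFD so the token stays broken."""
    return text.replace(BOM, REPLACEMENT)


def reject_bom(text: str) -> str:
    """Strictest handling: fail outright on an unexpected U+FEFF."""
    if BOM in text:
        raise ValueError("unexpected U+FEFF in input")
    return text


payload = "<scr" + BOM + "ipt>"
passed_filter = naive_filter(payload)     # the filter sees no '<script'
after_deletion = delete_bom(payload)      # deletion re-forms the tag
after_replacement = replace_bom(payload)  # the tag stays broken
```

The point of the sketch is the ordering: the filter runs on the raw string, and the deletion happens afterwards, so the dangerous token only exists after validation has already passed.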

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Demo: Safari BOM injection for XSS

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
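A small Python sketch of both casing points: the length of a string can change under case mapping, and validation should run on the already-transformed string. The validator and its block list are hypothetical. Note that the exact mapping of “İ” depends on the platform and locale: Python's default mapping yields "i" followed by U+0307, whereas an implementation that yields a plain "i" behaves as in the toLower example above.

```python
DOTTED_CAPITAL_I = "\u0130"  # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE

# Python's default lowercase mapping is 'i' + U+0307 (two code points).
lowered = DOTTED_CAPITAL_I.lower()

# Full case mapping can also expand on the way up: 'ß' becomes 'SS'.
upper_sharp_s = "ß".upper()


def is_allowed(user_input: str, blocked=("script",)) -> bool:
    """Hypothetical validator: apply the case transform first, then check."""
    folded = user_input.casefold()
    return not any(word in folded for word in blocked)
```

Because the check runs on the folded string, no later casing step can turn an accepted value into a blocked one, provided the consumer applies the same mapping.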

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
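The factors in these tables can be reproduced with Python's standard `unicodedata` module. The sketch below is only a way to check the numbers; the helper names `code_units` and `expansion_factor` are made up for this example.

```python
import unicodedata


def code_units(text: str, encoding: str) -> int:
    """Count code units: bytes for UTF-8, 16-bit units for UTF-16."""
    if encoding == "utf-8":
        return len(text.encode("utf-8"))
    if encoding == "utf-16":
        return len(text.encode("utf-16-le")) // 2
    raise ValueError(f"unsupported encoding: {encoding}")


def expansion_factor(original: str, transformed: str, encoding: str) -> float:
    """Ratio of code units after a transformation to code units before it."""
    return code_units(transformed, encoding) / code_units(original, encoding)


# NFKC/NFKD worst case from the table: U+FDFA expands to 18 code points.
ligature = "\ufdfa"
nfkc = unicodedata.normalize("NFKC", ligature)
utf8_factor = expansion_factor(ligature, nfkc, "utf-8")
utf16_factor = expansion_factor(ligature, nfkc, "utf-16")

# Uppercasing worst case from the table: U+0390 becomes three code points.
iota = "\u0390"
upper_factor_utf8 = expansion_factor(iota, iota.upper(), "utf-8")
```

A buffer sized from the input length alone is therefore too small by up to these factors once the string has been normalized or case-mapped.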

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

Demo: Opera White Space Characters

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
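One way to "be careful" is to allowlist the whitespace a parser is expected to see and flag every other separator, control, or format character. The Python sketch below assumes an ASCII-only allowlist, which is a choice made for this example and not the list from the HTML specification; the function name is hypothetical.

```python
import unicodedata

# Assumed allowlist for this sketch: plain ASCII whitespace only.
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}


def suspicious_characters(text: str) -> list:
    """Return code points that are separators (Z*) or control/format
    characters (C*) but are not in the explicit whitespace allowlist."""
    found = []
    for ch in text:
        if ch in ALLOWED_WHITESPACE:
            continue
        if unicodedata.category(ch)[0] in ("Z", "C"):
            found.append(f"U+{ord(ch):04X}")
    return found


markup = "<a href=\u180eonclick=alert()>"
flagged = suspicious_characters(markup)
```

Checking both the Z and C categories matters here because the classification of U+180E has changed between Unicode versions, so a check on a single category could miss it depending on the runtime's Unicode data.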

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
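The last bullet can be illustrated with fullwidth characters, which compatibility normalization maps onto their ASCII counterparts. In this Python sketch, `canonicalize` and `contains_script_tag` are hypothetical helpers, and NFKC plus case folding is just one reasonable choice of transform.

```python
import unicodedata


def canonicalize(text: str) -> str:
    """Normalize (NFKC) and case-fold before any validation is applied."""
    return unicodedata.normalize("NFKC", text).casefold()


def contains_script_tag(text: str) -> bool:
    """Deliberately simple check used only to show the ordering problem."""
    return "<script" in text


# U+FF1C and U+FF1E are the fullwidth forms of '<' and '>'.
raw = "\uff1cSCRIPT\uff1e"

missed_on_raw = contains_script_tag(raw)                    # check too early
caught_after = contains_script_tag(canonicalize(raw))       # check after transform
```

If a later stage normalizes the string for display or storage, a check made on the raw input has validated a different string from the one that is eventually used.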

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
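Both bullets map directly onto strict decoding. The Python sketch below uses hypothetical function names and assumes the application controls its own response header; it decodes input as UTF-8 only and raises instead of guessing another charset.

```python
def decode_utf8_or_fail(raw: bytes) -> str:
    """Decode strictly as UTF-8; raise UnicodeDecodeError rather than sniff."""
    return raw.decode("utf-8", errors="strict")


def content_type_header() -> str:
    """Declare the charset explicitly so the user-agent has nothing to guess."""
    return "text/html; charset=utf-8"


valid = decode_utf8_or_fail("café".encode("utf-8"))

# These two bytes are a valid Shift_JIS character but not valid UTF-8.
shift_jis_bytes = b"\x82\xa0"
try:
    decode_utf8_or_fail(shift_jis_bytes)
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

Failing on undecodable input is the "error if uncertain" half of the guidance: the alternative, falling back to another charset, is exactly the sniffing behavior an attacker can steer.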

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
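To give a feel for what such checks involve, here is a rough Python heuristic. It is not the API described in the talk: it guesses a script from the first word of each character's Unicode name and treats category Cf as "invisible", both of which are simplifying assumptions, and all function names are hypothetical. Real confusable detection relies on dedicated data tables and would also catch single-script spoofs that this sketch misses.

```python
import unicodedata


def scripts_used(text: str) -> set:
    """Crude script guess: the first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def invisible_characters(text: str) -> list:
    """List format characters (category Cf), which normally render as nothing."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def looks_spoofed(text: str) -> bool:
    """Flag labels that mix scripts or contain invisible characters."""
    return len(scripts_used(text)) > 1 or bool(invisible_characters(text))


homograph = "p\u0430ypal"   # second letter is CYRILLIC SMALL LETTER A
hidden = "pay\u200bpal"     # contains ZERO WIDTH SPACE
```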

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com



Page 88: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

March 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

March 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Safari BOM injection for XSS

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

March 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

March 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
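The last bullet can be sketched as follows. This is a minimal example that assumes NFKC normalization plus case folding as the canonical form; the deny-list check is a placeholder for whatever validation an application really performs, and the function names are hypothetical.

```python
import unicodedata


def canonicalize(text):
    """Normalize (NFKC) and case-fold so validation sees the final form."""
    return unicodedata.normalize("NFKC", text).casefold()


def is_allowed(text):
    """Toy validation step: run the check on the canonical form, not the raw input."""
    return "<script" not in canonicalize(text)


if __name__ == "__main__":
    # Fullwidth angle brackets (U+FF1C, U+FF1E) fold to ASCII under NFKC.
    tricky = "\uff1cSCRIPT\uff1e"
    print("raw contains '<script':", "<script" in tricky.lower())
    print("canonical form:", canonicalize(tricky))
    print("allowed:", is_allowed(tricky))
```

Validating the raw string and normalizing afterwards would let the fullwidth form through and then turn it into markup.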

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
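One way to read "force UTF-8, error if uncertain" is a strict decode at the input boundary, sketched below. The function name is hypothetical, and raising `ValueError` is just one possible error policy; the point is that ill-formed input is rejected rather than guessed at or silently repaired.

```python
def decode_utf8_strict(raw):
    """Decode bytes as UTF-8 and refuse anything that is not well-formed."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # Error if uncertain: do not sniff or fall back to another charset.
        raise ValueError(f"input is not well-formed UTF-8: {exc.reason}") from exc


if __name__ == "__main__":
    print(decode_utf8_strict("\u65e5\u672c".encode("utf-8")))
    for bad in (b"\xc0\xbc", "\u65e5\u672c".encode("shift_jis")):
        try:
            decode_utf8_strict(bad)
        except ValueError as err:
            print("rejected:", err)
```

The first rejected sample is an overlong encoding of `<`, the second is Shift_JIS data arriving where UTF-8 was declared. On output, the matching step is to declare the charset explicitly, for example `Content-Type: text/html; charset=utf-8`.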

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
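The slides do not show how the API performs these checks. As a rough stand-in, the sketch below uses two simple heuristics from the Python standard library: format characters (general category Cf) count as "invisibles", and the first word of each letter's Unicode name serves as a crude script label for spotting mixed-script strings. All function names are hypothetical, and real confusable detection needs proper script and confusables data.

```python
import unicodedata


def invisibles(text):
    """List format characters (category Cf), which usually render as nothing."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def scripts(text):
    """Approximate the scripts in use from the first word of each letter's name."""
    found = set()
    for ch in text:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def looks_suspicious(label):
    """Flag labels that hide invisible characters or mix scripts."""
    return bool(invisibles(label)) or len(scripts(label)) > 1


if __name__ == "__main__":
    for label in ("paypal", "p\u0430ypal", "pay\u200dpal"):
        print(repr(label), scripts(label), invisibles(label), looks_suspicious(label))
```

The second sample swaps in Cyrillic U+0430 for the Latin "a"; the third hides a zero-width joiner. Both render like the first.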

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com


Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
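The character-deletion problem can be modelled in a few lines of Python. This is a simplified sketch, not browser code: `naive_filter_allows` is a hypothetical substring filter, `consumer_view` models a user-agent that silently drops U+FEFF, and `safer_filter_allows` shows one possible mitigation (rejecting a mid-string BOM before validating).

```python
BOM = "\ufeff"


def naive_filter_allows(url):
    """Hypothetical filter: block URLs that contain the javascript: scheme."""
    return "javascript:" not in url.lower()


def consumer_view(url):
    """Model a user-agent that deletes U+FEFF before interpreting the URL."""
    return url.replace(BOM, "")


def safer_filter_allows(url):
    """Treat a BOM inside the value as an error, then apply the filter."""
    if BOM in url:
        return False
    return naive_filter_allows(url)


if __name__ == "__main__":
    payload = "java\ufeffscript:alert('XSS')"
    print("naive filter allows:", naive_filter_allows(payload))
    print("what the browser runs:", consumer_view(payload))
    print("safer filter allows:", safer_filter_allows(payload))
```

The filter inspects one string and the consumer acts on another; rejecting or stripping the deleted character before validation closes that gap.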

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM

BOM, U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))
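The two casing effects can be checked directly in Python. Results depend on the platform and locale: Python's locale-independent `str.lower()` maps U+0130 to "i" followed by U+0307 rather than to a bare "i", so the exact output of `toLower` on the slide may differ elsewhere. The helper names below are hypothetical.

```python
def length_changes(text):
    """Return (original length, lowercased length, uppercased length) in code points."""
    return len(text), len(text.lower()), len(text.upper())


def becomes_after_casing(text):
    """Show what a string turns into once case mapping is applied."""
    return {"lower": text.lower(), "upper": text.upper()}


if __name__ == "__main__":
    # Length can change: U+0130 grows when lowercased, U+00DF when uppercased.
    print(length_changes("\u0130"), length_changes("\u00df"))
    # Characters can turn into ASCII that a filter already inspected and passed.
    print(becomes_after_casing("\u017fcript")["upper"])  # U+017F LATIN SMALL LETTER LONG S
    print(becomes_after_casing("\u212a")["lower"])       # U+212A KELVIN SIGN
```

Validation that runs before the case mapping sees neither the extra code points nor the newly created ASCII letters.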

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report #36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160 (musical symbol eighth note)
            16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report #36
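Several of the factors in the two tables can be reproduced with the Python standard library. This sketch measures each sample before and after the operation, in UTF-8 bytes and in UTF-16 code units; the helper names are hypothetical, and the results assume the Unicode data shipped with a current Python release.

```python
import unicodedata


def utf8_units(text):
    """Size in UTF-8 bytes."""
    return len(text.encode("utf-8"))


def utf16_units(text):
    """Size in UTF-16 code units (2 bytes each)."""
    return len(text.encode("utf-16-le")) // 2


def expansion(before, after, units):
    """Growth factor of `after` relative to `before` under the given size measure."""
    return units(after) / units(before)


if __name__ == "__main__":
    cases = [
        ("Lower U+023A", "\u023a", "\u023a".lower()),
        ("Upper U+0390", "\u0390", "\u0390".upper()),
        ("NFD   U+1F82", "\u1f82", unicodedata.normalize("NFD", "\u1f82")),
        ("NFKC  U+FDFA", "\ufdfa", unicodedata.normalize("NFKC", "\ufdfa")),
    ]
    for label, before, after in cases:
        print(label,
              "UTF-8 x", expansion(before, after, utf8_units),
              "UTF-16 x", expansion(before, after, utf16_units))
```

Sizing an output buffer from the worst-case factor for the operation and encoding in use, instead of from the input length, avoids the overflow these slides warn about.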





Page 93: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 © 2009 Chris Weber

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"
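The casing mappings above matter most when a filter validates input before the casing step runs. The Python sketch below illustrates that ordering problem; it is not code from the talk. The function names and the blocked keyword are hypothetical. It uses U+017F (LATIN SMALL LETTER LONG S), whose uppercase mapping is ASCII "S", because Python's str.lower() maps "İ" to "i" followed by a combining dot rather than to a bare "i".

```python
BLOCKED = "SCRIPT"  # hypothetical keyword a filter wants to keep out


def ascii_upper(text: str) -> str:
    """Uppercase only a-z, the way a byte-oriented filter might."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)


def check_then_case(value: str) -> str:
    """Unsafe order: validate first, apply Unicode casing afterwards."""
    if BLOCKED in ascii_upper(value):
        raise ValueError("blocked keyword")
    return value.upper()


def case_then_check(value: str) -> str:
    """Safer order: apply Unicode casing first, then validate the result."""
    cased = value.upper()
    if BLOCKED in cased:
        raise ValueError("blocked keyword")
    return cased


def rejects(check, value: str) -> bool:
    """True if the given check raises ValueError for this value."""
    try:
        check(value)
    except ValueError:
        return True
    return False


payload = "\u017fcript"  # LONG S followed by "cript"
print(check_then_case(payload))           # the keyword only appears after casing
print(rejects(case_then_check, payload))  # casing first lets the check see it
```

The first function lets the payload through and then produces the blocked keyword itself; the second sees the keyword because it validates the string in its final form.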

Root Causes: Casing

len(x) != len(toLower(x))
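A short Python check makes the length change concrete. This is a sketch that relies on Python's built-in full case mappings; the helper name is an assumption made for illustration.

```python
def length_change(text: str, operation) -> tuple:
    """Return (code points before, code points after) for a casing operation."""
    return len(text), len(operation(text))


# Sample characters whose case mapping yields more than one code point.
for sample, operation in [("\u00df", str.upper),   # sharp s
                          ("\u0130", str.lower),   # capital I with dot above
                          ("\u0390", str.upper)]:  # iota with dialytika and tonos
    before, after = length_change(sample, operation)
    print(f"U+{ord(sample):04X}: {before} -> {after} code points")
```

Any buffer sized from the input length alone can therefore be too small once casing has been applied.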

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution
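The chars-versus-bytes distinction can be seen by measuring one string three ways. The sketch below is illustrative only; the function name and sample string are assumptions, not material from the talk.

```python
def sizes(text: str) -> dict:
    """Measure a string in code points, UTF-8 bytes, and UTF-16 code units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # utf-16-le avoids the BOM that the plain "utf-16" codec prepends
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


# One ASCII letter, one BMP character, one supplementary-plane character.
print(sizes("a\u20ac\U0001f600"))
```

The three measurements differ for the same text, so a size computed in one unit cannot safely be used to allocate or copy in another.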

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36
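One way to use the table above is as a worst-case sizing rule. The Python sketch below copies the factors from the table into a lookup and checks them against the sample characters using Python's own case mappings; the dictionary and function names are hypothetical.

```python
import math

# Maximum casing expansion factors, copied from the table above (UTR 36).
CASING_FACTOR = {
    ("lower", "utf-8"): 1.5,
    ("lower", "utf-16"): 1,
    ("lower", "utf-32"): 1,
    ("upper", "utf-8"): 3,
    ("upper", "utf-16"): 3,
    ("upper", "utf-32"): 3,
}


def worst_case_units(units: int, operation: str, utf: str) -> int:
    """Upper bound on code units after casing `units` code units of input."""
    return math.ceil(units * CASING_FACTOR[(operation, utf)])


def utf8_len(text: str) -> int:
    """Length of the text in UTF-8 bytes."""
    return len(text.encode("utf-8"))


# The sample characters reach the bound exactly in UTF-8.
print(utf8_len("\u023a"), "->", utf8_len("\u023a".lower()))
print(utf8_len("\u0390"), "->", utf8_len("\u0390".upper()))
```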

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
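The normalization factors above can be reproduced with Python's standard unicodedata module. This is a sketch: the helper names and the encoding table are assumptions introduced here, and the results depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

# Codec name and bytes per code unit for each encoding form.
ENCODINGS = {
    "utf-8": ("utf-8", 1),
    "utf-16": ("utf-16-le", 2),
    "utf-32": ("utf-32-le", 4),
}


def code_units(text: str, utf: str) -> int:
    """Number of code units the text occupies in the given encoding form."""
    codec, width = ENCODINGS[utf]
    return len(text.encode(codec)) // width


def normalization_factor(ch: str, form: str, utf: str) -> float:
    """Ratio of code units after normalization to code units before."""
    normalized = unicodedata.normalize(form, ch)
    return code_units(normalized, utf) / code_units(ch, utf)


for ch, form, utf in [("\U0001d160", "NFC", "utf-8"),
                      ("\ufb2c", "NFC", "utf-16"),
                      ("\u0390", "NFD", "utf-8"),
                      ("\u1f82", "NFD", "utf-16"),
                      ("\ufdfa", "NFKC", "utf-8"),
                      ("\ufdfa", "NFKD", "utf-16")]:
    print(f"U+{ord(ch):04X} {form} {utf}: {normalization_factor(ch, form, utf)}")
```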

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
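A defensive counterpart to the point above is to flag characters that a downstream parser might treat as a separator even though they are not ASCII whitespace. The sketch below is one possible heuristic, not a method from the talk; the category set and names are assumptions. Format characters (Cf) are included because U+180E has been classified both as a space separator and as a format character in different Unicode versions.

```python
import unicodedata

ASCII_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}
# Space separators, line/paragraph separators, format and control characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def suspicious_separators(text: str) -> list:
    """List (index, code point) for space-like or invisible characters
    that are not plain ASCII whitespace."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch not in ASCII_WHITESPACE
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


print(suspicious_separators("href=\u180eonclick"))
print(suspicious_separators("plain ascii text"))
```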

Attacks and Exploits: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
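The payload above can be modeled in a few lines of Python. This is a sketch under stated assumptions: the regular expression stands in for a hypothetical server-side filter that only knows ASCII whitespace, and the replace() call stands in for a parser that treats U+180E as a space, which is the behavior the case study attributes to Opera.

```python
import re

# Hypothetical filter: an event-handler attribute preceded by ASCII whitespace.
EVENT_HANDLER = re.compile(r"[ \t\r\n\f]on\w+=", re.ASCII | re.IGNORECASE)


def filter_blocks(markup: str) -> bool:
    """True if the hypothetical filter would reject this markup."""
    return EVENT_HANDLER.search(markup) is not None


def as_lenient_parser_sees(markup: str) -> str:
    """Model a parser that treats U+180E like U+0020."""
    return markup.replace("\u180e", " ")


payload = "<a href=\u180eonclick=alert()>"
print(filter_blocks(payload))                          # what the filter decides
print(filter_blocks(as_lenient_parser_sees(payload)))  # what the parser acts on
```

The filter and the parser disagree about where the attribute boundary is, and that disagreement is the whole exploit.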

Opera White Space Characters

DEMO

Case Study: Opera

MVS (Mongolian Vowel Separator), U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
   - IDNA/Nameprep based on Unicode 3.2
2) Designs
   - Specs are carefully designed but not always perfect
   • This could have been a problem
     - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
     - HTML 4.01
       • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
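The U+FEFF rule quoted above leaves implementers two choices, ignore or error. The sketch below takes the stricter one; it is an illustration only, and the function names are hypothetical.

```python
BOM = "\ufeff"


def has_interior_bom(text: str) -> bool:
    """True if U+FEFF occurs anywhere other than the first position."""
    return BOM in text[1:]


def strip_bom_strict(text: str) -> str:
    """Remove a leading byte order mark; treat any other U+FEFF as an error."""
    if has_interior_bom(text):
        raise ValueError("U+FEFF found outside the first position")
    return text[1:] if text.startswith(BOM) else text


print(strip_bom_strict("\ufeff<html>"))
print(has_interior_bom("<scr\ufeffipt>"))
```

Silently ignoring the character is the riskier option, because a filter that ran earlier saw a different string from the one that is finally interpreted.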

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
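One way a lossy conversion defeats validation can be shown with Python's codec error handlers. This is a sketch, not an example from the talk: the validator, the payload, and the choice of ASCII as the legacy target are all assumptions. Python has no best-fit mapping tables, so dropped characters stand in for the broader class of lossy mappings.

```python
def validate(text: str) -> str:
    """Hypothetical validator that rejects an opening script tag."""
    if "<script" in text.lower():
        raise ValueError("script tag rejected")
    return text


def to_legacy_lossy(text: str) -> bytes:
    """Lossy conversion: characters the target charset lacks are dropped."""
    return text.encode("ascii", errors="ignore")


def strict_conversion_fails(text: str) -> bool:
    """True if a strict conversion refuses the text instead of altering it."""
    try:
        text.encode("ascii", errors="strict")
    except UnicodeEncodeError:
        return True
    return False


payload = "<scr\u200bipt>"  # zero width space hides the keyword
checked = validate(payload)              # passes: no "<script" present yet
print(to_legacy_lossy(checked))          # the conversion creates the tag
print(strict_conversion_fails(payload))  # strict mode surfaces the problem
```

Validating after the final transformation, or refusing lossy transformations outright, closes this gap.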

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
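The conflict between the header and the meta tag matters because the same bytes mean different things under each charset. The Python sketch below decodes one hypothetical attacker-controlled byte sequence both ways; the bytes and the function name are assumptions chosen for illustration.

```python
def decode_views(raw: bytes) -> dict:
    """Decode the same bytes under both charsets named on the slide."""
    return {
        "iso-8859-1": raw.decode("iso-8859-1"),
        "shift_jis": raw.decode("shift_jis"),
    }


# 0x95 is a Shift_JIS lead byte, so it absorbs the 0x5C that follows it.
raw = b'\x95\x5c"'
views = decode_views(raw)
for charset, text in views.items():
    print(charset, [f"U+{ord(c):04X}" for c in text])
```

Under ISO-8859-1 the quote is preceded by a backslash and looks escaped. Under Shift_JIS the backslash byte becomes the trail byte of a two-byte character, leaving the quote bare.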

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
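Both pieces of guidance can be sketched in Python: declare UTF-8 explicitly, and decode strictly so that malformed input is rejected rather than guessed at. The function names and the header helper are hypothetical.

```python
def content_type_header() -> str:
    """Declare the charset explicitly so user-agents have nothing to sniff."""
    return "Content-Type: text/html; charset=utf-8"


def is_valid_utf8(raw: bytes) -> bool:
    """True only if the bytes decode as well-formed UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(content_type_header())
print(is_valid_utf8("\u65e5\u672c".encode("utf-8")))  # well-formed input
print(is_valid_utf8(b"\xc0\xaf"))                     # overlong form of "/"
```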

Exploiting Unicode-enabled Software: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
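To give a rough idea of what confusable and invisibles detection involve, here is a crude Python heuristic. It is not the Visual Spoofing Detection API described in the talk, whose interface is not shown in these slides. Every name below is hypothetical, and the script is approximated by the first word of each character's Unicode name because the standard library has no script property.

```python
import unicodedata


def scripts(text: str) -> set:
    """Approximate the scripts used by the letters in the text."""
    return {
        unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        for ch in text
        if ch.isalpha()
    }


def invisibles(text: str) -> list:
    """Code points of format characters, which usually render as nothing."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def looks_spoofed(label: str) -> bool:
    """Flag labels that mix scripts or contain invisible characters."""
    return len(scripts(label)) > 1 or bool(invisibles(label))


print(looks_spoofed("paypal"))         # plain Latin
print(looks_spoofed("p\u0430ypal"))    # Cyrillic a in place of Latin a
print(looks_spoofed("pay\u200bpal"))   # zero width space inside the label
```

A production detector would rely on the Unicode confusables data and real script properties rather than on name prefixes.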

Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com

Page 97: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

March 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

March 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

March 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

March 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

• Watcher

– Web-app security testing and auditing

• Visual Spoofing Detection API

– Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem

– Unicode enables visual-spoofing-maximus

• Solution

– Confusable detection

– Invisibles detection

– Syntax spoof detection

– more

Tools: Visual Spoofing Detection API
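
To make two of the listed checks concrete, here is a rough Python sketch of invisibles detection and a crude mixed-script test. It is not the Visual Spoofing Detection API described in the talk; both function names are hypothetical, and deriving a script from the first word of the Unicode character name is a heuristic assumption, not a real script lookup.

```python
import unicodedata


def invisible_chars(text: str) -> list:
    """List format characters (general category Cf), which usually render as nothing."""
    return [f"U+{ord(c):04X}" for c in text if unicodedata.category(c) == "Cf"]


def scripts_used(text: str) -> set:
    """Heuristic: treat the first word of each letter's Unicode name as its script."""
    scripts = set()
    for c in text:
        if c.isalpha():
            scripts.add(unicodedata.name(c, "UNKNOWN").split()[0])
    return scripts
```

For example, `scripts_used("p\u0430ypal")` reports both LATIN and CYRILLIC because the second letter is U+0430, and `invisible_chars("pay\u200bpal")` flags the zero width space.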

• Cross-platform component library written in C

• Can be applied in user-agents or any software

– Browsers

– Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber, www.lookout.net, Casaba Security

Page 98: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36
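
The factors in this table can be checked with a short Python sketch, assuming an interpreter whose `str.upper()` and `str.lower()` apply full Unicode case mappings (Python 3 does). The helper name `expansion` is hypothetical.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes after and before a transformation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


upper_sample = "\u0390"   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
lower_sample = "\u023a"   # LATIN CAPITAL LETTER A WITH STROKE

upper_utf8 = expansion(upper_sample, upper_sample.upper(), "utf-8")      # 2 bytes -> 6 bytes
upper_utf16 = expansion(upper_sample, upper_sample.upper(), "utf-16-le")  # 1 unit -> 3 units
lower_utf8 = expansion(lower_sample, lower_sample.lower(), "utf-8")      # 2 bytes -> 3 bytes
```

A buffer sized from the input length alone is therefore too small for the case-mapped output.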

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
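
A minimal Python sketch that reproduces the NFKD and NFD rows with the standard `unicodedata` module. The variable names are illustrative only.

```python
import unicodedata

ligature = "\ufdfa"  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
nfkd = unicodedata.normalize("NFKD", ligature)

# Growth counted in code points (the UTF-32 view) and in UTF-8 bytes.
code_point_factor = len(nfkd) / len(ligature)
utf8_factor = len(nfkd.encode("utf-8")) / len(ligature.encode("utf-8"))

# NFD row: U+1F82 decomposes into a base letter plus three combining marks.
nfd_1f82 = unicodedata.normalize("NFD", "\u1f82")
```

Here one code point becomes 18, and 3 UTF-8 bytes become 33, matching the 18X and 11X entries.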

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Buffer Overflows
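
The bytes-versus-chars distinction, sketched in Python: character count and encoded size differ, so a byte budget has to be enforced on the encoded form. `truncate_utf8` is a hypothetical helper, not something named in the talk.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without leaving a partial character."""
    encoded = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops an incomplete multi-byte sequence at the cut point.
    return encoded.decode("utf-8", errors="ignore")


sample = "h\u00e9llo"                      # 5 characters
sample_bytes = len(sample.encode("utf-8"))  # 6 bytes, because U+00E9 takes two
```

Sizing a buffer by `len(sample)` would come up one byte short for this string.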

• White space and line breaks

– E.g., when U+180E acts like U+0020

• Quotation marks

• Impact: filter evasion, enabling code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS

– Damage: filter evasion, controlling syntax

– Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: interpreting "white space"

– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera
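
A defensive counterpart to the payload above, as a hedged Python sketch: flag any separator, format, or control character other than a short allow-list before markup reaches a parser that may treat it as white space. The category set, the allow-list, and the function name are assumptions for illustration.

```python
import unicodedata

# General categories that a lenient parser might treat as white space or ignore.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}
ALLOWED = {" ", "\t", "\n", "\r"}


def suspicious_separators(markup: str) -> list:
    """Return code points that could act as hidden separators inside markup."""
    return [
        f"U+{ord(c):04X}"
        for c in markup
        if c not in ALLOWED and unicodedata.category(c) in SUSPECT_CATEGORIES
    ]


payload = "<a href=\u180eonclick=alert()>"
```

U+180E is covered whether the Unicode data in use classifies it as a space separator (Zs) or, as in later Unicode versions, a format character (Cf).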

DEMO

Opera White Space Characters

Case Study: Opera

MVS (Mongolian Vowel Separator), U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

Page 103: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

March 2009 copy 2009 Chris Weber

DEMO

wwwcasabasecuritycom

Opera White Space Characters

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

March 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

March 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

March 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

March 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Exploiting Unicode-enabled Software: Agenda

• Watcher
   – Web-app security testing and auditing

• Visual Spoofing Detection API
   – Providing guarantees against Visual Spoofing and Homograph attacks

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Problem
   – Unicode enables visual-spoofing-maximus

• Solution
   – Confusable detection
   – Invisibles detection
   – Syntax spoof detection
   – more

Tools: Visual Spoofing Detection API
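As a rough illustration of two of the listed checks (invisibles detection and a crude confusable signal), the sketch below uses only Python's `unicodedata` module. It is not the API described in the talk. The function names and sample strings are assumptions, and guessing a script from a character's name is a coarse heuristic, not a real confusables table.

```python
import unicodedata


def invisible_characters(label: str) -> list[str]:
    """List format-category (Cf) code points, which usually render as nothing."""
    return [f"U+{ord(c):04X}" for c in label if unicodedata.category(c) == "Cf"]


def scripts_used(label: str) -> set[str]:
    """Rough script guess: the first word of each letter's Unicode name."""
    return {
        unicodedata.name(c, "UNKNOWN").split()[0]
        for c in label
        if c.isalpha()
    }


spoofed = "p\u0430ypal"  # U+0430 is CYRILLIC SMALL LETTER A
hidden = "pay\u200bpal"  # U+200B is ZERO WIDTH SPACE

print(scripts_used(spoofed))
print(invisible_characters(hidden))
```

A label that mixes scripts or carries invisible characters is only a hint that something may be wrong. Legitimate text can trigger both checks, so a real detector needs more context than this.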

• Cross-platform component library written in C

• Can be applied in user-agents or any software
   – Browsers
   – Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share.

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.

Chris Weber
www.lookout.net
Casaba Security

<a href=[U+180E]onclick=alert()>

Case Study: Opera

DEMO

Opera White Space Characters

Page 111: Exploiting Unicode-enabled software - CanSecWest encodings categorization normalization binary properties case mapping conversion tables bi-directionalproperties canonical mappings

March 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

March 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Exploiting Unicode-enabled SoftwareAgenda

March 2009 copy 2009 Chris Weber

bull Watcher

ndash Web-app security testing and auditing

bull Visual Spoofing Detection API

ndash Providing guarantees against Visual Spoofing and Homograph attacks

wwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

March 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

March 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

March 2009 copy 2009 Chris Weber

bull Problemndash Unicode enables visual-spoofing-maximus

bull Solutionndash Confusable detection

ndash Invisibles detection

ndash Syntax spoof detection

ndash more

wwwcasabasecuritycom

ToolsVisual Spoofing Detection API

• Cross-platform component library written in C

• Can be applied in user-agents or any software
  - Browsers
  - Email clients

• Planned for release Fall 2009

• Email me with questions

Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you

Contact me with questions, new test cases, or ideas to share

Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API

Chris Weber
www.lookout.net
Casaba Security

www.casabasecurity.com
