Impact of scrambling on barcode entropy


Transcript of Impact of scrambling on barcode entropy

Page 1: Impact of scrambling on barcode entropy

IS&T NIP26 Conference, 23 September, 2010

Impact of Scrambling on Barcode Entropy

Marie Vans, Steven Simske, Margaret Sturgill, & Jason Aronoff
HP Laboratories, Fort Collins, CO, USA
23 September 2010

Page 2: Impact of scrambling on barcode entropy


Outline

– Introduction
– Entropy Measures
– Scrambling Techniques
– Tests
– Results
– Conclusions

Page 3: Impact of scrambling on barcode entropy


Introduction

– Barcodes are not just for ringing up sales anymore:
  • Connecting to websites
  • Consumer capture of content

– 1D vs. 2D/3D barcodes
  • Older 1D barcode standards are being replaced and/or augmented with 2D or 3D barcodes
  • High-density barcodes are used for additional data carrying or referencing

– ECC
  • Added for robustness to certain types of distortion and damage
  • The nature of ECC is derived from assumptions more relevant to 1D barcodes and general information theory
  • The use of ECC can therefore be questioned
  • This opens the door to using barcodes as information carriers outside of the current barcode standards

– Previous work
  • Effect of the print-scan (PS), or "copying," cycle
  • Localized damage such as water damage and/or puncturing
  • Blurring

S.J. Simske, M. Sturgill, and J.S. Aronoff, "Effect of Copying and Restoration on Color Barcode Payload Density," Proc. ACM DocEng, vol. 9, pp. 127-130, 2009.

Page 4: Impact of scrambling on barcode entropy


Some Background

– An attempt to highlight the differential effects of encryption methods on entropy by applying scrambling techniques to randomly generated strings with and without Error Correcting Codes (ECC)

– Major pieces:
  • Entropy
    − Increasing entropy reduces the likelihood of a fraudulent agent being able to "guess" correct barcodes
  • Scrambling
    − Four ways to mix up the barcode data
  • ECC
    − Reed-Solomon Error Correcting Codes

Page 5: Impact of scrambling on barcode entropy


Entropy Measures

Entropy is used as a measure of the effect of ECC and scrambling on 2D barcodes. Here, entropy represents signal randomness: how the bits are distributed in a signal.

Equation 1 - Normalized Entropy

$e_1 = -\sum_{i=1}^{N} \frac{E(x_i)}{E(X)} \log\frac{E(x_i)}{E(X)}$

[Figures: expected values vs. run lengths (1-12); normalized entropy for maximum-, low-, and minimum-entropy signals.]
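As a rough illustration of how a run-length-based normalized entropy in the p·log(p) form of Equation 1 can be computed, the sketch below counts run lengths in a bit string and accumulates the entropy over the run-length distribution. It assumes that E(x_i) is the observed count of run length i and E(X) is their sum; the function names, the run-length cap of 12, and the use of natural log are illustrative assumptions, not the exact implementation used in the study.

```python
import math
import random
from itertools import groupby

def run_length_counts(bits, max_run=12):
    """Count occurrences of each run length (1..max_run) of identical bits."""
    counts = [0] * max_run
    for _, group in groupby(bits):
        length = min(len(list(group)), max_run)  # runs longer than max_run share the last bin
        counts[length - 1] += 1
    return counts

def normalized_entropy(bits, max_run=12):
    """Run-length-based normalized entropy in the p*log(p) form of Equation 1."""
    counts = run_length_counts(bits, max_run)
    total = sum(counts)
    if total == 0:
        return 0.0
    e1 = 0.0
    for c in counts:
        p = c / total            # E(x_i) / E(X)
        if p > 0:
            e1 -= p * math.log(p)
    return e1

# Example: a highly structured string vs. a pseudo-random one
structured = "01" * 64
random_bits = "".join(random.choice("01") for _ in range(128))
print(normalized_entropy(structured), normalized_entropy(random_bits))
```

The structured string concentrates all its runs at length 1 and scores zero, while the pseudo-random string spreads its runs over several lengths and scores higher, which is the behavior the slide's "max/low/minimum entropy" comparison illustrates.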

Page 6: Impact of scrambling on barcode entropy


Entropy Measures - continued

Entropy based on Hamming distance. N refers to the maximum Hamming distance (HD) between two bytes, and x_i refers to the normalized HD of the actual strings. The HD is calculated on a moving window along a string in the forward direction.

Equation 2 - Hamming Distance Entropy

$e_2 = \sum_{i=1}^{N} (x_i + 0.1)\,\log\frac{1}{x_i + 0.1}$

[Figure: Hamming distance (HD) entropy for maximum-, low-, and minimum-entropy signals.]
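A minimal sketch of a windowed Hamming-distance entropy in the spirit of Equation 2 as reconstructed above: slide a window forward along the byte string, take the HD between adjacent bytes normalized by the maximum HD (8 bits), and accumulate the (x + 0.1)·log(1/(x + 0.1)) terms. The window of adjacent byte pairs, the placement of the 0.1 offset, and the helper names are assumptions for illustration only.

```python
import math
import os

def hamming_distance(a: int, b: int) -> int:
    """Bitwise Hamming distance between two bytes."""
    return bin(a ^ b).count("1")

def hd_window_entropy(data: bytes, max_hd: int = 8) -> float:
    """Windowed Hamming-distance entropy (Equation 2 form, illustrative only)."""
    e2 = 0.0
    for i in range(len(data) - 1):                             # forward-moving window
        x = hamming_distance(data[i], data[i + 1]) / max_hd    # normalized HD
        e2 += (x + 0.1) * math.log(1.0 / (x + 0.1))
    return e2

# Example: a structured payload vs. a pseudo-random payload
structured = bytes([0xAA] * 64)       # repeated alternating bit pattern, highly structured
random_payload = os.urandom(64)
print(hd_window_entropy(structured), hd_window_entropy(random_payload))
```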

Page 7: Impact of scrambling on barcode entropy


Scrambling Techniques

XOR:
• A randomly generated string of the same size as the entire string (message + ECC bits) is XOR'd with the input string (see the sketch after this list).

Structural scramble:
• Divide the string matrix into equal-sized structures (squares, rectangles, etc.). Swap bits within each structure so the new structure is a mirror image of the original.

Even check bits:
• Add a check bit at the end of each row & column so that the total number of black modules is even.

Odd check bits:
• Add a check bit at the end of each row & column so that the total number of black modules is odd.
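A minimal sketch of the first two scrambles under simple assumptions: the XOR scramble combines the payload with an equally long randomly generated key, and the structural scramble mirrors the bits inside each fixed-size block of the module matrix. The block size, data layout, seeding, and function names are illustrative, not the exact algorithms used in the study.

```python
import random

def xor_scramble(bits, seed=0):
    """XOR the input bit string with a randomly generated key of the same length."""
    rng = random.Random(seed)
    key = [rng.randint(0, 1) for _ in bits]
    return [b ^ k for b, k in zip(bits, key)]

def structural_scramble(matrix, block=4):
    """Mirror the bits within each block x block structure of the module matrix."""
    rows, cols = len(matrix), len(matrix[0])
    out = [row[:] for row in matrix]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            c_end = min(c0 + block, cols)
            for r in range(r0, min(r0 + block, rows)):
                for c in range(c0, c_end):
                    mc = c_end - 1 - (c - c0)   # horizontal mirror within the block
                    out[r][c] = matrix[r][mc]
    return out

# Example usage
bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(xor_scramble(bits, seed=42))
matrix = [[(r + c) % 2 for c in range(8)] for r in range(8)]
print(structural_scramble(matrix, block=4)[0])
```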

Page 8: Impact of scrambling on barcode entropy


Hypothesis

– "Challenging" the entropy of a string set with another random string:
  • Should result in different responses if the string is not as entropic as the challenge string

– When a random string is challenged, there should be no difference in entropy between the two randomly generated strings.

– If the string contains ECC, there could be a detectable difference in entropy between the string with ECC and the randomly generated challenge string.

[Diagram: (A) a random signal challenged with another random signal; (B) a random signal with ECC challenged with a random signal.]

Page 9: Impact of scrambling on barcode entropy


Experimental Set-up

• 28,000 individual barcodes generated using:
  • 500 randomly generated strings
  • Average length: 310 bits
  • Symbol sizes of 12x12 up to 26x26
  • Module sizes from 12 to 18 pixels

• Each test has an associated scrambling algorithm and entropy measure.

• Each test was run twice (a sketch of the test loop follows this list):
  • Using the maximum number of ECC bits allowable for the symbol size
  • Using randomly generated data where the ECC bits would normally be inserted

• A total of 672,000 barcodes were tested, half containing ECC bits and half completely random without ECC.
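A schematic of the test loop described above. The barcode generator, the scrambling functions, and the entropy measures are passed in as hypothetical stand-ins for the study's actual implementations; only the loop structure and the out/in ratio bookkeeping reflect the set-up on this slide.

```python
# Illustrative test harness; generate_barcode, scrambles, and measures are
# hypothetical stand-ins supplied by the caller.
from statistics import mean

def run_tests(strings, symbol_sizes, module_sizes, scrambles, measures, generate_barcode):
    results = {}  # (scramble name, measure name, with_ecc) -> list of out/in entropy ratios
    for with_ecc in (True, False):              # max ECC bits vs. random filler bits
        for s in strings:
            for symbol in symbol_sizes:
                for module in module_sizes:
                    barcode = generate_barcode(s, symbol, module, with_ecc)
                    for scramble in scrambles:
                        scrambled = scramble(barcode)
                        for measure in measures:
                            ratio = measure(scrambled) / measure(barcode)
                            key = (scramble.__name__, measure.__name__, with_ecc)
                            results.setdefault(key, []).append(ratio)
    # Mean output/input change per population; a value near 1.0 means little change.
    return {key: mean(vals) for key, vals in results.items()}
```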

Page 10: Impact of scrambling on barcode entropy


Results

– The result is the percent change in entropy between the input and output strings
  • Mean output entropy / mean input entropy
  • E.g., a result near 1.0 means there was very little change

– The non-ECC change was very small
  • Scrambling a fully random string should result in another random string

– The ECC entropy change increased
  • Scrambling a string containing non-random bits should result in a more random string

– Measured using normalized entropy with all scrambling techniques (a sketch of these population statistics follows the figure below)

[Figure: Normalized Entropy - ECC vs. NonECC; average % change out/in vs. symbol size (12x12 to 26x26).]
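The percent-change and population statistics reported on these results slides amount to simple descriptive statistics over the out/in entropy values. A minimal sketch follows; the helper name and the sample values in the usage line are arbitrary illustrations.

```python
from math import sqrt
from statistics import mean, stdev

def population_summary(input_entropies, output_entropies):
    """Mean out/in ratio plus mean and standard error of the output entropies."""
    ratios = [o / i for i, o in zip(input_entropies, output_entropies)]
    std_err = stdev(output_entropies) / sqrt(len(output_entropies))
    return {
        "mean_out_in_ratio": mean(ratios),      # near 1.0 => little change
        "mean_output": mean(output_entropies),
        "std_err_output": std_err,
    }

# Example with arbitrary sample values
print(population_summary([0.20, 0.22, 0.21], [0.20, 0.21, 0.21]))
```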

Page 11: Impact of scrambling on barcode entropy


Results

– Population statistics - normalized entropy
  • Normalized entropy (e1) means for ECC and non-ECC strings using the XOR scrambling algorithm
  • There is no way to distinguish between ECC and non-ECC strings by looking at the difference in input or output means alone.

[Figure: Normalized Entropy - Input & Output Means for ECC & NonECC; mean entropy vs. symbol size, with series for mean input and output entropy with and without ECC.]

Page 12: Impact of scrambling on barcode entropy


Results

– After scrambling, the entropy changes for both the ECC and the non-ECC strings; under this measure, lower e2 values correspond to greater randomness
– For most symbol sizes, e2 output values are lower than the input values
– ECC strings start out with more structure than the non-ECC strings and become more random after scrambling
– The change in entropy after scrambling non-ECC strings is also detectable

[Figures: HamDistWind Entropy Measure - ECC vs. NonECC, and XOR-HamDistWind; average % change out/in vs. symbol size (12x12 to 26x26), for ECC and non-ECC data.]

Page 13: Impact of scrambling on barcode entropy


Results

– The example shows the standard error of the output means using the XOR scrambling algorithm
  • Other scrambling algorithms show similar results
– Half of each error bar is shown, to indicate its magnitude
– The two populations overlap and cannot be distinguished with any reasonable level of statistical confidence
– The population statistics show that detecting the difference between ECC and non-ECC signals using population means is not easy with these methods

[Figures: HamDistWind Entropy - StdErr of Output, ECC vs. NonECC, vs. symbol size; Figure 13: XOR Scrambling - Output Mean, ECC vs. NonECC.]

Page 14: Impact of scrambling on barcode entropy


Conclusions

– Three entropy-based methods for determining the degree of randomness in a signal
– The effect of scrambling on the outcome of these methods
– The Data Matrix standard does not take this type of security into account
  • ECC within the signal has structure and is therefore vulnerable to attacks
– Our entropy measures and the appropriate "attack" can detect the difference between a truly random signal and a signal that contains structure
– Uses:
  • Discover whether ECC has been used and the potential vulnerabilities of the security data
  • The methods can be implemented to determine whether data is encrypted
  • It is possible to interrogate the entropy of the compromised signal and compare it to the original entropy values

Page 15: Impact of scrambling on barcode entropy


Q&A