
On the Fingerprinting Capacity Games for Arbitrary Alphabets and Their Asymptotics

Yen-Wei Huang and Pierre Moulin, Fellow, IEEE

Abstract—The fingerprinting capacity has recently been derived as the value of a two-person zero-sum game. In this work, we study the fingerprinting capacity games with $k$ pirates in a new collusion model called the mixed digit model, which is inspired by the combined digit model of Skoric et al. For small $k$, the capacities along with optimal strategies for both players of the game are obtained explicitly. For large $k$, we extend our earlier asymptotic analysis for the binary alphabet with the marking assumption to $q$-ary alphabets with this general model and show that the capacity is asymptotic to $A/(2k^2 \ln q)$, where the constant $A$ is specified as the maximin value of a functional game. Saddle-point solutions to the game are obtained using methods of variational calculus. For the special case of $q$-ary fingerprinting in the restricted digit model, we show that the interleaving attack is asymptotically optimal, a property that has motivated the design of optimized practical codes.

Index Terms—Fingerprinting, traitor tracing, collusion attacks, capacity, game theory, minimax analysis, asymptotic analysis.

I. INTRODUCTION

Digital fingerprinting has found applications to digital rights management (e.g., copyright protection for movies), document protection, and traitor tracing. Before distribution of copyrighted content to multiple recipients, a fingerprinting system embeds a unique identifier (fingerprint) into each copy, which can be extracted and helps trace unauthorized redistribution. A collusion attack is a powerful attack by a group of malicious users called pirates (or traitors, or colluders) who combine their copies to generate a new version that contains only weak traces of their fingerprints. The goal of a fingerprinting decoder is to identify the pirates even under severe collusion attacks.

The challenges of designing an effective fingerprinting system are manifold. First, it is difficult to construct a suitable model that describes the colluders' activities for different applications. Second, information-theoretic analysis has shown that in some models, the maximum achievable rate, or the capacity, decays quadratically with the size of the coalition $k$ [1]. (We formally define the rate and the capacity in Sec. II-C. Basically, the higher the rate or the capacity is, the shorter or the more effective the codes are.) Finally, recent constructions of capacity-achieving fingerprinting codes require exponential computational complexity [2], [3], while efficient designs fall short of achieving the capacity [4], [3], [5] or only achieve the capacity in specific models [6], [7], [8], [9].

This work was first presented at ISIT 2012.

Y.-W. Huang was with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA. He is now with Microsoft Corporation, Redmond, WA 98052 USA (e-mail: [email protected]).

P. Moulin is with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: [email protected]).

In [10], we studied the fingerprinting capacity games for binary alphabets under the so-called Boneh-Shaw marking assumption [11], which assumes that pirates may only "cut-and-paste" their copies. This assumption, however, ignores the possibility that pirates might perform signal processing attacks which occasionally remove or replace a fingerprint symbol. The combined digit model (CDM) recently proposed by Skoric et al. [12] is a significant generalization of the Boneh-Shaw marking assumption which successfully accounts for both types of attacks. However, analyzing the fingerprinting capacity games in the CDM is very difficult since it allows the fingerprint detector to "see" multiple symbols [13]. In attempting to bring the theory of fingerprinting one step closer to practical scenarios, we propose a new model called the mixed digit model (MDM) and study the fingerprinting capacity games in the MDM in this work.

It was shown in [10] that the fingerprinting capacity is the value of a two-person zero-sum game which admits a saddle-point solution. This property applies to the MDM and the game can be solved numerically when the coalition size $k$ is small. In Sec. III of this paper we show that the pirates can perform a stronger attack in the MDM, and the capacity in the MDM is about 20-30% less than that under the marking assumption. This suggests that the marking assumption is indeed too optimistic in addressing the colluders' capability of manipulating their fingerprint copies. The results also shed light on the structure of capacity-achieving codes, and on how the colluders can maximally utilize their copies to avoid detection. We then analyze the asymptotics of the game when the coalition size $k$ is large, generalizing our earlier asymptotic results for binary alphabets under the marking assumption [10]. Assuming a certain regularity condition, we show that the capacity for a $q$-ary alphabet against a size-$k$ coalition still decays quadratically with $k$ in the MDM and the constant in front is the value of a $q$-dimensional continuous-kernel game. Using methods of variational calculus, a set of Euler-Lagrange differential equations, Lagrange multipliers, and boundary conditions can be specified as necessary and sufficient conditions. We derive the following results:

1) Binary alphabet with the marking assumption: We provide a simpler proof for the results in [10]. The asymptotic capacity is $(2k^2 \ln 2)^{-1}$, the optimal embedding distribution is the arcsine distribution, and the optimal attack is the interleaving attack.

2) Binary alphabet with the MDM: The asymptotic capacity is strictly smaller than $(2k^2 \ln 2)^{-1}$ and the gap depends on the parameters of the MDM. Somewhat surprisingly, the optimal embedding is still the arcsine distribution, but the optimal attack is no longer the interleaving attack.

3) Arbitrary alphabet with the marking assumption: The asymptotic capacity is $(q-1)/(2k^2 \ln q)$ (which was first shown in [14]), the optimal embedding distribution belongs to the family of Dirichlet distributions, and the optimal attack is the interleaving attack.

The paper is structured as follows. Sec. II introduces the fingerprinting model and briefly reviews the results in [3], [10]. We show analytical and numerical solutions to the small-coalition fingerprinting capacity games in Sec. III, and those to large-coalition games in Sec. IV.

Notation. We use capital letters to represent random variables, and lowercase letters to represent their realizations. Boldface letters denote vectors and matrices, and calligraphic letters denote sets. For example, $\mathbf{X} \in \mathcal{X}^n$ denotes a random vector $(X_1, \ldots, X_n)$, with each $X_i$ taking values in $\mathcal{X}$. The transpose of a matrix $\mathbf{A}$ is denoted by $\mathbf{A}'$. The base-$q$ entropy of a discrete random variable $X \in \mathcal{X}$ with distribution $p_X$ is denoted by $H(X) \triangleq -\sum_x p_X(x) \log_q p_X(x)$, and the conditional entropy of $X$ given $Y$ is denoted by $H(X|Y)$. The base-$q$ mutual information of $X$ and $Y$ is denoted by $I(X;Y) = H(X) - H(X|Y)$. The base-$q$ conditional mutual information between $X$ and $Y$ given $Z$ is $I(X;Y|Z) = \sum_z p_Z(z) I(X;Y|Z=z)$. The base-$q$ Kullback-Leibler divergence between two probability vectors $\mathbf{p}$ and $\mathbf{r}$ is denoted by $D_q(\mathbf{p} \,\|\, \mathbf{r}) \triangleq \sum_i p_i \log_q \frac{p_i}{r_i}$. The indicator function $\delta_{ab}$ takes value one if $a = b$ and zero otherwise. The $p$-norm of a vector $\mathbf{v}$ is denoted by $\|\mathbf{v}\|_p \triangleq \left(\sum_i |v_i|^p\right)^{1/p}$, and the vector inequality $\mathbf{a} \ge \mathbf{b}$ denotes element-wise inequality. The shorthand $f \sim g$ denotes the asymptotic equality $\lim_{k\to\infty} [f(k)/g(k)] = 1$.
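As a small illustration of the notation above, the base-$q$ entropy, the base-$q$ Kullback-Leibler divergence, and the $p$-norm can be evaluated as follows. This is only a sketch: the function names `entropy_q`, `kl_divergence_q`, and `p_norm` are hypothetical and do not come from the paper, and the conventions $0 \log 0 = 0$ and $D_q = \infty$ when $r_i = 0 < p_i$ are the usual ones, assumed here.

```python
import math

def entropy_q(p, q):
    """Base-q entropy H(X) = -sum_x p(x) log_q p(x); zero-probability terms contribute 0."""
    return -sum(px * math.log(px, q) for px in p if px > 0)

def kl_divergence_q(p, r, q):
    """Base-q Kullback-Leibler divergence D_q(p || r) = sum_i p_i log_q(p_i / r_i)."""
    total = 0.0
    for pi, ri in zip(p, r):
        if pi > 0:
            if ri == 0:
                return math.inf  # assumed convention when r_i = 0 < p_i
            total += pi * math.log(pi / ri, q)
    return total

def p_norm(v, p):
    """p-norm ||v||_p = (sum_i |v_i|^p)^(1/p)."""
    return sum(abs(vi) ** p for vi in v) ** (1.0 / p)
```

With base $q$ equal to the alphabet size, the uniform distribution on $q$ symbols has entropy one, which is why the single-pirate capacity under the marking assumption in Sec. III-A is normalized to one.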

II. FINGERPRINTING CAPACITY GAMES

A. Fingerprinting System

Let $\mathcal{X} = \mathcal{Q} = \{0, 1, \ldots, q-1\}$ denote a size-$q$ alphabet, $\mathcal{M} = \{1, \ldots, M\}$ the user index set, and $\mathcal{W}_n$ a set of secret key values. An $(n, M)$ randomized fingerprinting code $C$ over $\mathcal{X}$, or fingerprinting ensemble, consists of an encoder-decoder pair $(e_n, d_n)$. The encoder

$$e_n : \mathcal{M} \times \mathcal{W}_n \to \mathcal{X}^n$$

assigns user $m$ a length-$n$ fingerprint $\mathbf{X}_m = e_n(m, W_n)$, where $W_n$ is the random key. The mapping $e_n$ is known to the public but the realization of $W_n$ is kept secret from the pirates.

A coalition is formed by a subset of $k$ users who are also called colluders. Without loss of generality, we denote the set of colluder indices by $\mathcal{K} = \{1, \ldots, k\}$. The colluders observe fingerprint sequences $\mathbf{X}_{\mathcal{K}} = \{\mathbf{X}_1, \ldots, \mathbf{X}_k\}$ and use them to produce a forgery $\mathbf{Y} \in \mathcal{Y}^n$. The process is modeled by passing $\mathbf{X}_{\mathcal{K}}$ through a collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$, which is a conditional probability distribution on $\mathbf{Y}$ given $\mathbf{X}_{\mathcal{K}}$.

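[Figure 1 (block diagram): the user indices $1, 2, \ldots, M$ enter the encoder $e_n$, which outputs the fingerprinted copies $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_M$; $k$ of them are chosen as the coalition copies $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_k$ and passed through the collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$ to produce the forgery $\mathbf{Y}$; the decoder $d_n$, given the secret key $(\{\mathbf{X}_m\}_{m=1}^M, W_n)$, outputs the accused user indices.]

Fig. 1. Fingerprinting system model: encoder, collusion channel, and decoder.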

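Based upon the received forgery $\mathbf{Y}$, all fingerprint sequences $\{\mathbf{X}_m\}_{m=1}^M$, and the secret key $W_n$, the decoder

$$d_n : \mathcal{Y}^n \times \mathcal{X}^{n \times M} \times \mathcal{W}_n \to 2^{\mathcal{M}}, \qquad \hat{\mathcal{K}} = d_n\left(\mathbf{Y}, \{\mathbf{X}_m\}_{m=1}^M, W_n\right)$$

accuses a subset of users $\hat{\mathcal{K}}$. Two error criteria are often used: under the detect-all criterion, an error occurs when $\hat{\mathcal{K}}$ is not equal to $\mathcal{K}$, and under the detect-one criterion, an error occurs when $\hat{\mathcal{K}}$ is empty or contains some innocent user $m \notin \mathcal{K}$. Our complete model for the fingerprinting system is shown in Fig. 1.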

B. Collusion Channel

The set of admissible collusion channels determines the coalition's ability to manipulate the fingerprint sequences available to them. The Boneh-Shaw marking assumption and its nonbinary alphabet variants [11], [15] are the most commonly adopted models in the literature. Unfortunately, these models fail to completely capture plausible colluders' actions. Skoric et al. [12] recently proposed a more general setup called the combined digit model. However, the CDM combines the collusion and the decoding process in a way that is very hard to analyze. Here we introduce the mixed digit model, which makes analysis possible (especially the asymptotic analysis in Sec. IV) but retains the spirit of the CDM. We show how all variants of the marking assumption fit into our new model and how parameters can be determined based on applications.

We consider a memoryless collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$, i.e.,

$$p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}(\mathbf{y}|\mathbf{x}_{\mathcal{K}}) = \prod_{i=1}^{n} p_{Y|X_{\mathcal{K}}}(y_i | x_{\mathcal{K},i}),$$

which implies that for each coordinate $i$ the forgery symbol $y_i$ depends only on $x_{1,i}, \ldots, x_{k,i}$. Therefore the collusion can be characterized by the single-letter channel $p_{Y|X_{\mathcal{K}}}$, and we will drop the coordinate index $i$ in the following analysis. We also assume the channel to be user-symmetric, which means that the forgery depends only on the sequences available to the coalition, and not on which pirates receive them.¹ Define the count vector $\mathbf{Z} = [Z_0, Z_1, \ldots, Z_{q-1}]' \in \{0, \ldots, k\}^q$ with

$$Z_x = \sum_{m=1}^{k} \delta_{X_m x} \in \{0, \ldots, k\}, \quad x \in \mathcal{X},$$

which indicates how many times the $k$ colluders see symbol $x$ at a given coordinate. Let $\mathcal{Z} \triangleq \{\mathbf{z} \in \{0, \ldots, k\}^q : \sum_{x=0}^{q-1} z_x = k\}$. Let $\mathcal{Y} = \mathcal{Q} \cup \{e\}$, where 'e' denotes an erasure. Based on the user-symmetry assumption, $\mathbf{Z}$ is a sufficient statistic for generating $Y$, and the colluders' strategy can be modeled by the $(q+1) \times |\mathcal{Z}|$ matrix $\Theta \triangleq \{\boldsymbol{\theta}(\mathbf{z})\}_{\mathbf{z} \in \mathcal{Z}}$ where

$$\theta_y(\mathbf{z}) \triangleq p_{Y|\mathbf{Z}}(y|\mathbf{z}), \quad y \in \mathcal{Y},\ \mathbf{z} \in \mathcal{Z} \tag{1}$$

is the conditional probability of selecting forgery symbol $y$ given the count vector $\mathbf{z}$. The MDM admits two parameter vectors: $\mathbf{r}_u = [r_{u,1}, \ldots, r_{u,q-1}]'$ and $\mathbf{r}_e = [r_{e,1}, \ldots, r_{e,q}]'$. The value $r_{u,N}$ specifies an upper bound on the probability of pirates generating an unseen symbol given they have seen $N$ distinct symbols, and the value $r_{e,N}$ specifies an upper bound on the probability of pirates generating the erasure given they have seen $N$ distinct symbols. Mathematically, a collusion channel $\Theta$ is admissible if and only if it satisfies the linear constraints

$$\sum_{y \in \mathcal{Q},\, y \notin X_{\mathcal{K}}} \theta_y(\mathbf{z}) \le r_{u,N(\mathbf{z})} \tag{2a}$$

$$\theta_e(\mathbf{z}) \le r_{e,N(\mathbf{z})} \tag{2b}$$

for each $\mathbf{z} \in \mathcal{Z}$, where $N(\mathbf{z}) \in \{0, \ldots, k\}$ denotes the number of distinct symbols seen by the pirates.

¹ The memoryless and user-symmetric constraints may seem restrictive, but as studied in [3] and [10], they can be relaxed without affecting the capacity.
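The definitions of the count-vector set $\mathcal{Z}$ and the admissibility constraints (2a)-(2b) can be made concrete with a short sketch. The representation below is an assumption made for illustration only: a collusion channel is stored as a dictionary from count vectors to lists of $q+1$ probabilities (erasure last), and the helper names `count_vectors`, `is_admissible`, and `interleaving` are hypothetical.

```python
from itertools import product

def count_vectors(k, q):
    """Enumerate the set Z: all z in {0,...,k}^q whose entries sum to k."""
    return [z for z in product(range(k + 1), repeat=q) if sum(z) == k]

def is_admissible(theta, r_u, r_e, k, q, tol=1e-12):
    """Check constraints (2a)-(2b) for a channel theta.

    theta maps each z in Z to [theta_0, ..., theta_{q-1}, theta_e].
    r_u[N-1] and r_e[N-1] are the bounds when N distinct symbols are seen.
    """
    for z in count_vectors(k, q):
        probs = theta[z]
        if abs(sum(probs) - 1.0) > tol or min(probs) < -tol:
            return False  # not a probability vector
        n_seen = sum(1 for zx in z if zx > 0)
        unseen_mass = sum(probs[y] for y in range(q) if z[y] == 0)
        # r_u has only q-1 entries: when all q symbols are seen, nothing is unseen.
        if n_seen < q and unseen_mass > r_u[n_seen - 1] + tol:
            return False
        if probs[q] > r_e[n_seen - 1] + tol:
            return False
    return True

def interleaving(k, q):
    """Interleaving attack: output symbol y with probability z_y / k, never erase."""
    return {z: [zx / k for zx in z] + [0.0] for z in count_vectors(k, q)}
```

For example, `is_admissible(interleaving(3, 3), [0, 0], [0, 0, 0], 3, 3)` checks the interleaving attack against all-zero parameter vectors, under which neither unseen symbols nor erasures are permitted.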

In the CDM, pirates are allowed to perform "signal processing attacks" so the decoder can "see" none or multiple symbols, while in the MDM pirates are allowed to use only one of the symbols or the erasure. The two models are analogous in the following scenarios:

1) The decoder not "seeing" any symbol in the CDM is analogous to pirates using the erasure in the MDM.

2) The decoder "seeing" one of the symbols pirates have in the CDM is analogous to pirates using that symbol in the MDM.

3) The decoder "seeing" any number of the symbols pirates don't have in the CDM is analogous to pirates using an unseen symbol in the MDM.

The scenario with the decoder "seeing" multiple symbols in the CDM, however, does not have an analogous scenario in the MDM. Nonetheless, by the common "the enemy knows the system" principle in cryptography, we should assume that pirates know the decoder and thus can prevent this scenario from ever happening, i.e., they can perform a stronger attack by using one of the symbols instead. Therefore we claim that the MDM does not strengthen or weaken the pirates' ability to manipulate the fingerprint sequences in comparison to the CDM. Similar to the CDM, the MDM is a fairly general model, as it reduces to the following four variants of the marking assumption in the literature [15], [12], [13] with different choices of $\mathbf{r}_u$ and $\mathbf{r}_e$:

• Restricted digit model (RDM) only allows the colluders to "mix and match" their copies of the content, i.e., $Y$ can only be one of the symbols in $X_{\mathcal{K}}$.

• Unreadable digit model (UDM) allows slightly stronger attacks. Besides the symbols they have, the colluders can also generate an erasure when $N(\mathbf{z}) > 1$.

TABLE I
MDM PARAMETERS FOR VARIANTS OF THE MARKING ASSUMPTION

Model   Forgery alphabet   r_u'                     r_e'
RDM     Q                  [0, 0, ..., 0]           [0, 0, ..., 0]
UDM     Q ∪ {e}            [0, 0, ..., 0]           [0, 1, ..., 1]
ADM     Q                  [0, 1, ..., 1]           [0, 0, ..., 0]
GDM     Q ∪ {e}            [0, 1, ..., 1]           [0, 1, ..., 1]
MDM1    Q ∪ {e}            [.01, .01, ..., .01]     [.20, .65, .84, .92, .92, ...]
MDM2    Q ∪ {e}            [.05, .05, ..., .05]     [.06, .40, .57, .72, .75, ...]
MDM3    Q ∪ {e}            [.10, .10, ..., .10]     [.03, .25, .44, .54, .63, ...]

Note: The MDM is not the same as the CDM [12], but the tradeoff between r_u and r_e is similar to that of r and ψ in [12].

• Arbitrary digit model (ADM) does not allow erasures, but $Y$ can be any symbol in $\mathcal{Q}$ when $N(\mathbf{z}) > 1$. In Sec. III we show that the ADM allows stronger attacks than the UDM.

• General digit model (GDM) allows $Y$ to be any symbol in $\mathcal{Q}$ or an erasure when $N(\mathbf{z}) > 1$. This allows the strongest attacks.

Table I summarizes the parameters corresponding to different models. It also shows realistic values of parameters based on simulations by Skoric et al. [12]. Collusion classes, namely the classes of feasible $\Theta$, are denoted by $\mathscr{P}_{\mathrm{RDM}}$, $\mathscr{P}_{\mathrm{UDM}}$, etc. for specific models, and by $\mathscr{P}_c$ in general.

C. Fingerprinting Capacity

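Given a fingerprinting code $C$ and a class $\mathscr{P}_c$ of collusion channels, the worst-case error probabilities under the detect-all and the detect-one criteria for a size-$k$ coalition are defined by²

$$P_{e,k}(C, \mathscr{P}_c) = \max_{\mathcal{K} \subseteq \mathcal{M},\, |\mathcal{K}| \le k} \ \sup_{\Theta \in \mathscr{P}_c} \Pr(E) \tag{3}$$

where the error event $E$ is given by $E^{\mathrm{all}} = \{\hat{\mathcal{K}} \ne \mathcal{K}\}$ and $E^{\mathrm{one}} = \{\hat{\mathcal{K}} \setminus \mathcal{K} \ne \emptyset\} \cup \{\hat{\mathcal{K}} \cap \mathcal{K} = \emptyset\}$ for $P^{\mathrm{all}}_{e,k}(C, \mathscr{P}_c)$ and $P^{\mathrm{one}}_{e,k}(C, \mathscr{P}_c)$, respectively. The rate of a code $C$ is defined by $n^{-1} \log_q M$, and a rate $R$ is deemed achievable if there exists a sequence of rate-$R$ fingerprinting codes with vanishing worst-case error probability. The fingerprinting capacities $C^{\mathrm{one}}_{k,q}(\mathscr{P}_c)$ and $C^{\mathrm{all}}_{k,q}(\mathscr{P}_c)$ are defined as the suprema of all achievable rates with respect to the detect-one and detect-all criteria, respectively.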

² A maximizer of (3) exists since $\mathscr{P}_c$ defined in Sec. II-B is compact.

As studied in [3], [10], the following two-phase fingerprinting construction and joint/simple decoding scheme approaches or achieves the capacity:

1) Encoding scheme: Let $P_{\mathbf{W}}$ be a probability distribution on the $(q-1)$-dimensional simplex $\mathcal{W} \triangleq \{\mathbf{w} \in \mathbb{R}^q : \|\mathbf{w}\|_1 = 1, \mathbf{w} \ge 0\}$. A sequence of auxiliary "time-sharing" random variables $\{\mathbf{W}_i\}_{i=1}^n$ is drawn independently and identically from the distribution $P_{\mathbf{W}}$. For each $i \in \{1, \ldots, n\}$, $\{X_{m,i}\}_{m=1}^M$ are $M$ independent and identically distributed random variables constructed from a categorical distribution given parameter $\mathbf{W}_i = \mathbf{w}$, i.e.,

$$\Pr(X_{1,i} = x_1, \ldots, X_{M,i} = x_M \mid \mathbf{W}_i = \mathbf{w}) = \prod_{m=1}^{M} w_{x_m} \tag{4}$$

for $x_1, \ldots, x_M \in \mathcal{X}$. The class of all probability distributions on $\mathcal{W}$ is denoted by $\mathscr{P}_{\mathbf{W}}$, and $\mathscr{P}_e$ denotes a compact subset of $\mathscr{P}_{\mathbf{W}}$ to be specified.

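2) Decoding scheme: Two decoding schemes are proposed in [3]. The simple decoder evaluates the so-called empirical conditional mutual information $I(\mathbf{x}_m; \mathbf{y}|\mathbf{w})$ for each user $m$. A threshold $\eta_{\mathrm{simple}}$ is chosen and user $m$ is accused if and only if $I(\mathbf{x}_m; \mathbf{y}|\mathbf{w}) > \eta_{\mathrm{simple}}$. If $I(\mathbf{x}_m; \mathbf{y}|\mathbf{w}) \le \eta_{\mathrm{simple}}$ for all $m \in \mathcal{M}$, then $\hat{\mathcal{K}} = \emptyset$. The joint decoder evaluates the following score for each candidate coalition $\mathcal{A} \subseteq \mathcal{M}$:

$$S(\mathcal{A}) = \begin{cases} 0, & \text{if } \mathcal{A} = \emptyset \\ I(\mathbf{x}_{\mathcal{A}}; \mathbf{y}|\mathbf{w}) - |\mathcal{A}|\,\eta_{\mathrm{joint}}, & \text{otherwise} \end{cases}$$

where $\eta_{\mathrm{joint}}$ is a threshold. The set $\mathcal{A}$ that has the largest score is then accused. With the parameters $\eta_{\mathrm{simple}}$ and $\eta_{\mathrm{joint}}$, both decoders allow us to tune the tradeoffs between false positive and false negative error probabilities.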
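The encoding scheme in item 1 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction of [3]: the embedding distribution $P_{\mathbf{W}}$ is taken to be a symmetric Dirichlet distribution with parameter `alpha` (the value 1/2 matches the asymptotically optimal density $\prod_x w_x^{-1/2}$ of Theorem 4, but is not claimed to be optimal for finite $k$), and the names `sample_dirichlet` and `generate_fingerprints` are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex from Dirichlet(alpha) via normalized Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

def generate_fingerprints(n, M, q, rng, alpha=0.5):
    """Two-phase construction: draw W_i i.i.d. from P_W, then X_{m,i} i.i.d. categorical(W_i).

    Returns (w_seq, X) where w_seq[i] is the time-sharing vector at coordinate i
    and X[m][i] in {0, ..., q-1} is symbol i of user m's fingerprint, as in (4).
    """
    w_seq = [sample_dirichlet([alpha] * q, rng) for _ in range(n)]
    X = [
        [rng.choices(range(q), weights=w_seq[i])[0] for i in range(n)]
        for _ in range(M)
    ]
    return w_seq, X
```

A call such as `generate_fingerprints(5, 4, 3, random.Random(0))` returns five time-sharing vectors and four length-5 ternary fingerprints; in the model of Sec. II-A the sequence `w_seq` plays the role of the secret key.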

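Define the joint and simple capacities $C^{\mathrm{joint}}_{k,q}(\mathscr{P}_e, \mathscr{P}_c)$ and $C^{\mathrm{simple}}_{k,q}(\mathscr{P}_e, \mathscr{P}_c)$ respectively by

$$C^{\mathrm{joint}}_{k,q}(\mathscr{P}_e, \mathscr{P}_c) = \sup_{P_{\mathbf{W}} \in \mathscr{P}_e} \ \inf_{\Theta \in \mathscr{P}_c} \frac{1}{k} I(\mathbf{Z}; Y | \mathbf{W}) \tag{5}$$

and

$$C^{\mathrm{simple}}_{k,q}(\mathscr{P}_e, \mathscr{P}_c) = \sup_{P_{\mathbf{W}} \in \mathscr{P}_e} \ \inf_{\Theta \in \mathscr{P}_c} I(X_1; Y | \mathbf{W}). \tag{6}$$

The expressions in (5) and (6) can be unified by

$$C_{k,q}(\mathscr{P}_e, \mathscr{P}_c) = \sup_{P_{\mathbf{W}} \in \mathscr{P}_e} \ \inf_{\Theta \in \mathscr{P}_c} \mathbb{E}\left[I_{k,q}(\mathbf{W}, \Theta)\right] \tag{7}$$

if we let

$$I^{\mathrm{joint}}_{k,q}(\mathbf{w}, \Theta) \triangleq \frac{1}{k} I(\mathbf{Z}; Y | \mathbf{W} = \mathbf{w}) \tag{8}$$

and

$$I^{\mathrm{simple}}_{k,q}(\mathbf{w}, \Theta) \triangleq I(X_1; Y | \mathbf{W} = \mathbf{w}). \tag{9}$$

The expectation $\mathbb{E}$ is taken with respect to $P_{\mathbf{W}}$. Also, if $\mathscr{P}_e = \mathscr{P}_{\mathbf{W}}$, then we drop the first argument $\mathscr{P}_e$ of (5)-(7) for notational simplicity.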
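For small $k$ and $q$, the payoff functions (8) and (9) can be evaluated by brute-force enumeration of the count vectors. The sketch below assumes that, given $\mathbf{W} = \mathbf{w}$, the count vector $\mathbf{Z}$ is multinomial with parameters $(k, \mathbf{w})$, which follows from the i.i.d. construction (4); the dictionary representation of $\Theta$ and all function names are hypothetical choices made for illustration.

```python
import math
from itertools import product

def count_vectors(k, q):
    """All z in {0,...,k}^q with sum(z) == k."""
    return [z for z in product(range(k + 1), repeat=q) if sum(z) == k]

def multinomial_pmf(z, w):
    """Pr(Z = z | W = w) for k = sum(z) i.i.d. categorical(w) draws."""
    coef = math.factorial(sum(z))
    for zx in z:
        coef //= math.factorial(zx)
    p = float(coef)
    for zx, wx in zip(z, w):
        p *= wx ** zx
    return p

def entropy_q(p, q):
    """Base-q entropy with the convention 0 log 0 = 0."""
    return -sum(px * math.log(px, q) for px in p if px > 0)

def interleaving(k, q):
    """Interleaving attack: theta_y(z) = z_y / k, erasure probability 0."""
    return {z: [zx / k for zx in z] + [0.0] for z in count_vectors(k, q)}

def joint_payoff(w, theta, k, q):
    """Equation (8): (1/k) I(Z; Y | W = w) = (1/k) [H(Y) - H(Y | Z)]."""
    zs = count_vectors(k, q)
    p_z = {z: multinomial_pmf(z, w) for z in zs}
    p_y = [sum(p_z[z] * theta[z][y] for z in zs) for y in range(q + 1)]
    h_y_given_z = sum(p_z[z] * entropy_q(theta[z], q) for z in zs)
    return (entropy_q(p_y, q) - h_y_given_z) / k

def simple_payoff(w, theta, k, q):
    """Equation (9): I(X_1; Y | W = w) = H(Y) - H(Y | X_1)."""
    zs = count_vectors(k, q)
    p_y = [sum(multinomial_pmf(z, w) * theta[z][y] for z in zs) for y in range(q + 1)]
    h_y_given_x = 0.0
    for x in range(q):
        # Given X_1 = x, the other k-1 colluders contribute a multinomial(k-1, w) count.
        p_y_x = [0.0] * (q + 1)
        for rest in count_vectors(k - 1, q):
            z = tuple(c + (1 if i == x else 0) for i, c in enumerate(rest))
            pr = multinomial_pmf(rest, w)
            for y in range(q + 1):
                p_y_x[y] += pr * theta[z][y]
        h_y_given_x += w[x] * entropy_q(p_y_x, q)
    return entropy_q(p_y, q) - h_y_given_x
```

Evaluating both functions at the uniform $\mathbf{w}$ with the interleaving attack for $k = 2$ gives the values that Sec. III-B reports in closed form for the RDM.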

Theorem 1: [3], [10]

(i) Maximizer and minimizer for (7) exist, i.e., we can replace "sup-inf" of (7) by "max-min."

(ii) With embedding distribution $P_{\mathbf{W}}$ and the joint (resp. simple) decoding scheme, any rate below $C^{\mathrm{joint}}_{k,q}(\{P_{\mathbf{W}}\}, \mathscr{P}_c)$ (resp. $C^{\mathrm{simple}}_{k,q}(\{P_{\mathbf{W}}\}, \mathscr{P}_c)$) is achievable.

(iii) We have

$$C^{\mathrm{all}}_{k,q}(\mathscr{P}_c) = C^{\mathrm{one}}_{k,q}(\mathscr{P}_c) = C^{\mathrm{joint}}_{k,q}(\mathscr{P}_c) \ge C^{\mathrm{simple}}_{k,q}(\mathscr{P}_c),$$

i.e., the joint decoding scheme achieves the capacity while the simple decoding scheme is generally suboptimal (the inequality is in general strict).

(iv) Both the payoff functions $I^{\mathrm{joint}}_{k,q}$ and $I^{\mathrm{simple}}_{k,q}$ are linear in $\mathbf{w}$ and convex in $\Theta$. The maximin game of (7) admits a saddle-point solution and the maximin value equals its minimax value, i.e.,

$$C_{k,q}(\mathscr{P}_c) = \min_{\Theta \in \mathscr{P}_c} \ \max_{\mathbf{w} \in \mathcal{W}} I_{k,q}(\mathbf{w}, \Theta). \tag{10}$$

The maximizing and minimizing strategies are denoted by $P^{(k,q)}_{\mathbf{W}}$ and $\Theta^{(k,q)}$ respectively.

It is worth mentioning that although the simple decoding scheme fails to achieve the capacity, its computational complexity is much lower than that of the joint decoding scheme. In Sec. IV we show that the gap between $C^{\mathrm{joint}}_{k,q}(\mathscr{P}_c)$ and $C^{\mathrm{simple}}_{k,q}(\mathscr{P}_c)$ is actually negligible for large $k$ under a regularity condition.

III. FIGHTING SMALL COALITIONS

Owing to the existence of the saddle-point solutions, the joint and simple fingerprinting capacity games can be solved either analytically or numerically when $k$ is small. The algorithm of [10] for the binary alphabet case can be generalized to the $q$-ary alphabet with the mixed digit model. In addition to the user-symmetry constraint discussed in Sec. II, we impose symbol-symmetry on both the embedding distribution $P_{\mathbf{W}}$ and the collusion channel $\Theta$. This constraint forces both players to treat each symbol equally and greatly simplifies the analysis. The optimality of symbol-symmetric strategies is discussed in [10].

A. One Pirate

For $k = 1$, the joint and simple capacity games are the same since the payoff functions (8) and (9) are the same. The maximin game of (7) reduces to a compound channel capacity game [16], or to a watermarking game [17]. For the four variants of the marking assumption, the collusion class is trivial because the single pirate cannot modify her/his copy at all. Hence capacity is one, and that value is achieved in (5) and (6) by the uniform embedding distribution $W^{(k,q)}_x \equiv 1/q$ for all $x \in \mathcal{X}$. For the MDM, the optimal embedding distribution is the same. Upon receiving $\mathbf{z} = [1, 0, \cdots, 0]'$, the colluder utilizes strategy

$$\boldsymbol{\theta}(\mathbf{z}) = \left[1 - p_u - p_e,\ \frac{p_u}{q-1},\ \cdots,\ \frac{p_u}{q-1},\ p_e\right]'$$

(due to symbol-symmetry), i.e., he/she generates the erasure with probability $p_e$ and all unseen symbols equally likely with a total crossover probability of $p_u$. The optimal strategy is $p^{(k,q)}_e = r_{e,1}$ and $p^{(k,q)}_u = \min\left\{\frac{q-1}{q}(1 - r_{e,1}),\ r_{u,1}\right\}$. Capacity is given by

$$C^{\mathrm{joint}}_{1,q}(\mathscr{P}_{\mathrm{MDM}}) = C^{\mathrm{simple}}_{1,q}(\mathscr{P}_{\mathrm{MDM}}) = \left\{(1 - r_{e,1})\left[1 - h_q\!\left(\frac{r_{u,1}}{1 - r_{e,1}}\right)\right] - r_{u,1} \log_q(q-1)\right\}^+$$

where $h_q(p) \triangleq -p \log_q p - (1-p) \log_q(1-p)$ and $(x)^+ \triangleq \max(x, 0)$. The capacities using parameters in [12, Fig. 2] are shown in Fig. 2. We see that the loss in capacity from the most restrictive case (marking assumption) is about 20-30%.
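The single-pirate capacity can be evaluated numerically as a sanity check. The sketch below plugs the optimal attack parameters $p_e = r_{e,1}$ and $p_u = \min\{\frac{q-1}{q}(1 - r_{e,1}), r_{u,1}\}$ stated above into the mutual information of a $q$-ary channel with symmetric crossover and erasure under uniform input; the function names are hypothetical, and the parameter values in the usage note are the first entries of the MDM1 row of Table I.

```python
import math

def h_q(p, q):
    """h_q(p) = -p log_q p - (1-p) log_q (1-p), with h_q(0) = h_q(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p, q) + (1 - p) * math.log(1 - p, q))

def one_pirate_capacity(q, r_e1, r_u1):
    """Capacity for k = 1 in the MDM, evaluated at the optimal attack (p_e, p_u)."""
    if r_e1 >= 1.0:
        return 0.0  # the pirate may always erase
    p_e = r_e1
    p_u = min((q - 1) / q * (1 - r_e1), r_u1)
    value = (1 - p_e) * (1 - h_q(p_u / (1 - p_e), q)) - p_u * math.log(q - 1, q)
    return max(value, 0.0)  # the (x)^+ operation
```

For instance, `one_pirate_capacity(2, 0.20, 0.01)` is roughly 0.72, i.e. a loss of a little under 30% relative to the marking-assumption value of one.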

[Figure 2 (plot): capacity versus $r_{e,1}$; horizontal axis $r_{e,1}$ from 0 to 0.2, vertical axis from 0.5 to 0.9, one curve each for $q = 2, 3, 4, 5$.]

Fig. 2. The fingerprinting capacities $C_{1,q}(\mathscr{P}_{\mathrm{MDM}})$. Note that $r_{u,1}$ implicitly decreases as $r_{e,1}$ increases, as shown in Table I.

B. Two Pirates

By studying the simplest case involving collusion attacks, we can see how different models allow different strengths of attacks. By symbol-symmetry there are only two types of patterns in $\mathcal{Z}$, namely $\mathbf{z}_1 = [2, 0, \ldots, 0]'$ and $\mathbf{z}_2 = [1, 1, 0, \ldots, 0]'$. Similar to the analysis in Sec. III-A, the crossover probabilities $p_u$ and the erasure probabilities $p_e$ determine the collusion attack. More specifically, we let

$$\boldsymbol{\theta}(\mathbf{z}_1) = \left[1 - p_{u,1} - p_{e,1},\ \frac{p_{u,1}}{q-1},\ \cdots,\ \frac{p_{u,1}}{q-1},\ p_{e,1}\right]'$$

and

$$\boldsymbol{\theta}(\mathbf{z}_2) = \left[\frac{1 - p_{u,2} - p_{e,2}}{2},\ \frac{1 - p_{u,2} - p_{e,2}}{2},\ \frac{p_{u,2}}{q-2},\ \cdots,\ \frac{p_{u,2}}{q-2},\ p_{e,2}\right]'$$

due to symbol-symmetry. The optimal collusion channel is then characterized by $p^{(k,q)}_u$ and $p^{(k,q)}_e$.

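• Restricted Digit Model: In the RDM, $p_u = p_e = 0$ and the optimal embedding distribution is $W^{(k,q)}_x \equiv 1/q$ for all $x \in \mathcal{X}$ for both the joint and simple capacity games. Substituting them into (5) and (6), we get the capacities as

$$C^{\mathrm{joint}}_{2,q}(\mathscr{P}_{\mathbf{W}}, \mathscr{P}_{\mathrm{RDM}}) = \frac{1}{2}\left(1 - \frac{q-1}{q} \log_q 2\right) \tag{11}$$

$$C^{\mathrm{simple}}_{2,q}(\mathscr{P}_{\mathbf{W}}, \mathscr{P}_{\mathrm{RDM}}) = \frac{1}{2}\left(1 + \frac{1}{q}\right) \log_q(q+1) - \log_q 2. \tag{12}$$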
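The closed forms (11) and (12) are easy to tabulate. The following sketch simply evaluates them; the function names are hypothetical.

```python
import math

def joint_capacity_rdm_k2(q):
    """Equation (11): joint capacity for k = 2 in the RDM, in base-q units."""
    return 0.5 * (1 - (q - 1) / q * math.log(2, q))

def simple_capacity_rdm_k2(q):
    """Equation (12): simple capacity for k = 2 in the RDM, in base-q units."""
    return 0.5 * (1 + 1 / q) * math.log(q + 1, q) - math.log(2, q)
```

For $q = 2$ and $q = 3$ the joint values are 0.25 and about 0.2897, which agree with the RDM rows of Table II, and the simple value stays below the joint value, as Theorem 1(iii) requires.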

• Unreadable Digit Model: Here the coalition is allowed to generate erasures but not unseen symbols. For $q = 2$ and 3, the colluders' optimal attack does not utilize erasures even though they are allowed to do so. For $q \ge 4$, $p^{(2,q)}_{e,2}$ is nonzero and randomization of $\mathbf{W}$ is necessary.

• Arbitrary/General/Mixed Digit Model: In these models $p^{(2,q)}_{u,2}$ and/or $p^{(2,q)}_{e,2}$ are nonzero, and randomization of $\mathbf{W}$ is necessary in some cases for $q = 3$.

For models in which the solutions are nontrivial, numerical solutions can be obtained by solving the Karush-Kuhn-Tucker system of equations.³ Numerical solutions for the joint capacity game for binary and ternary alphabets are shown in Table II. The simple capacity game has similar results. We see that the ADM does allow stronger attacks than the UDM, which essentially says that the coalition would prefer generating unseen symbols over erasures. A simple way to explain this is that when observing an erasure symbol, the decoder knows (for sure or with high probability) that the colluders have more than one symbol in that coordinate. On the other hand, generating an unseen symbol can divert the decoder to accuse some innocent users. Such preference can also be seen by observing the solutions to the MDM with different parameters.

³ We do not explicitly write down the equations here. See [10, (65-66)] for the equations in the binary case under the marking assumption.

IV. ASYMPTOTICS FOR LARGE COALITIONS

The analysis in Sec. III allows us to solve the fingerprinting games for small coalitions. However, the number of unknown variables of the saddle-point solution is on the order of $k^{q-1}$. For the binary case this grows linearly with $k$ and we showed the solutions for up to tens of pirates [10]. For the nonbinary case, the curse of dimensionality makes the task of solving nonlinear equations difficult even for small $k$. Therefore good approximations of the games' solutions for large-coalition asymptotics ($k \to \infty$) are important. Independently of our work, Boesten and Skoric [14] extended our asymptotic results in [10] to nonbinary alphabets in the RDM. The analysis showed that the asymptotic joint capacity is $(q-1)/(2k^2 \ln q)$ but it did not reveal the optimal strategies. Here we show how to extend our analysis of [10] to the more general MDM for both the joint and simple capacity games and derive optimal strategies.⁴

A. Main Results

We consider the sequence of joint/simple fingerprinting games. Similarly to [10], we impose a regularity condition on the collusion channel $\Theta$. The $q$-dimensional simplex is denoted by $\mathcal{G} \triangleq \{\mathbf{g} \in \mathbb{R}^{q+1} : \|\mathbf{g}\|_1 = 1, \mathbf{g} \ge 0\}$.

Condition 1:⁵ There exists a bounded and twice differentiable function $\mathbf{g} : \mathcal{W} \to \mathcal{G}$ with $g_y(\mathbf{w}) \in (0, 1)$, $y \in \mathcal{Y}$, for $\mathbf{w}$ in the interior of $\mathcal{W}$ such that⁶

$$\boldsymbol{\theta}(\mathbf{z}) = \mathbf{g}\!\left(\frac{1}{k}\mathbf{z}\right) + o\!\left(\frac{1}{k}\right) \tag{13}$$

as $k \to \infty$.

Condition (13) assumes a smooth limiting function of $\mathbf{z}$ for the sequence of collusion channels. For the binary case, the assumption seems valid for the joint capacity game but not for the simple capacity game [10]. However, as the simple decoding scheme is suboptimal, this assumption does not seem overly restrictive.

⁴ In [10] we also provided exact upper and lower bounds for the capacity. However, they do not seem to be as valuable as the asymptotic results. Therefore we only generalize the asymptotic results in this paper.

⁵ We would like to point out that our regularity constraint is more relaxed than that of [14, Sec. 3.1], where the $o(1/k)$ term is not allowed.

⁶ The little-o notation is defined uniformly over all $\mathbf{z} \in \mathcal{Z}$.

TABLE II
SADDLE-POINT SOLUTIONS FOR THE JOINT CAPACITY GAME: COALITION SIZE k = 2 AND ALPHABET SIZE q = 2, 3

Columns: w' and p_W describe the embedding distribution (w^(2,q)' and p_W^(2,q)); p_u,1, p_u,2, p_e,1, p_e,2 describe the collusion channel (p_u,1^(2,q), p_u,2^(2,q), p_e,1^(2,q), p_e,2^(2,q)).

q   Model          C^joint_2,q   w'                        p_W     p_u,1   p_u,2   p_e,1   p_e,2
2   (R,U,A,G)DM    .25           [.5, .5]                  1       0       0       0       0
2   MDM1           .1806         [.5, .5]                  1       .01     .01     .20     .1980
2   MDM2           .1648         [.5, .5]                  1       .05     .05     .06     .0570
2   MDM3           .1274         [.5, .5]                  1       .10     .10     .03     .0270
3   (R,U)DM        .2897         [.3333, .3333, .3333]     1       0       0       0       0
3   *(A,G)DM       .1875         [.0589, .4706, .4706]     .3333   0       .1731   0       0
3   MDM1           .1929         [.3333, .3333, .3333]     1       .01     .01     .20     .4838
3   MDM2           .1826         [.3333, .3333, .3333]     1       .05     .05     .06     .2159
3   MDM3           .1428         [.3333, .3333, .3333]     1       .10     .10     .03     .0981

*Note: The optimal distribution takes probability mass 1/3 at permutations of the w^(2,q) shown. In all other rows the optimal distributions take w^(2,q) with probability one.

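Define⁷

$$T[\mathbf{w}, \mathbf{g}, \nabla\mathbf{g}] \triangleq \sum_{y \in \mathcal{Y}} \frac{\nabla g_y(\mathbf{w}) \left[\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}'\right] \nabla' g_y(\mathbf{w})}{g_y(\mathbf{w})} \tag{14}$$

$$= \sum_{y \in \mathcal{Y}} \frac{1}{g_y(\mathbf{w})} \left[\sum_{x \in \mathcal{Q}} w_x \left(\frac{\partial g_y(\mathbf{w})}{\partial w_x}\right)^2 - \left(\sum_{x \in \mathcal{Q}} w_x \frac{\partial g_y(\mathbf{w})}{\partial w_x}\right)^2\right] \tag{15}$$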

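where $\nabla\mathbf{g}$ is the $(q+1) \times q$ Jacobian matrix of $\mathbf{g}$ versus $\mathbf{w}$, $\nabla g_y(\mathbf{w})$ denotes the gradient row vector of $g_y$ with respect to $\mathbf{w}$, and $\mathrm{diag}(\mathbf{w})$ denotes the diagonal matrix whose diagonal entries are the components of $\mathbf{w}$. Let $A(\mathscr{P}_e, \mathscr{P}_c)$ be the value of the following functional maximin game:

$$A(\mathscr{P}_e, \mathscr{P}_c) \triangleq \sup_{P_{\mathbf{W}}} \ \inf_{\mathbf{g}} \ \mathbb{E}\left\{T[\mathbf{W}, \mathbf{g}, \nabla\mathbf{g}]\right\} \tag{16a}$$

$$= \sup_{P_{\mathbf{W}}} \ \inf_{\mathbf{g}} \int_{\mathcal{W}} T[\mathbf{w}, \mathbf{g}, \nabla\mathbf{g}] \, dP_{\mathbf{W}}(\mathbf{w}) \tag{16b}$$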
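To get a feel for the functional in (15), it can be evaluated for the limiting function of the interleaving attack, $g_y(\mathbf{w}) = w_y$ for $y \in \mathcal{Q}$ and $g_e(\mathbf{w}) = 0$. The sketch below is illustrative only: the names `payoff_functional` and `interleaving_limit` are hypothetical, and outputs with $g_y = 0$ are skipped on the assumption that their gradient also vanishes (the erasure component here does not satisfy the strict positivity in Condition 1).

```python
def payoff_functional(w, g, jac):
    """Evaluate T[w, g, grad g] from (15).

    w: point in the interior of the simplex (length q).
    g: values g_y(w), y in Y (length q+1, erasure last).
    jac: Jacobian rows, jac[y][x] = d g_y / d w_x.
    Components with g_y == 0 are skipped (assumed to have zero gradient).
    """
    total = 0.0
    for gy, row in zip(g, jac):
        if gy == 0.0:
            continue
        second_moment = sum(wx * d * d for wx, d in zip(w, row))
        first_moment = sum(wx * d for wx, d in zip(w, row))
        total += (second_moment - first_moment ** 2) / gy
    return total

def interleaving_limit(w):
    """g(w) = [w_0, ..., w_{q-1}, 0] with Jacobian [identity; zero row]."""
    q = len(w)
    g = list(w) + [0.0]
    jac = [[1.0 if x == y else 0.0 for x in range(q)] for y in range(q)]
    jac.append([0.0] * q)
    return g, jac
```

For this choice each term reduces to $(w_y - w_y^2)/w_y = 1 - w_y$, so $T = q - 1$ for every interior $\mathbf{w}$, which is the constant appearing in (25). Because $[\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}']$ annihilates the all-ones vector, adding the same constant to every entry of a Jacobian row leaves $T$ unchanged.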

where the maximization is subject to $P_{\mathbf{W}} \in \mathscr{P}_e$ and the minimization is subject to the constraints of $\mathscr{P}_c$. Our main results are the following three theorems and two corollaries:

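Theorem 2: Assume that the regularity Condition 1 is satisfied. Then we have

$$C^{\mathrm{joint}}_{k,q}(\mathscr{P}_e, \mathscr{P}_c) \sim C^{\mathrm{simple}}_{k,q}(\mathscr{P}_e, \mathscr{P}_c) \sim \frac{A(\mathscr{P}_e, \mathscr{P}_c)}{2k^2 \ln q} \tag{17}$$

when $A(\mathscr{P}_e, \mathscr{P}_c)$ (specified in (16)) is nonzero.⁸ Note that $A$ depends on the embedding distribution class $\mathscr{P}_e$ and the collusion class $\mathscr{P}_c$ and is independent of the coalition size $k$.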

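⁷ The expression $T$ was first introduced in [14].

⁸ With a slight abuse of notation, we use the same notation $\mathscr{P}_c$ as in Sec. II where coalition size $k$ is fixed. The collusion class $\mathscr{P}_c$ here represents a sequence of collusion channels as $k \to \infty$.

Theorem 3: The binary ($q = 2$) fingerprinting capacities in the MDM with $0 \le r_{e,1} = r_{e,2} \le 1$ and $0 \le r_{u,1} < (1 - r_{e,1})/2$ satisfy

$$C^{\mathrm{joint}}_{k,2}(\mathscr{P}_{\mathrm{MDM}}) \sim C^{\mathrm{simple}}_{k,2}(\mathscr{P}_{\mathrm{MDM}}) \sim \frac{(1 - r_{e,1})\left(1 - \frac{4}{\pi} \sin^{-1}\sqrt{\frac{r_{u,1}}{1 - r_{e,1}}}\right)^2}{2k^2 \ln 2}. \tag{18}$$

Furthermore, the maximizing and minimizing strategies that achieve the asymptotic capacity value are respectively the arcsine distribution $dP^*_{\mathbf{W}}(w_0, w_1) \propto (w_0 w_1)^{-1/2}\, d\mathbf{w}$ and $\boldsymbol{\theta}(\mathbf{z})$ defined by

$$\theta_0(\mathbf{z}) = (1 - r_{e,1}) \cos^2 \gamma_1 \tag{19a}$$

$$\theta_1(\mathbf{z}) = (1 - r_{e,1}) \sin^2 \gamma_1 \tag{19b}$$

$$\theta_e(\mathbf{z}) = r_{e,1} \tag{19c}$$

where

$$\gamma_1 = \left(1 - \frac{4}{\pi}\gamma_{\max}\right)\phi_1 + \gamma_{\max} \tag{20a}$$

$$\phi_1 = \sin^{-1}\sqrt{\frac{z_1}{k}} \tag{20b}$$

$$\gamma_{\max} = \sin^{-1}\sqrt{\frac{r_{u,1}}{1 - r_{e,1}}}. \tag{20c}$$

For $0 \le r_{e,1} = r_{e,2} \le 1$ but $r_{u,1} \ge (1 - r_{e,1})/2$, we have $C^{\mathrm{joint}}_{k,2}(\mathscr{P}_{\mathrm{MDM}}) = C^{\mathrm{simple}}_{k,2}(\mathscr{P}_{\mathrm{MDM}}) = 0$ and the minimizing strategy is $\boldsymbol{\theta}(\mathbf{z}) \equiv [(1 - r_{e,1})/2,\ (1 - r_{e,1})/2,\ r_{e,1}]'$ for all $\mathbf{z}$.

Corollary 1: For $q = 2$, the RDM is the special case of the MDM with $r_{e,1} = r_{e,2} = r_{u,1} = 0$, in which we have

$$C^{\mathrm{joint}}_{k,2}(\mathscr{P}_{\mathrm{RDM}}) \sim C^{\mathrm{simple}}_{k,2}(\mathscr{P}_{\mathrm{RDM}}) \sim \frac{1}{2k^2 \ln 2}. \tag{21}$$

The arcsine distribution $dP^*_{\mathbf{W}}(w_0, w_1) \propto (w_0 w_1)^{-1/2}\, d\mathbf{w}$ and the interleaving attack

$$\boldsymbol{\theta}^*(\mathbf{z}) = \begin{bmatrix} \frac{1}{k}\mathbf{z} \\ 0 \end{bmatrix} \tag{22}$$

are the respective maximizing and minimizing strategies that achieve the asymptotic capacity value.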
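The minimizing strategy (19)-(20) and the constant in (18) can be computed directly. The following is a sketch with hypothetical function names; the parameter values used in the usage note ($r_{e,1} = 0.2$, $r_{u,1} = 0.01$) are illustrative and assume $r_{e,1} = r_{e,2}$ as the theorem requires.

```python
import math

def mdm_binary_attack(z1, k, r_e1, r_u1):
    """Return [theta_0, theta_1, theta_e] from (19)-(20).

    z1 is the number of colluders holding symbol 1; assumes r_u1 < (1 - r_e1) / 2.
    """
    gamma_max = math.asin(math.sqrt(r_u1 / (1 - r_e1)))
    phi1 = math.asin(math.sqrt(z1 / k))
    gamma1 = (1 - 4 / math.pi * gamma_max) * phi1 + gamma_max
    return [
        (1 - r_e1) * math.cos(gamma1) ** 2,
        (1 - r_e1) * math.sin(gamma1) ** 2,
        r_e1,
    ]

def mdm_binary_constant(r_e1, r_u1):
    """Numerator of (18), so that the capacity is asymptotic to A / (2 k^2 ln 2)."""
    gamma_max = math.asin(math.sqrt(r_u1 / (1 - r_e1)))
    return (1 - r_e1) * (1 - 4 / math.pi * gamma_max) ** 2
```

At the two extremes $z_1 = 0$ and $z_1 = k$, where only one symbol is seen, the probability assigned to the unseen symbol equals $r_{u,1}$, so constraint (2a) is met with equality. With $r_{e,1} = r_{u,1} = 0$ the strategy reduces to $\theta_1(\mathbf{z}) = z_1/k$, the interleaving attack of Corollary 1.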

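Theorem 4: The fingerprinting capacities in the MDM with $0 \le r_{e,1} = \cdots = r_{e,q} \le 1$ and $\mathbf{r}_u = \mathbf{0}$ satisfy

$$C^{\mathrm{joint}}_{k,q}(\mathscr{P}_{\mathrm{MDM}}) \sim C^{\mathrm{simple}}_{k,q}(\mathscr{P}_{\mathrm{MDM}}) \sim \frac{(1 - r_{e,1})(q - 1)}{2k^2 \ln q}. \tag{23}$$

Furthermore, $dP^*_{\mathbf{W}}(\mathbf{w}) \propto \prod_x w_x^{-1/2}\, d\mathbf{w}$ and

$$\boldsymbol{\theta}(\mathbf{z}) = \begin{bmatrix} \frac{1 - r_{e,1}}{k}\mathbf{z} \\ r_{e,1} \end{bmatrix} \tag{24}$$

are the respective maximizing and minimizing strategies that achieve the asymptotic capacity value.

Corollary 2: The RDM is the special case of the MDM with $\mathbf{r}_e = \mathbf{0}$ and $\mathbf{r}_u = \mathbf{0}$, in which we have

$$C^{\mathrm{joint}}_{k,q}(\mathscr{P}_{\mathrm{RDM}}) \sim C^{\mathrm{simple}}_{k,q}(\mathscr{P}_{\mathrm{RDM}}) \sim \frac{q - 1}{2k^2 \ln q}. \tag{25}$$

The distribution $dP^*_{\mathbf{W}}(\mathbf{w}) \propto \prod_x w_x^{-1/2}\, d\mathbf{w}$ and the interleaving attack

$$\boldsymbol{\theta}^*(\mathbf{z}) = \begin{bmatrix} \frac{1}{k}\mathbf{z} \\ 0 \end{bmatrix} \tag{26}$$

are the respective maximizing and minimizing strategies that achieve the asymptotic capacity value.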

Theorem 2 follows from Theorem 5, which is stated in Sec. IV-B and proved in Appendix A. Theorems 3-4 and Corollaries 1-2 can be shown by solving the functional maximin game of (16). The proofs are outlined in Sec. IV-C and IV-D and detailed in Appendices B and C.

B. Continuous Functional Game Approximation

Under the regularity constraint, the following theorem is instrumental in proving Theorem 2. It reduces the payoff functions for the joint and simple capacity games to a functional $T[\mathbf{w}, \mathbf{g}, \nabla\mathbf{g}]$. The proof is given in Appendix A.

Theorem 5: Assume that the regularity Condition 1 is satisfied. Then

$$I^{\mathrm{joint}}_{k,q}(\mathbf{w}, \Theta) \sim I^{\mathrm{simple}}_{k,q}(\mathbf{w}, \Theta) \sim \frac{T[\mathbf{w}, \mathbf{g}, \nabla\mathbf{g}]}{2k^2 \ln q} \tag{27}$$

as $k \to \infty$.
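The approximation (27) can be checked numerically in the simplest setting: the binary alphabet with the interleaving attack, for which $T = q - 1 = 1$. The sketch below computes the joint payoff (8) exactly, assuming $Z_1$ is binomial with parameters $(k, w)$ given $W_1 = w$ and $\Pr(Y = 1 | Z_1) = Z_1/k$; the function names are hypothetical, and only the joint payoff is examined.

```python
import math

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def joint_payoff_interleaving_binary(k, w):
    """Exact (1/k) I(Z; Y | W = w) for q = 2 under the interleaving attack.

    Here Pr(Y = 1) = w, so I(Z; Y) = h2(w) - E[h2(Z_1 / k)] with Z_1 ~ Binomial(k, w).
    """
    h_y_given_z = sum(
        math.comb(k, z) * w ** z * (1 - w) ** (k - z) * h2(z / k)
        for z in range(k + 1)
    )
    return (h2(w) - h_y_given_z) / k

def asymptotic_value(k, q, T):
    """Right-hand side of (27): T / (2 k^2 ln q)."""
    return T / (2 * k ** 2 * math.log(q))
```

For moderately large $k$ and $w$ away from the boundary, the ratio `joint_payoff_interleaving_binary(k, w) / asymptotic_value(k, 2, 1.0)` is close to one and approaches one as $k$ grows, consistent with the $1/(2k^2 \ln 2)$ scaling in (21).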

C. Change of Variables

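Theorem 2 can be shown by substituting (27) and (16) into (7), and solving the asymptotics of the joint and simple capacity games now reduces to solving the functional maximin game of (16). Before doing so, we first simplify the payoff functional $T$ by making the following two changes of variables. The first transforms $(\mathbf{w}, \mathbf{g})$ on the product of simplices $\mathcal{W} \times \mathcal{G}$ to $(\mathbf{v}, \mathbf{h})$ on the product of hyperspheres $\mathcal{V} \times \mathcal{H}$ defined by $\mathcal{V} \triangleq \{\mathbf{v} \in \mathbb{R}^q : \|\mathbf{v}\|_2 = 1, \mathbf{v} \ge 0\}$ and $\mathcal{H} \triangleq \{\mathbf{h} \in \mathbb{R}^{q+1} : \|\mathbf{h}\|_2 = 1, \mathbf{h} \ge 0\}$, which is similar to the change of variables of [14, Sec. 3.4]. The second further transforms $(\mathbf{v}, \mathbf{h})$ into hyperspherical coordinates $(\boldsymbol{\phi}, \boldsymbol{\gamma})$, which reduces $T$ to the square of the translated Frobenius norm of the Jacobian matrix $\nabla\boldsymbol{\gamma}$ of $\boldsymbol{\gamma}$ versus $\boldsymbol{\phi}$. This reduces the continuous functional game of (16) to

$$A(\mathscr{P}_e, \mathscr{P}_c) = \sup_{P_{\boldsymbol{\Phi}}} \ \inf_{\boldsymbol{\gamma}} \int_{\Phi} T[\boldsymbol{\phi}, \boldsymbol{\gamma}, \nabla\boldsymbol{\gamma}] \, dP_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \tag{28}$$

where $T$ is defined in (58) and maximization and minimization are subject to constraints on $P_{\boldsymbol{\Phi}}$ and $\boldsymbol{\gamma}$ corresponding to $P_{\mathbf{W}} \in \mathscr{P}_e$ and $\mathbf{g} \in \mathscr{P}_c$. The details of the change of variables are given in Appendix B.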

D. Solving the Continuous Functional Game

In [10] we solved the game (16) for the binary case ($q = 2$) under the marking assumption (which is equivalent to the RDM, UDM, ADM, and GDM) without the change of variables in Sec. IV-C. The minimization problem of (16) was solved explicitly for fixed $P_{\mathbf{W}}$ using the Cauchy-Schwarz inequality, and the maximization problem can then be solved using another Cauchy-Schwarz inequality. Indeed, since the game of (16) has a convex payoff functional with respect to $\mathbf{g}$ (analogously to Theorem 1), there exists a saddle-point solution pair $(P^*_{\mathbf{W}}, \mathbf{g}^*)$, which consists of the arcsine distribution and the interleaving attack for the binary case. They are the asymptotic maximizing and minimizing strategies for the capacity game.

For the binary and nonbinary cases in the general mixed digit model, the Cauchy-Schwarz trick does not yield an analytical solution in general. However, convexity of the payoff functional and the saddle-point property still hold. Thus if we can find a minimizer $\boldsymbol{\gamma}(\boldsymbol{\phi})$ for a fixed $P^*_{\boldsymbol{\Phi}}$ such that $T$ is a constant for all $\boldsymbol{\phi}$ (such $\boldsymbol{\gamma}(\boldsymbol{\phi})$ is also called an "equalizing strategy"), then $\left(P^*_{\boldsymbol{\Phi}}, \boldsymbol{\gamma}(\boldsymbol{\phi})\right)$ is the maximizing-minimizing saddle-point strategy pair, and the value $A(\mathscr{P}_e, \mathscr{P}_c)$ of the game equals the constant value of $T$ (see [18, 5.2]). Saddle-point solutions of Theorems 3 and 4 are shown in Appendix C.

V. DISCUSSION

Analyzing the fingerprinting capacity games becomes harder as the number of pirates $k$ grows. As discussed in Sec. IV, the game can already be unmanageable when $k$ is in single digits. Therefore it is crucial to study the large-coalition fingerprinting games. In Sec. IV we studied these asymptotics for various setups. We showed that the capacity is asymptotic to $A/(2k^2 \ln q)$ where the constant $A$ is explicitly specified. The asymptotically optimal strategies of the fingerprinting capacity game were also obtained.

The capacity game results are mostly negative. The quadratic decay of the capacity suggests that the required codelength for a fingerprinting code that is robust to large-coalition attacks is much longer than that for a code that is robust to small-coalition attacks. Furthermore, the capacity values under the marking assumption derived in previous work are too optimistic and are about 20-30% less in the MDM. However, the fact that the simple capacity game is asymptotically the same as the joint capacity game suggests a positive result: One can essentially achieve the maximum achievable rate with computationally feasible codes. Along this line of work, one interesting problem to explore is how to design an effective fingerprinting code that approaches the capacity. Tardos' construction [4] and its $q$-ary alphabet generalization [15] already achieved the same $1/k^2$ rate as the capacity, but how to achieve or approach the same constant factor $A$ remains an open problem. Many recent results have shown that theoretical capacity results can be used to construct practical fingerprinting codes [6], [7], [8], [9]. Specifically, Oosterwijk et al. [7] improved upon an earlier design [15] by changing the distribution of the time-sharing variable $\mathbf{W}$ from $dP_{\mathbf{W}}(\mathbf{w}) \propto \prod_x w_x^{-1/q} \, d\mathbf{w}$ to $dP^*_{\mathbf{W}}(\mathbf{w}) \propto \prod_x w_x^{-1/2} \, d\mathbf{w}$, and by tailoring the so-called "suspicion function" against the interleaving attack. These match exactly our maximizing and minimizing strategies for the asymptotic fingerprinting game in the RDM (Corollary 2)! Therefore, we believe that, by studying the asymptotic capacity game for the general model of the MDM, one can construct an efficient fingerprinting code against practical collusion attacks.
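As an illustration of the maximizing distribution discussed above, the following minimal sketch draws bias vectors from the density proportional to $\prod_x w_x^{-1/2}$ on the simplex, which is the Dirichlet law with all parameters equal to 1/2. The function name, the alphabet size, and the sample count are our own choices for illustration; they are not taken from [7] or [15].

```python
import random

def sample_bias_vector(q, rng, alpha=0.5):
    """Draw w from the density proportional to prod_x w_x**(alpha - 1).

    Normalizing independent Gamma(alpha, 1) variables gives a Dirichlet
    sample; alpha = 1/2 corresponds to the exponent -1/2 in the text.
    """
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(q)]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(0)          # fixed seed, illustrative only
q = 3                           # hypothetical alphabet size
samples = [sample_bias_vector(q, rng) for _ in range(20000)]

# By symmetry each coordinate has mean 1/q.
mean = [sum(w[x] for w in samples) / len(samples) for x in range(q)]

# Fraction of draws in which some symbol has probability below 0.01.
# The w_x**(-1/2) singularity pushes mass toward the faces of the simplex.
near_face = sum(min(w) < 0.01 for w in samples) / len(samples)
print(mean, near_face)
```

For $q = 2$ the same sampler produces the arcsine distribution mentioned in Sec. IV-D.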

APPENDIX A
PROOF OF THEOREM 5

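By the definitions in Sec. II, the conditional distribution of $\mathbf{Z}$ given $\mathbf{w}$ follows the multinomial law with parameter $\mathbf{w}$ and $k$ trials. We denote the probability mass functions by

\[
\alpha_{\mathbf{z}}(\mathbf{w}) \triangleq p_{\mathbf{Z}|\mathbf{W}}(\mathbf{z}|\mathbf{w}) = \binom{k}{\mathbf{z}} \prod_{x=0}^{q-1} w_x^{z_x} \tag{29}
\]

for $\mathbf{z} \in \mathcal{Z}$ and $\mathbf{w} \in \mathcal{W}_q$ where $\binom{k}{\mathbf{z}}$ is the multinomial coefficient defined by $\binom{k}{\mathbf{z}} = \frac{k!}{z_0! \, z_1! \cdots z_{q-1}!}$. Let $\boldsymbol{\alpha}(\mathbf{w}) = \{\alpha_{\mathbf{z}}(\mathbf{w})\}_{\mathbf{z} \in \mathcal{Z}}$ be a length-$|\mathcal{Z}|$ probability vector. Similarly

\[
\alpha^x_{\mathbf{z}}(\mathbf{w}) \triangleq p_{\mathbf{Z}|X_1 \mathbf{W}}(\mathbf{z}|x,\mathbf{w}) = \binom{k-1}{\mathbf{z} - \mathbf{e}_x} \prod_{l=0}^{q-1} w_l^{z_l - \delta_{xl}} \tag{30}
\]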

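for $x \in \mathcal{X}$, $\mathbf{z} \in \mathcal{Z}$, and $\mathbf{w} \in \mathcal{W}_q$ where $\mathbf{e}_x$ is the unit vector whose $x$-th coordinate equals one. The dependency of $\boldsymbol{\alpha}$ and $\boldsymbol{\alpha}^x$ on $\mathbf{w}$ is sometimes dropped for simplicity. We have the following moments of $\mathbf{Z}$:

\begin{align}
E[\mathbf{Z} \mid \mathbf{W} = \mathbf{w}] &= k\mathbf{w} \tag{31} \\
\mathrm{Cov}(\mathbf{Z} \mid \mathbf{W} = \mathbf{w}) &= k(\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}') \tag{32} \\
E\left[\|\mathbf{Z} - k\mathbf{W}\|_2^3 \mid \mathbf{W} = \mathbf{w}\right] &= O(k) \tag{33} \\
E[\mathbf{Z} \mid \mathbf{W} = \mathbf{w}, X_1 = x] &= \mathbf{e}_x + (k-1)\mathbf{w} \tag{34} \\
\mathrm{Cov}(\mathbf{Z} \mid \mathbf{W} = \mathbf{w}, X_1 = x) &= (k-1)(\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}') \tag{35} \\
E\left[\|\mathbf{Z} - \mathbf{e}_x - (k-1)\mathbf{w}\|_2^3 \mid \mathbf{W} = \mathbf{w}, X_1 = x\right] &= O(k). \tag{36}
\end{align}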
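The first two moments (31) and (32) can be checked by exact enumeration for small parameters. The sketch below evaluates the multinomial probability mass function (29) over its whole support; the values of $k$ and $\mathbf{w}$ and all function names are hypothetical choices for illustration.

```python
from itertools import product
from math import factorial, prod

def multinomial_pmf(z, w):
    """alpha_z(w) of (29): multinomial coefficient times prod_x w_x**z_x."""
    coeff = factorial(sum(z))
    for zx in z:
        coeff //= factorial(zx)      # each partial quotient is an integer
    return coeff * prod(wx ** zx for wx, zx in zip(w, z))

def compositions(k, q):
    """All count vectors z in {0..k}^q with sum(z) == k."""
    return [z for z in product(range(k + 1), repeat=q) if sum(z) == k]

k, w = 5, [0.2, 0.3, 0.5]            # illustrative values only
q = len(w)
pmf = {z: multinomial_pmf(z, w) for z in compositions(k, q)}

# Mean vector, to be compared with k * w as in (31).
mean = [sum(p * z[x] for z, p in pmf.items()) for x in range(q)]

# Covariance matrix, to be compared with k (diag(w) - w w') as in (32).
cov = [[sum(p * (z[x] - k * w[x]) * (z[y] - k * w[y]) for z, p in pmf.items())
        for y in range(q)] for x in range(q)]
expected_cov = [[k * ((w[x] if x == y else 0.0) - w[x] * w[y])
                 for y in range(q)] for x in range(q)]
print(mean, cov)
```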

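The following lemmas will be useful for our analysis:

Lemma 1:⁹ Let $\mathbf{a}$ and $\mathbf{b}$ be two probability vectors such that the difference $\mathbf{c} = \mathbf{b} - \mathbf{a}$ satisfies $\varepsilon \triangleq \max_i |c_i/a_i| < \infty$. Then

\[
D_q(\mathbf{a} \,\|\, \mathbf{b}) = \frac{1}{2 \ln q} \sum_i \frac{(a_i - b_i)^2}{a_i} + O(\varepsilon^3) \tag{37}
\]

as $\varepsilon \to 0$.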

⁹ It has been brought to our attention that [10, Lemma 5] which we cited from [19, Sec. 2.5] needs additional conditions to hold. To resolve the issue we use Lemma 1 in this paper. The proofs of the asymptotic results in [10, Sec. V] can be fixed utilizing Lemma 1 as well.

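Proof: We have

\[
\begin{aligned}
D_q(\mathbf{a} \,\|\, \mathbf{b}) &= \sum_i a_i \log_q \frac{a_i}{b_i} \\
&= -\frac{1}{\ln q} \left[ \sum_i a_i \ln\left(1 + \frac{c_i}{a_i}\right) \right] \\
&= -\frac{1}{\ln q} \left\{ \sum_i a_i \left[ \frac{c_i}{a_i} - \frac{1}{2}\left(\frac{c_i}{a_i}\right)^2 + O\left(\left(\frac{c_i}{a_i}\right)^3\right) \right] \right\} \\
&= \frac{1}{2 \ln q} \sum_i \frac{(a_i - b_i)^2}{a_i} + O(\varepsilon^3),
\end{aligned}
\]

where the first-order term vanishes because $\sum_i c_i = 0$ for two probability vectors.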
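Lemma 1 can be illustrated numerically. The following sketch compares $D_q(\mathbf{a} \,\|\, \mathbf{b})$ with the quadratic term of (37) for a fixed $\mathbf{a}$ and a shrinking perturbation; halving the perturbation should shrink the residual by roughly $2^3 = 8$. The vectors below are arbitrary illustrative choices.

```python
from math import log

def divergence(a, b, q):
    """D_q(a || b) = sum_i a_i log_q(a_i / b_i)."""
    return sum(ai * log(ai / bi) for ai, bi in zip(a, b)) / log(q)

def quadratic_approx(a, b, q):
    """Leading term of (37): (1 / (2 ln q)) sum_i (a_i - b_i)**2 / a_i."""
    return sum((ai - bi) ** 2 / ai for ai, bi in zip(a, b)) / (2 * log(q))

q = 3
a = [0.2, 0.3, 0.5]
direction = [0.1, -0.3, 0.2]    # sums to zero, so b stays a probability vector

def residual(scale):
    """|D_q(a || b) - quadratic term| for b = a + scale * direction."""
    b = [ai + scale * di for ai, di in zip(a, direction)]
    return abs(divergence(a, b, q) - quadratic_approx(a, b, q))

r1, r2 = residual(0.1), residual(0.05)
ratio = r1 / r2                 # near 8 when the residual is cubic in eps
print(r1, r2, ratio)
```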

Lemma 2: For $\mathbf{Z}$ following the multinomial law with parameters $\mathbf{w}$ and $k$ trials, we have

\[
\Pr\left[\|\mathbf{Z} - k\mathbf{w}\|_2 \ge q\sqrt{2k \ln k}\right] = O\left(\frac{1}{k^4}\right). \tag{38}
\]

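Proof: We have

\[
\begin{aligned}
\Pr\left[\|\mathbf{Z} - k\mathbf{w}\|_2 \ge q\sqrt{2k \ln k}\right] &\le \Pr\left[\exists x \in \mathcal{Q} : |Z_x - kw_x| \ge \sqrt{2k \ln k}\right] \\
&\overset{(a)}{\le} \sum_{x \in \mathcal{Q}} \Pr\left[|Z_x - kw_x| \ge \sqrt{2k \ln k}\right] \\
&\overset{(b)}{\le} \frac{2q}{k^4}
\end{aligned}
\]

where (a) follows from the union bound and (b) follows from Hoeffding's inequality [20].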
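Step (b) uses the two-sided Hoeffding bound $2\exp(-2t^2/k)$ for a sum of $k$ variables taking values in $[0,1]$, which equals $2/k^4$ at $t = \sqrt{2k \ln k}$. The sketch below compares this bound with the exact tail of a single coordinate $Z_x$, which is binomial with parameters $k$ and $w_x$. The value $w_x = 0.3$ and the list of $k$ values are illustrative assumptions.

```python
from math import comb, exp, log, sqrt

def two_sided_tail(k, p, t):
    """Exact Pr[|Z_x - k p| >= t] for Z_x ~ Binomial(k, p)."""
    return sum(comb(k, z) * p ** z * (1 - p) ** (k - z)
               for z in range(k + 1) if abs(z - k * p) >= t)

def hoeffding_bound(k, t):
    """Hoeffding: Pr[|Z_x - k p| >= t] <= 2 exp(-2 t**2 / k)."""
    return 2 * exp(-2 * t ** 2 / k)

rows = []
for k in (10, 50, 200):
    t = sqrt(2 * k * log(k))
    rows.append((k, two_sided_tail(k, 0.3, t), hoeffding_bound(k, t), 2 / k ** 4))

for row in rows:
    print(row)   # columns: k, exact tail, Hoeffding bound, 2 / k**4
```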

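To prove Theorem 5, we first rewrite the mutual information of the payoff functions as

\[
\begin{aligned}
I^{\mathrm{joint}}_{k,q}(\mathbf{w},\Theta) &\triangleq \frac{1}{k} I(\mathbf{Z};Y \mid \mathbf{W} = \mathbf{w}) \\
&= \frac{1}{k} \sum_{\mathbf{z}} p_{\mathbf{Z}|\mathbf{W}}(\mathbf{z}|\mathbf{w}) \sum_y p_{Y|\mathbf{Z},\mathbf{W}}(y|\mathbf{z},\mathbf{w}) \log_q \frac{p_{Y|\mathbf{Z},\mathbf{W}}(y|\mathbf{z},\mathbf{w})}{p_{Y|\mathbf{W}}(y|\mathbf{w})} \\
&\overset{(a)}{=} \frac{1}{k} \sum_{\mathbf{z}} \alpha_{\mathbf{z}}(\mathbf{w}) \sum_y \theta_y(\mathbf{z}) \log_q \frac{\theta_y(\mathbf{z})}{(\Theta\boldsymbol{\alpha})_y} \\
&= \frac{1}{k} \sum_{\mathbf{z}} \alpha_{\mathbf{z}}(\mathbf{w}) D_q(\boldsymbol{\theta}(\mathbf{z}) \,\|\, \Theta\boldsymbol{\alpha})
\end{aligned}
\tag{39}
\]

where (a) follows from (1) and (29), and

\[
\begin{aligned}
I^{\mathrm{simple}}_{k,q}(\mathbf{w},\Theta) &\triangleq I(X_1;Y \mid \mathbf{W} = \mathbf{w}) \\
&= \sum_{x \in \mathcal{Q}} p_{X_1|\mathbf{W}}(x|\mathbf{w}) \sum_y p_{Y|X_1,\mathbf{W}}(y|x,\mathbf{w}) \log_q \frac{p_{Y|X_1,\mathbf{W}}(y|x,\mathbf{w})}{p_{Y|\mathbf{W}}(y|\mathbf{w})} \\
&\overset{(a)}{=} \sum_{x \in \mathcal{Q}} w_x \sum_y (\Theta\boldsymbol{\alpha}^x)_y \log_q \frac{(\Theta\boldsymbol{\alpha}^x)_y}{(\Theta\boldsymbol{\alpha})_y} \\
&= \sum_x w_x D_q(\Theta\boldsymbol{\alpha}^x \,\|\, \Theta\boldsymbol{\alpha})
\end{aligned}
\tag{40}
\]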


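where (a) follows from (4) and (30). By Condition 1 and Taylor's Theorem, we have for each $y \in \mathcal{Y}$,

\[
\begin{aligned}
\theta_y(\mathbf{z}) &= g_y\left(\frac{1}{k}\mathbf{z}\right) + o\left(\frac{1}{k}\right) \\
&= g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w}) (\mathbf{z} - k\mathbf{w}) + \frac{1}{2k^2} (\mathbf{z} - k\mathbf{w})' \nabla^2 g_y(\mathbf{w}) (\mathbf{z} - k\mathbf{w}) \\
&\quad + O\left(\frac{1}{k^3} \|\mathbf{z} - k\mathbf{w}\|_2^3\right) + o\left(\frac{1}{k}\right)
\end{aligned}
\tag{41}
\]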

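as $k \to \infty$, where $\nabla^2 g_y$ is the $q \times q$ Hessian matrix. Therefore

\[
\begin{aligned}
(\Theta\boldsymbol{\alpha})_y &= E[\theta_y(\mathbf{Z}) \mid \mathbf{W} = \mathbf{w}] \\
&= g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w}) \underbrace{E[\mathbf{Z} - k\mathbf{W} \mid \mathbf{W} = \mathbf{w}]}_{\overset{(a)}{=}\, 0} + \frac{1}{2k^2} E\left[(\mathbf{Z} - k\mathbf{W})' \nabla^2 g_y(\mathbf{W}) (\mathbf{Z} - k\mathbf{W}) \mid \mathbf{W} = \mathbf{w}\right] \\
&\quad + \underbrace{O\left(\frac{1}{k^3} E\left[\|\mathbf{Z} - k\mathbf{W}\|_2^3 \mid \mathbf{W} = \mathbf{w}\right]\right)}_{\overset{(b)}{=}\, O(1/k^2)} + o\left(\frac{1}{k}\right) \\
&= g_y(\mathbf{w}) + \frac{1}{2k^2} \mathrm{tr}\left[\nabla^2 g_y(\mathbf{w}) \, \mathrm{Cov}(\mathbf{Z} \mid \mathbf{W} = \mathbf{w})\right] + o\left(\frac{1}{k}\right) \\
&\overset{(c)}{=} g_y(\mathbf{w}) + \frac{1}{2k} \mathrm{tr}\left[\nabla^2 g_y(\mathbf{w}) (\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}')\right] + o\left(\frac{1}{k}\right)
\end{aligned}
\tag{42}
\]

where (a), (b), and (c) follow from (31), (33), and (32) respectively. Similarly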

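\[
\begin{aligned}
(\Theta\boldsymbol{\alpha}^x)_y &= E[\theta_y(\mathbf{Z}) \mid \mathbf{W} = \mathbf{w}, X_1 = x] \\
&= g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w}) \underbrace{E[\mathbf{Z} - k\mathbf{W} \mid \mathbf{W} = \mathbf{w}, X_1 = x]}_{\overset{(a)}{=}\, \mathbf{e}_x - \mathbf{w}} \\
&\quad + \frac{1}{2k^2} E\left[(\mathbf{Z} - k\mathbf{W})' \nabla^2 g_y(\mathbf{W}) (\mathbf{Z} - k\mathbf{W}) \mid \mathbf{W} = \mathbf{w}, X_1 = x\right] \\
&\quad + \underbrace{O\left(\frac{1}{k^3} E\left[\|\mathbf{Z} - k\mathbf{w}\|_2^3 \mid \mathbf{W} = \mathbf{w}, X_1 = x\right]\right)}_{\overset{(b)}{=}\, O(1/k^2)} + o\left(\frac{1}{k}\right) \\
&= g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w}) \\
&\quad + \frac{1}{2k^2} \mathrm{tr}\left\{\nabla^2 g_y(\mathbf{w}) E\left[(\mathbf{Z} - k\mathbf{w})(\mathbf{Z} - k\mathbf{w})' \mid \mathbf{W} = \mathbf{w}, X_1 = x\right]\right\} + o\left(\frac{1}{k}\right) \\
&= g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w}) \\
&\quad + \frac{1}{2k^2} \mathrm{tr}\left\{\nabla^2 g_y(\mathbf{w}) \left[\mathrm{Cov}(\mathbf{Z} \mid \mathbf{W} = \mathbf{w}, X_1 = x) + (\mathbf{e}_x - \mathbf{w})(\mathbf{e}_x - \mathbf{w})'\right]\right\} + o\left(\frac{1}{k}\right) \\
&\overset{(c)}{=} g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w}) + \frac{k-1}{2k^2} \mathrm{tr}\left[\nabla^2 g_y(\mathbf{w})(\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}')\right] + o\left(\frac{1}{k}\right) \\
&= g_y(\mathbf{w}) + \frac{1}{k} \nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w}) + \frac{1}{2k} \mathrm{tr}\left[\nabla^2 g_y(\mathbf{w})(\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}')\right] + o\left(\frac{1}{k}\right)
\end{aligned}
\tag{43}
\]

where (a), (b), and (c) follow from (34), (36), and (35) respectively.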

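Now we have

\[
\begin{aligned}
I^{\mathrm{joint}}_{k,q}(\mathbf{w},\Theta) &= \frac{1}{k} \sum_{\mathbf{z}} \alpha_{\mathbf{z}}(\mathbf{w}) D_q(\boldsymbol{\theta}(\mathbf{z}) \,\|\, \Theta\boldsymbol{\alpha}) \\
&\overset{(a)}{\sim} \frac{1}{k} \sum_{\mathbf{z}: \|\mathbf{z} - k\mathbf{w}\|_2 \le q\sqrt{2k \ln k}} \alpha_{\mathbf{z}}(\mathbf{w}) D_q(\boldsymbol{\theta}(\mathbf{z}) \,\|\, \Theta\boldsymbol{\alpha})
\end{aligned}
\tag{44}
\]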

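where (a) follows from Lemma 2. If we let $\boldsymbol{\eta} = k^{-1/2}(\mathbf{z} - k\mathbf{w})$ where $\boldsymbol{\eta} = O_P(1)$ and $\|\boldsymbol{\eta}\|_2 \le q\sqrt{2 \ln k}$, then combining (41) and (42) yields

\[
\theta_y(\mathbf{z}) - (\Theta\boldsymbol{\alpha})_y = \frac{1}{\sqrt{k}} \nabla g_y(\mathbf{w}) \boldsymbol{\eta} + O\left(\frac{1}{k} \|\boldsymbol{\eta}\|_2^2\right) + O\left(\frac{1}{k}\right). \tag{45}
\]

Now we use (45) and Lemma 1 with $\mathbf{a} = \boldsymbol{\theta}(\mathbf{z})$, $\mathbf{b} = \Theta\boldsymbol{\alpha}$, and

\[
\varepsilon \sim \frac{1}{\sqrt{k}} \max_y \sup_{\boldsymbol{\eta}: \|\boldsymbol{\eta}\| \le q\sqrt{2 \ln k}} \left| \frac{\nabla g_y(\mathbf{w}) \boldsymbol{\eta}}{g_y(\mathbf{w})} \right| = \sqrt{\frac{2 \ln k}{k}} \max_y \|\nabla \ln g_y(\mathbf{w})\|_2 = O\left(\sqrt{\frac{\ln k}{k}}\right).
\]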

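We obtain

\[
\begin{aligned}
D_q(\boldsymbol{\theta}(\mathbf{z}) \,\|\, \Theta\boldsymbol{\alpha}) &= \frac{1}{2 \ln q} \sum_y \frac{\left[\frac{1}{\sqrt{k}} \nabla g_y(\mathbf{w}) \boldsymbol{\eta} + O\left(\frac{1}{k} \|\boldsymbol{\eta}\|_2^2\right) + O\left(\frac{1}{k}\right)\right]^2}{g_y(\mathbf{w})} + O(\varepsilon^3) \\
&= \frac{1}{2k \ln q} \sum_y \frac{[\nabla g_y(\mathbf{w}) \boldsymbol{\eta}]^2}{g_y(\mathbf{w})} + O\left(\left(\frac{\ln k}{k}\right)^{3/2}\right)
\end{aligned}
\tag{46}
\]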

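hence

\[
\begin{aligned}
I^{\mathrm{joint}}_{k,q}(\mathbf{w},\Theta) &= \frac{1}{k} \sum_{\mathbf{z}: \|\mathbf{z} - k\mathbf{w}\|_2 \le q\sqrt{2k \ln k}} \alpha_{\mathbf{z}}(\mathbf{w}) D_q(\boldsymbol{\theta}(\mathbf{z}) \,\|\, \Theta\boldsymbol{\alpha}) + O\left(\frac{1}{k^4}\right) \\
&\overset{(a)}{\sim} \frac{1}{k} \sum_{\boldsymbol{\eta}: \|\boldsymbol{\eta}\|_2 \le q\sqrt{2 \ln k}} \alpha_{\mathbf{z}}(\mathbf{w}) \cdot \frac{1}{2k \ln q} \sum_y \frac{[\nabla g_y(\mathbf{w}) \boldsymbol{\eta}]^2}{g_y(\mathbf{w})} \\
&= \frac{1}{2k^2 \ln q} \sum_y \frac{1}{g_y(\mathbf{w})} \sum_{\mathbf{z}: \|\mathbf{z} - k\mathbf{w}\|_2 \le q\sqrt{2k \ln k}} \alpha_{\mathbf{z}}(\mathbf{w}) [\nabla g_y(\mathbf{w}) \boldsymbol{\eta}]^2 \\
&\overset{(b)}{\sim} \frac{1}{2k^3 \ln q} \sum_y \frac{1}{g_y(\mathbf{w})} \sum_{\mathbf{z}} \alpha_{\mathbf{z}}(\mathbf{w}) [\nabla g_y(\mathbf{w})(\mathbf{z} - k\mathbf{w})]^2 \\
&= \frac{1}{2k^3 \ln q} \sum_y \frac{\nabla g_y(\mathbf{w}) \, \mathrm{Cov}(\mathbf{Z} \mid \mathbf{W} = \mathbf{w}) \, \nabla' g_y(\mathbf{w})}{g_y(\mathbf{w})} \\
&\overset{(c)}{=} \frac{1}{2k^2 \ln q} \sum_y \frac{\nabla g_y(\mathbf{w}) (\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}') \nabla' g_y(\mathbf{w})}{g_y(\mathbf{w})}
\end{aligned}
\tag{47}
\]

where (a) follows from (46), (b) from Lemma 2, and (c) from (32).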

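The derivation of $I^{\mathrm{simple}}_{k,q}(\mathbf{w},\Theta)$ is simpler. Combining (43) and (42) yields

\[
(\Theta\boldsymbol{\alpha}^x)_y - (\Theta\boldsymbol{\alpha})_y = \frac{1}{k} \nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w}) + o\left(\frac{1}{k}\right). \tag{48}
\]

Now we use (48) and Lemma 1 with $\mathbf{a} = \Theta\boldsymbol{\alpha}^x$, $\mathbf{b} = \Theta\boldsymbol{\alpha}$, and

\[
\varepsilon \sim \frac{1}{k} \max_y \left| \frac{\nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w})}{g_y(\mathbf{w})} \right| = O\left(\frac{1}{k}\right). \tag{49}
\]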

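We obtain

\[
\begin{aligned}
D_q(\Theta\boldsymbol{\alpha}^x \,\|\, \Theta\boldsymbol{\alpha}) &= \frac{1}{2 \ln q} \sum_y \frac{[(\Theta\boldsymbol{\alpha}^x)_y - (\Theta\boldsymbol{\alpha})_y]^2}{(\Theta\boldsymbol{\alpha}^x)_y} + O(\varepsilon^3) \\
&= \frac{1}{2k^2 \ln q} \sum_y \frac{[\nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w})]^2}{g_y(\mathbf{w})} + O\left(\frac{1}{k^3}\right)
\end{aligned}
\tag{50}
\]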

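hence

\[
\begin{aligned}
I^{\mathrm{simple}}_{k,q}(\mathbf{w},\Theta) &= \sum_x w_x D_q(\Theta\boldsymbol{\alpha}^x \,\|\, \Theta\boldsymbol{\alpha}) \\
&\overset{(a)}{\sim} \frac{1}{2k^2 \ln q} \sum_x w_x \sum_y \frac{[\nabla g_y(\mathbf{w})(\mathbf{e}_x - \mathbf{w})]^2}{g_y(\mathbf{w})} \\
&= \frac{1}{2k^2 \ln q} \sum_y \frac{1}{g_y(\mathbf{w})} \cdot \nabla g_y(\mathbf{w}) \left[\sum_x w_x (\mathbf{e}_x - \mathbf{w})(\mathbf{e}_x - \mathbf{w})'\right] \nabla' g_y(\mathbf{w}) \\
&= \frac{1}{2k^2 \ln q} \sum_y \frac{\nabla g_y(\mathbf{w}) (\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}') \nabla' g_y(\mathbf{w})}{g_y(\mathbf{w})}
\end{aligned}
\tag{51}
\]

as $k \to \infty$, where (a) follows from (50).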
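The last step of (51) relies on the identity $\sum_x w_x (\mathbf{e}_x - \mathbf{w})(\mathbf{e}_x - \mathbf{w})' = \mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}'$. The sketch below checks it numerically and then evaluates the common limit $\sum_y \nabla g_y (\mathrm{diag}(\mathbf{w}) - \mathbf{w}\mathbf{w}') \nabla' g_y / g_y$ for the map $g_y(\mathbf{w}) = w_y$. We assume here that this map represents the interleaving attack without erasures, in which case the value should be $q - 1$ for every $\mathbf{w}$; function names and the test vector are illustrative.

```python
def outer_mixture(w):
    """sum_x w_x (e_x - w)(e_x - w)' as a nested list."""
    q = len(w)
    m = [[0.0] * q for _ in range(q)]
    for x in range(q):
        d = [(1.0 if i == x else 0.0) - w[i] for i in range(q)]
        for i in range(q):
            for j in range(q):
                m[i][j] += w[x] * d[i] * d[j]
    return m

def diag_minus_outer(w):
    """diag(w) - w w' as a nested list."""
    q = len(w)
    return [[(w[i] if i == j else 0.0) - w[i] * w[j] for j in range(q)]
            for i in range(q)]

def payoff(w, g, grad_g):
    """sum_y grad_g[y] (diag(w) - w w') grad_g[y]' / g[y].

    Outputs with g[y] == 0 are skipped to avoid dividing by zero.
    """
    c = diag_minus_outer(w)
    q = len(w)
    total = 0.0
    for gy, row in zip(g, grad_g):
        if gy > 0:
            quad = sum(row[i] * c[i][j] * row[j]
                       for i in range(q) for j in range(q))
            total += quad / gy
    return total

w = [0.2, 0.3, 0.5]                       # illustrative bias vector
lhs, rhs = outer_mixture(w), diag_minus_outer(w)

# g_y(w) = w_y has the identity matrix as its Jacobian.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
value = payoff(w, w, identity)
print(value)
```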

APPENDIX B
CHANGE OF VARIABLES IN SEC. IV-C

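Let $v_x = \sqrt{w_x}$ and $h_y = \sqrt{g_y}$ for all $x \in \mathcal{Q}$ and $y \in \mathcal{Y}$ and we have

\[
\frac{\partial g_y}{\partial w_x} = \frac{\partial g_y}{\partial h_y} \cdot \frac{\partial h_y}{\partial v_x} \cdot \frac{\partial v_x}{\partial w_x} = \frac{h_y}{v_x} \cdot \frac{\partial h_y}{\partial v_x}, \quad x \in \mathcal{Q}, \; y \in \mathcal{Y}.
\]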

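Therefore

\[
\begin{aligned}
T[\mathbf{w},\mathbf{g},\nabla\mathbf{g}] &= \sum_{y \in \mathcal{Y}} \frac{1}{g_y(\mathbf{w})} \left[ \sum_{x \in \mathcal{Q}} w_x \left(\frac{\partial g_y}{\partial w_x}\right)^2 - \left(\sum_{x \in \mathcal{Q}} w_x \frac{\partial g_y}{\partial w_x}\right)^2 \right] \\
&= \sum_{y \in \mathcal{Y}} \frac{1}{h_y^2(\mathbf{v})} \left[ \sum_{x \in \mathcal{Q}} v_x^2 \left(\frac{h_y}{v_x} \cdot \frac{\partial h_y}{\partial v_x}\right)^2 - \left(\sum_{x \in \mathcal{Q}} v_x^2 \cdot \frac{h_y}{v_x} \cdot \frac{\partial h_y}{\partial v_x}\right)^2 \right] \\
&= \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{Q}} \left(\frac{\partial h_y}{\partial v_x}\right)^2 - \sum_{y \in \mathcal{Y}} \left(\sum_{x \in \mathcal{Q}} v_x \frac{\partial h_y}{\partial v_x}\right)^2 \\
&= \|\nabla\mathbf{h}(\mathbf{v})\|_F^2 - \|(\nabla\mathbf{h}(\mathbf{v}))\mathbf{v}\|_2^2 \triangleq T[\mathbf{v},\mathbf{h},\nabla\mathbf{h}]
\end{aligned}
\tag{52}
\]

where $\mathbf{v} \triangleq [v_0, \ldots, v_{q-1}]' \in \mathbb{R}^q$, $\mathbf{h} \triangleq [h_0, \ldots, h_{q-1}, h_q = h_e]' \in \mathbb{R}^{q+1}$, $\nabla\mathbf{h}$ is the $(q+1) \times q$ Jacobian matrix of $\mathbf{h}$ versus $\mathbf{v}$, $\|\cdot\|_F$ is the Frobenius norm, and $\|\cdot\|_2$ is the Euclidean norm. The constraints $\mathbf{w} \in \mathcal{W}$ and $\mathbf{g} \in \mathcal{G}$ translate to $\mathbf{v} \in \mathcal{V}$ and $\mathbf{h} \in \mathcal{H}$ where $\mathcal{V}$ and $\mathcal{H}$ are respectively the orthant of the $(q-1)$-dimensional hypersphere $\mathcal{V} \triangleq \{\mathbf{v} \in \mathbb{R}^q : \|\mathbf{v}\|_2 = 1, \mathbf{v} \ge 0\}$ and the orthant of the $q$-dimensional hypersphere $\mathcal{H} \triangleq \{\mathbf{h} \in \mathbb{R}^{q+1} : \|\mathbf{h}\|_2 = 1, \mathbf{h} \ge 0\}$.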

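We then transform $(\mathbf{v},\mathbf{h})$ to hyperspherical coordinates $(\boldsymbol{\phi},\boldsymbol{\gamma})$. Let $\mathbf{v}_i^j$, $0 \le i \le j \le q-1$, be the vector $[v_i, \cdots, v_j]' \in \mathbb{R}^{j-i+1}$, and let $\boldsymbol{\phi} \triangleq [r = \phi_0, \phi_1, \cdots, \phi_{q-1}]' \in \mathbb{R}^q$ be defined as

\begin{align}
r &= \|\mathbf{v}\|_2 \tag{53a} \\
\phi_i &= \cot^{-1} \frac{v_{i-1}}{\|\mathbf{v}_i^{q-1}\|_2}, \quad 1 \le i \le q-1 \tag{53b}
\end{align}

which has the inverse transformation

\begin{align}
v_0 &= r \cos\phi_1 \tag{54a} \\
v_1 &= r \sin\phi_1 \cos\phi_2 \tag{54b} \\
&\;\;\vdots \notag \\
v_{q-2} &= r \sin\phi_1 \cdots \sin\phi_{q-2} \cos\phi_{q-1} \tag{54c} \\
v_{q-1} &= r \sin\phi_1 \cdots \sin\phi_{q-2} \sin\phi_{q-1}, \tag{54d}
\end{align}

and $\boldsymbol{\gamma} \triangleq [R = \gamma_0, \gamma_1, \cdots, \gamma_{q-1}, \gamma_q]' \in \mathbb{R}^{q+1}$ be defined as

\begin{align}
R &= \|\mathbf{h}_0^{q-1}\|_2 \tag{55a} \\
\gamma_j &= \cot^{-1} \frac{h_{j-1}}{\|\mathbf{h}_j^{q-1}\|_2}, \quad 1 \le j \le q-1 \tag{55b} \\
\gamma_q &= h_e \tag{55c}
\end{align}

which has the inverse transformation

\begin{align}
h_0 &= R \cos\gamma_1 \tag{56a} \\
h_1 &= R \sin\gamma_1 \cos\gamma_2 \tag{56b} \\
&\;\;\vdots \notag \\
h_{q-2} &= R \sin\gamma_1 \cdots \sin\gamma_{q-2} \cos\gamma_{q-1} \tag{56c} \\
h_{q-1} &= R \sin\gamma_1 \cdots \sin\gamma_{q-2} \sin\gamma_{q-1} \tag{56d} \\
h_e &= \gamma_q. \tag{56e}
\end{align}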
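The transformation (53) and its inverse (54) can be sketched as a round trip on $\mathbf{v} = \sqrt{\mathbf{w}}$. The code below is an illustration only: the function names are ours, and the inverse cotangent is evaluated with `atan2`, which agrees with $\cot^{-1}$ on the nonnegative orthant.

```python
from math import atan2, cos, sin, sqrt

def to_hyperspherical(v):
    """Return [r, phi_1, ..., phi_{q-1}] following (53)."""
    q = len(v)
    r = sqrt(sum(x * x for x in v))
    phis = []
    for i in range(1, q):
        tail = sqrt(sum(x * x for x in v[i:]))   # ||v_i^{q-1}||_2
        phis.append(atan2(tail, v[i - 1]))       # arccot(v_{i-1} / tail)
    return [r] + phis

def from_hyperspherical(phi):
    """Invert via (54): running product of sines times one cosine."""
    r, angles = phi[0], phi[1:]
    v, scale = [], r
    for a in angles:
        v.append(scale * cos(a))
        scale *= sin(a)
    v.append(scale)                              # last coordinate, (54d)
    return v

w = [0.2, 0.3, 0.5]                              # illustrative point on the simplex
v = [sqrt(x) for x in w]                         # lies on the unit sphere
phi = to_hyperspherical(v)
v_back = from_hyperspherical(phi)
print(phi, v_back)
```

The same two functions apply to the first $q$ coordinates of $\mathbf{h}$ in (55) and (56), with $R$ in place of $r$.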

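Now since $\nabla\mathbf{h} = J^{\mathbf{h}}_{\boldsymbol{\gamma}} \nabla\boldsymbol{\gamma} J^{\boldsymbol{\phi}}_{\mathbf{v}}$ where $J^{\mathbf{h}}_{\boldsymbol{\gamma}}$ is the $(q+1) \times (q+1)$ Jacobian matrix of $\mathbf{h}$ versus $\boldsymbol{\gamma}$, $J^{\boldsymbol{\phi}}_{\mathbf{v}}$ is the $q \times q$ Jacobian matrix of $\boldsymbol{\phi}$ versus $\mathbf{v}$, and $\nabla\boldsymbol{\gamma}$ is the $(q+1) \times q$ Jacobian matrix of $\boldsymbol{\gamma}$ versus $\boldsymbol{\phi}$, we have

\[
\begin{aligned}
T[\mathbf{v},\mathbf{h},\nabla\mathbf{h}] &= \|\nabla\mathbf{h}\|_F^2 - \|(\nabla\mathbf{h})\mathbf{v}\|_2^2 \\
&= \mathrm{tr}(\nabla\mathbf{h}' \nabla\mathbf{h}) - \mathbf{v}'(\nabla\mathbf{h})'(\nabla\mathbf{h})\mathbf{v} \\
&= \mathrm{tr}\left[(J^{\boldsymbol{\phi}}_{\mathbf{v}})'(\nabla\boldsymbol{\gamma})'(J^{\mathbf{h}}_{\boldsymbol{\gamma}})' J^{\mathbf{h}}_{\boldsymbol{\gamma}} \nabla\boldsymbol{\gamma} J^{\boldsymbol{\phi}}_{\mathbf{v}}\right] - \mathbf{v}'(J^{\boldsymbol{\phi}}_{\mathbf{v}})'(\nabla\boldsymbol{\gamma})'(J^{\mathbf{h}}_{\boldsymbol{\gamma}})' J^{\mathbf{h}}_{\boldsymbol{\gamma}} \nabla\boldsymbol{\gamma} J^{\boldsymbol{\phi}}_{\mathbf{v}} \mathbf{v} \\
&= \mathrm{tr}\left[(\nabla\boldsymbol{\gamma})'(J^{\mathbf{h}}_{\boldsymbol{\gamma}})' J^{\mathbf{h}}_{\boldsymbol{\gamma}} \nabla\boldsymbol{\gamma} J^{\boldsymbol{\phi}}_{\mathbf{v}} (J^{\boldsymbol{\phi}}_{\mathbf{v}})'\right] - \mathrm{tr}\left[(\nabla\boldsymbol{\gamma})'(J^{\mathbf{h}}_{\boldsymbol{\gamma}})' J^{\mathbf{h}}_{\boldsymbol{\gamma}} \nabla\boldsymbol{\gamma} J^{\boldsymbol{\phi}}_{\mathbf{v}} \mathbf{v}\mathbf{v}' (J^{\boldsymbol{\phi}}_{\mathbf{v}})'\right] \\
&= \mathrm{tr}\left\{(\nabla\boldsymbol{\gamma})'(J^{\mathbf{h}}_{\boldsymbol{\gamma}})' J^{\mathbf{h}}_{\boldsymbol{\gamma}} \nabla\boldsymbol{\gamma} \left[J^{\boldsymbol{\phi}}_{\mathbf{v}} (J^{\boldsymbol{\phi}}_{\mathbf{v}})' - J^{\boldsymbol{\phi}}_{\mathbf{v}} \mathbf{v}\mathbf{v}' (J^{\boldsymbol{\phi}}_{\mathbf{v}})'\right]\right\} \triangleq T[\boldsymbol{\phi},\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}]
\end{aligned}
\]

where $\mathrm{tr}(A)$ denotes the trace of square matrix $A$. Finally we apply the following lemma: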

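Lemma 3: [21, Lemma A.5] By the definition of (53), (54), (55), and (56) we have

\begin{align}
(J^{\mathbf{h}}_{\boldsymbol{\gamma}})' J^{\mathbf{h}}_{\boldsymbol{\gamma}} &= \mathrm{diag}\left(1, R^2, R^2 \sin^2\gamma_1, R^2 \sin^2\gamma_1 \sin^2\gamma_2, \cdots, R^2 \prod_{l=1}^{q-2} \sin^2\gamma_l, 1\right) \tag{57a} \\
J^{\boldsymbol{\phi}}_{\mathbf{v}} (J^{\boldsymbol{\phi}}_{\mathbf{v}})' &= \mathrm{diag}\left(1, r^{-2}, r^{-2} \sin^{-2}\phi_1, r^{-2} \sin^{-2}\phi_1 \sin^{-2}\phi_2, \cdots, r^{-2} \prod_{l=1}^{q-2} \sin^{-2}\phi_l\right) \tag{57b} \\
J^{\boldsymbol{\phi}}_{\mathbf{v}} \mathbf{v}\mathbf{v}' (J^{\boldsymbol{\phi}}_{\mathbf{v}})' &= \mathrm{diag}(r^2, 0, \cdots, 0). \tag{57c}
\end{align}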

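By incorporating the constraint $r = \|\mathbf{v}\|_2 \equiv 1$ we have

\begin{align}
T[\boldsymbol{\phi},\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] &= \mathrm{tr}\left[(\nabla\boldsymbol{\gamma})' \boldsymbol{\Gamma}^2 \nabla\boldsymbol{\gamma} \boldsymbol{\Phi}^{-2}\right] \notag \\
&= \left\|\boldsymbol{\Gamma} \nabla\boldsymbol{\gamma} \boldsymbol{\Phi}^{-1}\right\|_F^2 \tag{58a} \\
&= \sum_{\substack{1 \le i \le q-1 \\ 0 \le j \le q}} \left(\frac{\partial\gamma_j}{\partial\phi_i} \cdot \frac{\bar{\gamma}_j}{\bar{\phi}_i}\right)^2 \tag{58b}
\end{align}

where

\begin{align}
\boldsymbol{\Gamma} &\triangleq \mathrm{diag}(\bar{\gamma}_0, \cdots, \bar{\gamma}_q) \triangleq \mathrm{diag}\left(1, R, R\sin\gamma_1, R\sin\gamma_1\sin\gamma_2, \cdots, R\prod_{j=1}^{q-2}\sin\gamma_j, 1\right) \tag{59} \\
\boldsymbol{\Phi}^{-1} &\triangleq \mathrm{diag}(0, \bar{\phi}_1^{-1}, \cdots, \bar{\phi}_{q-1}^{-1}) \triangleq \mathrm{diag}\left(0, 1, \sin^{-1}\phi_1, \sin^{-1}\phi_1 \sin^{-1}\phi_2, \cdots, \prod_{i=1}^{q-2}\sin^{-1}\phi_i\right). \tag{60}
\end{align}

Here the bars distinguish the diagonal entries $\bar{\gamma}_j$ and $\bar{\phi}_i$ of $\boldsymbol{\Gamma}$ and $\boldsymbol{\Phi}$ from the coordinates $\gamma_j$ and $\phi_i$. The constraints $\mathbf{v} \in \mathcal{V}$ and $\mathbf{h} \in \mathcal{H}$ translate to $\boldsymbol{\phi} \in \Phi$ and $\boldsymbol{\gamma} \in \Gamma$ where $\Phi$ and $\Gamma$ are respectively the $(q-1)$-dimensional space $\Phi \triangleq \{\boldsymbol{\phi} \in \mathbb{R}^q : r = 1, 0 \le \phi_i \le \pi/2, 1 \le i \le q-1\}$ and the $q$-dimensional space $\Gamma \triangleq \{\boldsymbol{\gamma} \in \mathbb{R}^{q+1} : R^2 + \gamma_q^2 = 1, 0 \le \gamma_j \le \pi/2, 1 \le j \le q-1\}$.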

APPENDIX C
PROOFS OF THEOREMS 3-4 AND COROLLARIES 1-2

We use variational calculus to solve the minimization problem where a set of Euler-Lagrange differential equations, Lagrange multipliers, and boundary conditions are specified as necessary and sufficient conditions. Saddle-point solutions can be obtained analytically for various cases. For details of the methods of variational calculus, see [22], [23].

A. Binary Alphabet with the Marking Assumption

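We start with the simplest case where $q = 2$, $r_e = 0$, and $r_u = 0$. Instead of solving the game (16) as in [10, Sec. V-A] we solve the much simpler transformed game (28). We fix the uniform distribution $dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \propto d\boldsymbol{\phi}$ which is equivalent to $dP^*_{\mathbf{W}}(w_0, w_1) \propto (w_0 w_1)^{-1/2} \, d\mathbf{w}$. The constraints of the marking assumption translate to $R \equiv 1$ and $\gamma_2 \equiv 0$ (since $R^2 + \gamma_2^2 = 1$) and therefore we have the payoff function

\[
T[\phi_1,\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] = \left(\frac{\partial R}{\partial\phi_1}\right)^2 + R^2 \left(\frac{\partial\gamma_1}{\partial\phi_1}\right)^2 + \left(\frac{\partial\gamma_2}{\partial\phi_1}\right)^2 = \left(\frac{\partial\gamma_1}{\partial\phi_1}\right)^2.
\]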

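The boundary conditions $g_0(w_0 = 0) = 0$ and $g_1(w_1 = 0) = 0$ translate to $\gamma_1(\phi_1 = 0) = 0$ and $\gamma_1(\phi_1 = \pi/2) = \pi/2$. Thus

\[
\int_{\Phi} T[\phi_1,\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \propto \int_0^{\pi/2} \left(\frac{d\gamma_1}{d\phi_1}\right)^2 d\phi_1 \overset{(a)}{\ge} \frac{\left(\int_0^{\pi/2} \frac{d\gamma_1}{d\phi_1} \, d\phi_1\right)^2}{\int_0^{\pi/2} 1 \cdot d\phi_1} \overset{(b)}{=} \frac{\pi}{2}
\]

where (a) follows from the Cauchy-Schwarz inequality and (b) follows from the boundary conditions. Equality holds in (a) when $\frac{d\gamma^*_1}{d\phi_1} = 1$ or $\gamma^*_1 = \phi_1$, which translates to the interleaving attack of (1). The maximin value of the game is $A(\mathcal{P}_{\mathrm{RDM}}) = T|_{\boldsymbol{\gamma}^*} \equiv \left(\frac{d\gamma^*_1}{d\phi_1}\right)^2 = 1$. By Theorem 2 we establish Corollary 1.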
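The Cauchy-Schwarz bound above can be checked numerically. The following sketch approximates $\int_0^{\pi/2} (d\gamma_1/d\phi_1)^2 \, d\phi_1$ for the linear strategy $\gamma_1 = \phi_1$ and for perturbed strategies that satisfy the same boundary conditions; the perturbation family $\phi + a \sin 2\phi$ and the grid size are our own illustrative choices.

```python
from math import pi, sin

def energy(gamma, n=2000):
    """Approximate the integral of (d gamma / d phi)**2 over [0, pi/2]
    using finite-difference slopes on a uniform grid with n cells."""
    h = (pi / 2) / n
    total = 0.0
    for i in range(n):
        slope = (gamma((i + 1) * h) - gamma(i * h)) / h
        total += slope ** 2 * h
    return total

def interleaving(phi):
    return phi                       # gamma_1* = phi_1

def perturbed(amplitude):
    # The added bump vanishes at both endpoints, so gamma_1(0) = 0 and
    # gamma_1(pi/2) = pi/2 still hold.
    return lambda phi: phi + amplitude * sin(2 * phi)

base = energy(interleaving)
worse = [energy(perturbed(a)) for a in (0.05, 0.1, 0.2)]
print(base, worse)                   # base should be close to pi / 2
```

For this family the integral equals $\pi/2 + \pi a^2$, so every nonzero amplitude strictly increases the payoff, in line with the equality condition of (a).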

B. Binary Alphabet with the Combined Digit Model

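We first examine the setup $0 \le r_{e,1} = r_{e,2} \le 1$ and $0 \le r_{u,1} < (1 - r_{e,1})/2$. Again we fix the uniform distribution $dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \propto d\boldsymbol{\phi}$. The payoff function can be written as

\[
T[\phi_1,\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] = \left(\frac{\partial R}{\partial\phi_1}\right)^2 + R^2 \left(\frac{\partial\gamma_1}{\partial\phi_1}\right)^2 + \left(\frac{\partial\gamma_2}{\partial\phi_1}\right)^2 = (J^0_1)^2 + R^2 (J^1_1)^2 + (J^2_1)^2
\]

where we use the notation $J^j_i \triangleq \frac{\partial\gamma_j}{\partial\phi_i}$ interchangeably. The constraints of (2) can be simplified as $g_e \le r_{e,1}$, $g_0(w_0 = 0) \le r_{u,1}$ and $g_1(w_1 = 0) \le r_{u,1}$, which translate to $\gamma_2 \le \sqrt{r_{e,1}}$, $\gamma_1(\phi_1 = 0) \le \gamma_{\max}$ and $\gamma_1(\phi_1 = \pi/2) \ge \pi/2 - \gamma_{\max}$ where $\gamma_{\max} \triangleq \sin^{-1}\sqrt{\frac{r_{u,1}}{1 - r_{e,1}}}$. Therefore we are interested in the following minimization problem: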

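\begin{align}
\text{Minimize} \quad & \int_0^{\pi/2} T[\phi_1,\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] \, d\phi_1 \tag{61a} \\
\text{subject to} \quad & \text{Global constraints:} \notag \\
& G_1[\boldsymbol{\gamma}] \triangleq R^2 + \gamma_2^2 - 1 = 0 \tag{61b} \\
& G_2[\boldsymbol{\gamma}] \triangleq \gamma_2 - \sqrt{r_{e,1}} \le 0 \tag{61c} \\
& \text{Boundary conditions:} \notag \\
& \gamma_1(\phi_1 = 0) \le \gamma_{\max} \tag{61d} \\
& \gamma_1(\phi_1 = \pi/2) \ge \pi/2 - \gamma_{\max}. \tag{61e}
\end{align}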

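The Cauchy-Schwarz inequality cannot help us here so we turn to variational calculus:

(i) Euler-Lagrange Equations: By [22, Sec. 12], if a functional $\boldsymbol{\gamma}(\phi_1)$ minimizes (61a), then there exist $\lambda_1(\phi_1)$ and $\lambda_2(\phi_1)$ satisfying

\[
\frac{\partial T}{\partial\gamma_j} + \lambda_1 \frac{\partial G_1}{\partial\gamma_j} + \lambda_2 \frac{\partial G_2}{\partial\gamma_j} = \frac{\partial}{\partial\phi_1}\left(\frac{\partial T}{\partial J^j_1}\right), \quad 0 \le j \le 2. \tag{62}
\]

We verify that the functional $\boldsymbol{\gamma}(\phi_1)$ defined by $R \equiv \sqrt{1 - r_{e,1}}$, $\gamma_1 = \left(1 - \frac{4}{\pi}\gamma_{\max}\right)\phi_1 + \gamma_{\max}$, and $\gamma_2 \equiv \sqrt{r_{e,1}}$ satisfies (62) with $\lambda_1 = -\left(1 - \frac{4}{\pi}\gamma_{\max}\right)^2$ and $\lambda_2 = -\lambda_1\sqrt{r_{e,1}}$:¹⁰

\begin{align*}
j = 0: \quad & \mathrm{LHS} = 2R(J^1_1)^2 + 2\lambda_1 R = 0 = \frac{\partial}{\partial\phi_1}(2J^0_1) = \mathrm{RHS} \\
j = 1: \quad & \mathrm{LHS} = 0 = \frac{\partial}{\partial\phi_1}(2R^2 J^1_1) = \mathrm{RHS} \\
j = 2: \quad & \mathrm{LHS} = 2\lambda_1\gamma_2 + 2\lambda_2 = 0 = \frac{\partial}{\partial\phi_1}(2J^2_1) = \mathrm{RHS}
\end{align*}

(ii) Global Constraints: The inequality constraint $G_2 \le 0$ is active for the solution so we need $\lambda_2 \ge 0$, which is true since $\lambda_1 < 0$.

(iii) Boundary Conditions: Both boundary conditions are active for the solution so we also need $\left.\frac{\partial T}{\partial J^1_1}\right|_{\phi_1 = 0} \ge 0$ and $\left.\frac{\partial T}{\partial J^1_1}\right|_{\phi_1 = \pi/2} \ge 0$, which can be verified by $\frac{\partial T}{\partial J^1_1} = 2R^2 J^1_1 > 0$ ($\because \gamma_{\max} < \pi/4$).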

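The minimizer $\boldsymbol{\gamma}(\phi_1)$ translates to (19) and the maximin value of the game is $A(\mathcal{P}_{\mathrm{MDM}}) = T|_{\boldsymbol{\gamma}} \equiv R^2 (J^1_1)^2 = (1 - r_{e,1})\left(1 - \frac{4}{\pi}\gamma_{\max}\right)^2$. Therefore by Theorem 2 we establish the first part of Theorem 3.¹¹

If $0 \le r_{e,1} = r_{e,2} \le 1$ but $r_{u,1} \ge (1 - r_{e,1})/2$, we can let $\boldsymbol{\gamma}$ be defined as $R \equiv \sqrt{1 - r_{e,1}}$, $\gamma_1 \equiv \pi/4$, and $\gamma_2 = \sqrt{r_{e,1}}$, which yields $\boldsymbol{\theta}(\mathbf{z}) \equiv [(1 - r_{e,1})/2, (1 - r_{e,1})/2, r_{e,1}]'$ for all $\mathbf{z}$ and $T \equiv 0$, i.e., this is the case where the coalition can generate the same distribution of forgery symbols regardless of what they receive, hence the capacity is zero.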
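To get a feel for how the binary MDM constant behaves, the sketch below evaluates it for a few parameter pairs. We read the constant as $(1 - r_{e,1})(1 - \frac{4}{\pi}\gamma_{\max})^2$ with $\gamma_{\max} = \sin^{-1}\sqrt{r_{u,1}/(1 - r_{e,1})}$, which is what $T = R^2 (J^1_1)^2$ gives for the minimizer stated above, and we set it to zero once $r_{u,1} \ge (1 - r_{e,1})/2$. The function name and the parameter values are hypothetical.

```python
from math import asin, pi, sqrt

def binary_mdm_constant(r_e, r_u):
    """A = (1 - r_e) (1 - (4 / pi) gamma_max)**2 for the binary setup,
    with A = 0 once r_u >= (1 - r_e) / 2 (the zero-capacity case)."""
    if r_e >= 1.0 or r_u >= (1.0 - r_e) / 2:
        return 0.0
    gamma_max = asin(sqrt(r_u / (1.0 - r_e)))
    return (1.0 - r_e) * (1.0 - 4.0 * gamma_max / pi) ** 2

# Illustrative (r_e, r_u) pairs; (0, 0) is the marking assumption.
for r_e, r_u in [(0.0, 0.0), (0.05, 0.0), (0.0, 0.01), (0.05, 0.01), (0.0, 0.5)]:
    print(r_e, r_u, binary_mdm_constant(r_e, r_u))
```

With $r_{e,1} = r_{u,1} = 0$ the function returns 1, the value of Corollary 1, and it decreases continuously to 0 as $r_{u,1}$ approaches $(1 - r_{e,1})/2$.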

One may try to solve the game for the general setup where $0 \le r_{e,1} < r_{e,2} \le 1$. However, this results in a non-compact collusion class $\mathcal{P}_c$, and we cannot find a solution that satisfies (i)-(iii).

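¹⁰ LHS stands for left-hand side and RHS stands for right-hand side.

¹¹ We note that by [23, Chap. 2] we indeed need all of (i)-(iii) to verify the minimizer $\boldsymbol{\gamma}(\boldsymbol{\phi})$. As a counterexample, the functional $\boldsymbol{\gamma}(\phi_1)$ defined by $R \equiv \sqrt{1 - r_{e,1}}$, $\gamma_1 = \phi_1$, and $\gamma_2 \equiv \sqrt{r_{e,1}}$ satisfies (a) the Euler-Lagrange equations with $\lambda_1 = -1$ and $\lambda_2 = \sqrt{r_{e,1}}$ and (b) $\lambda_2 > 0$, but we also need (c) $\left.\frac{\partial T}{\partial J^1_1}\right|_{\phi_1 = 0} = 0$ for the inactive boundary condition $\gamma_1(\phi_1 = 0) < \gamma_{\max}$, which is not true.

C. General Alphabet with the Combined Digit Model

Finally we examine the general alphabet case where $q \ge 2$. We start with the case where $0 \le r_{e,1} = \cdots = r_{e,q} \le 1$ and $r_u = 0$. We fix the distribution

\[
dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \propto \prod_{i=1}^{q-1} \bar{\phi}_i \, d\boldsymbol{\phi} = \prod_{i=1}^{q-1} (\sin\phi_i)^{q-i-1} \, d\boldsymbol{\phi} \tag{63}
\]

which is equivalent to $dP^*_{\mathbf{W}}(\mathbf{w}) \propto \prod_x w_x^{-1/2} \, d\mathbf{w}$. Here and below $\bar{\gamma}_j$ and $\bar{\phi}_i$ denote the diagonal entries of $\boldsymbol{\Gamma}$ and $\boldsymbol{\Phi}$ in (59) and (60), as opposed to the coordinates $\gamma_j$ and $\phi_i$. The payoff function can be written as

\[
T[\boldsymbol{\phi},\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] = \sum_{\substack{1 \le i \le q-1 \\ 0 \le j \le q}} \left(\frac{\bar{\gamma}_j}{\bar{\phi}_i} \cdot J^j_i\right)^2.
\]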

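The constraints of (2) can be simplified as $g_e \le r_{e,1}$ and $g_i(w_i = 0) = 0$ for $0 \le i \le q-1$, which translate to $\gamma_q \le \sqrt{r_{e,1}}$ and $\gamma_i(\phi_i = 0) = 0$, $\gamma_i(\phi_i = \pi/2) = \pi/2$ for $1 \le i \le q-1$. Therefore we are interested in the following minimization problem:

\begin{align}
\text{Minimize} \quad & \int_0^{\pi/2} T[\boldsymbol{\phi},\boldsymbol{\gamma},\nabla\boldsymbol{\gamma}] \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \tag{64a} \\
\text{subject to} \quad & \text{Global constraints:} \notag \\
& G_1[\boldsymbol{\gamma}] \triangleq R^2 + \gamma_q^2 - 1 = 0 \tag{64b} \\
& G_2[\boldsymbol{\gamma}] \triangleq \gamma_q - \sqrt{r_{e,1}} \le 0 \tag{64c} \\
& \text{Boundary conditions:} \notag \\
& \gamma_i(\phi_i = 0) = 0, \quad 1 \le i \le q-1 \tag{64d} \\
& \gamma_i(\phi_i = \pi/2) = \pi/2, \quad 1 \le i \le q-1. \tag{64e}
\end{align}

The minimizer $\boldsymbol{\gamma}(\boldsymbol{\phi})$ is given by $R \equiv \sqrt{1 - r_{e,1}}$, $\gamma_q \equiv \sqrt{r_{e,1}}$, and $\gamma_i = \phi_i$ for $1 \le i \le q-1$. Again we verify this solution using variational calculus: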

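(i) Euler-Lagrange Equations: By [22, Sec. 12] we verify that the minimizer $\boldsymbol{\gamma}(\boldsymbol{\phi})$ satisfies

\[
\frac{\partial}{\partial\gamma_j}\left(T \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right) + \lambda_1 \frac{\partial G_1}{\partial\gamma_j} dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) + \lambda_2 \frac{\partial G_2}{\partial\gamma_j} dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) = \sum_{i=1}^{q-1} \frac{\partial}{\partial\phi_i}\left[\frac{\partial}{\partial J^j_i}\left(T \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right)\right], \quad 0 \le j \le q \tag{65}
\]

with $\lambda_1 = -(q-1)$ and $\lambda_2 = -2\lambda_1\sqrt{r_{e,1}}$. In what follows $\bar{\gamma}_l$ and $\bar{\phi}_i$ denote the diagonal entries of $\boldsymbol{\Gamma}$ and $\boldsymbol{\Phi}$ in (59) and (60).

• $j = 0$ ($\gamma_0 = R$): we have

\[
\begin{aligned}
\mathrm{LHS} &\overset{(a)}{=} \sum_{1 \le i,l \le q-1} 2(J^l_i)^2 \cdot \frac{\bar{\gamma}_l}{\bar{\phi}_i^2} \cdot \frac{\partial\bar{\gamma}_l}{\partial R} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) + 2\lambda_1 R \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
&\overset{(b)}{=} \left(\sum_{i=1}^{q-1} 2 \cdot \frac{\bar{\gamma}_i}{\bar{\phi}_i^2} \cdot \frac{\bar{\gamma}_i}{R}\right) \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) + 2\lambda_1 R \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
&\overset{(c)}{=} 2R(q-1) \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) + 2\lambda_1 R \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
&\overset{(d)}{=} 0 \\
\mathrm{RHS} &= \sum_{i=1}^{q-1} \frac{\partial}{\partial\phi_i}\left[\frac{2J^0_i}{\bar{\phi}_i^2} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right] \overset{(e)}{=} 0
\end{aligned}
\]

where (a) follows from (59), (b) and (e) follow from $J^l_i = 1$ for $1 \le i = l \le q-1$ and $J^l_i = 0$ otherwise, (c) follows from $\bar{\gamma}_i = R\bar{\phi}_i$ for $1 \le i \le q-1$, and (d) follows from $R = \sqrt{1 - r_{e,1}}$.

• $1 \le j \le q-1$: we have

\[
\begin{aligned}
\mathrm{LHS} &\overset{(a)}{=} \sum_{i=1}^{q-1} \sum_{l=j+1}^{q-1} 2(J^l_i)^2 \cdot \frac{\bar{\gamma}_l}{\bar{\phi}_i^2} \cdot \frac{\partial\bar{\gamma}_l}{\partial\gamma_j} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
&\overset{(b)}{=} \sum_{i=j+1}^{q-1} 2 \cdot \frac{\bar{\gamma}_i}{\bar{\phi}_i^2} \cdot \frac{\partial\bar{\gamma}_i}{\partial\gamma_j} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
&\overset{(c)}{=} \sum_{i=j+1}^{q-1} 2 \cdot \frac{\bar{\gamma}_i^2}{\bar{\phi}_i^2} \cdot \cot\gamma_j \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
&\overset{(d)}{=} 2(q-j-1) R^2 \cot\phi_j \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \\
\mathrm{RHS} &= \sum_{i=1}^{q-1} \frac{\partial}{\partial\phi_i}\left[2J^j_i \cdot \frac{\bar{\gamma}_j^2}{\bar{\phi}_i^2} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right] \\
&\overset{(e)}{=} \frac{\partial}{\partial\phi_j}\left(2 \cdot \frac{\bar{\gamma}_j^2}{\bar{\phi}_j^2} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right) \\
&\overset{(f)}{=} \frac{\partial}{\partial\phi_j}\left(2R^2 \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right) \\
&\overset{(g)}{=} 2R^2 \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) \cdot (q-j-1)\cot\phi_j
\end{aligned}
\]

where (a) follows from (59), (b) and (e) follow from the values of $J^l_i$, (c) follows from $\frac{\partial\bar{\gamma}_i}{\partial\gamma_j} = \bar{\gamma}_i \cot\gamma_j$ for $i \ge j+1$, (d) and (f) follow from $\bar{\gamma}_i = R\bar{\phi}_i$ for $1 \le i \le q-1$, and (g) follows from (63).

• $j = q$: we have

\[
\begin{aligned}
\mathrm{LHS} &\overset{(a)}{=} 2\lambda_1\gamma_q \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) + \lambda_2 \, dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi}) = 0 \\
\mathrm{RHS} &= \sum_{i=1}^{q-1} \frac{\partial}{\partial\phi_i}\left[\frac{2J^q_i}{\bar{\phi}_i^2} \cdot dP^*_{\boldsymbol{\Phi}}(\boldsymbol{\phi})\right] \overset{(b)}{=} 0
\end{aligned}
\]

where (a) follows from (59) and (b) follows from the values of $J^q_i$.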

(ii) Global Constraints: The inequality constraint $G_2 \le 0$ is active for the solution so we need $\lambda_2 \ge 0$, which is true since $\lambda_1 < 0$.

(iii) Boundary Conditions: All boundary conditions are equalities so we do not need to verify any other conditions.

The minimizer $\boldsymbol{\gamma}(\boldsymbol{\phi})$ translates to (24) and the maximin value of the game is $A(\mathcal{P}_{\mathrm{MDM}}) = T|_{\boldsymbol{\gamma}} \equiv (q-1)(1 - r_{e,1})$. Therefore by Theorem 2 we establish Theorem 4.

For the case where $r_u$ is nonzero, unfortunately, a linear function of $\boldsymbol{\phi}$ does not yield a minimizer as in the binary case. We leave this part as future work. Also, similar to the binary case, a general setup where $0 \le r_{e,1} < \cdots < r_{e,q} \le 1$ yields a non-compact collusion class $\mathcal{P}_c$, which prevents us from finding the minimizer using variational calculus.

REFERENCES

[1] Y.-W. Huang and P. Moulin, “Saddle-point solution of the fingerprinting capacity game under the marking assumption,” in Proc. IEEE Intl. Symposium on Information Theory (ISIT 2009), 2009, pp. 2256–2260.

[2] E. Amiri and G. Tardos, “High rate fingerprinting codes and the fingerprinting capacity,” in Proc. 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2009), 2009, pp. 336–345.

[3] P. Moulin. (2011, May) Universal fingerprinting: Capacity and random-coding exponents. [Online]. Available: http://arxiv.org/abs/0801.3837v3

[4] G. Tardos, “Optimal probabilistic fingerprint codes,” in Proc. 35th ACM Symposium on Theory of Computing (STOC 2003), 2003, pp. 116–125.

[5] Y.-W. Huang and P. Moulin, “Capacity-achieving fingerprint decoding,” in Proc. First IEEE Intl. Workshop on Information Forensics and Security (WIFS 2009), 2009, pp. 51–55.

[6] P. Meerwald and T. Furon, “Toward practical joint decoding of binary Tardos fingerprinting codes,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 4, pp. 1168–1180, Aug. 2012.

[7] J.-J. Oosterwijk, B. Skoric, and J. Doumen, “Optimal suspicion functions for Tardos traitor tracing schemes,” in Proc. First ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec 2013), 2013, pp. 19–28.

[8] S. Ibrahimi, B. Skoric, and J.-J. Oosterwijk. (2013, Dec.) Riding the saddle point: asymptotics of the capacity-achieving simple decoder for bias-based traitor tracing. [Online]. Available: http://eprint.iacr.org/2013/809

[9] T. Laarhoven. (2014, Apr.) Capacities and capacity-achieving decoders for various fingerprinting games. [Online]. Available: http://arxiv.org/abs/1401.5688v3

[10] Y.-W. Huang and P. Moulin, “On the saddle-point solution and the large-coalition asymptotics of fingerprinting games,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 1, pp. 160–175, Feb. 2012.

[11] D. Boneh and J. Shaw, “Collusion-secure fingerprinting for digital data,” IEEE Trans. Inf. Theory, vol. 44, no. 5, pp. 1897–1905, Sept. 1998.

[12] B. Skoric, S. Katzenbeisser, H. G. Schaathun, and M. U. Celik, “Tardos fingerprinting codes in the combined digit model,” IEEE Trans. Inf. Forensics Security, vol. 6, no. 3, pp. 906–919, Sept. 2011.

[13] D. Boesten and B. Skoric, “Asymptotic fingerprinting capacity in the combined digit model,” in Information Hiding: 14th Intl. Conf., IH 2012, Berkeley, CA, USA, May 15-18, 2012, Revised Selected Papers, ser. LNCS, vol. 7692. Springer-Verlag, 2013, pp. 255–268.

[14] D. Boesten and B. Skoric, “Asymptotic fingerprinting capacity for non-binary alphabets,” in Information Hiding: 13th Intl. Conf., IH 2011, Prague, Czech Republic, May 18-20, 2011, Revised Selected Papers, ser. LNCS, vol. 6958. Springer-Verlag, 2011, pp. 1–13.

[15] B. Skoric, S. Katzenbeisser, and M. U. Celik, “Symmetric Tardos fingerprinting codes for arbitrary alphabet sizes,” Designs, Codes and Cryptography, vol. 46, no. 2, pp. 137–166, Feb. 2008.

[16] D. Blackwell, L. Breiman, and A. J. Thomasian, “The capacity of a class of channels,” The Annals of Mathematical Statistics, vol. 30, no. 4, pp. 1229–1241, Dec. 1959.

[17] P. Moulin and J. A. O’Sullivan, “Information-theoretic analysis of information hiding,” IEEE Trans. Inf. Theory, vol. 49, no. 3, pp. 563–593, Mar. 2003.

[18] J. Berger, Statistical Decision Theory and Bayesian Analysis, 2nd ed. New York, NY: Springer, 1985.

[19] F. Bavaud, “Information theory, relative entropy and statistics,” in Formal Theories of Information: From Shannon to Semantic Information Theory and General Concepts of Information, ser. LNCS, vol. 5363. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 54–78.

[20] W. Hoeffding, “Probability inequalities for sums of bounded random variables,” J. Amer. Statistical Assoc., vol. 58, no. 301, pp. 13–30, Mar. 1963.

[21] Y.-W. Huang, “Asymptotic analysis for multi-user channels,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, Champaign, IL, USA, 2013.

[22] I. M. Gelfand and S. V. Fomin, Calculus of Variations. Prentice-Hall, 1963, vol. I.

[23] B. Dacorogna, Introduction to the Calculus of Variations. Imperial College Press, 2004.


Yen-Wei Huang received the B.S.E. degree in electrical engineering from National Taiwan University, Taipei, Taiwan, in 2004. He received the M.S. degree in 2010, and the Ph.D. degree in 2013, both in electrical and computer engineering, from the University of Illinois at Urbana-Champaign.

Dr. Huang was a recipient of the E. A. Reid Fellowship Award from the University of Illinois at Urbana-Champaign in 2012. He is currently with Microsoft Corporation, Redmond, WA. His research interests include data hiding, information security, statistical signal processing, and information theory.

Pierre Moulin (S’89–M’90–SM’98–F’03) received his doctoral degree from Washington University, St. Louis in 1990, after which he joined Bell Communications Research in Morristown, NJ, as a Research Scientist. In 1996, he joined the University of Illinois at Urbana-Champaign, where he is currently Professor in the Department of Electrical and Computer Engineering, Research Professor at the Beckman Institute and the Coordinated Science Laboratory, and affiliate professor in the Department of Statistics. His fields of professional interest include image and video processing, compression, statistical signal processing and modeling, media security, decision theory, and information theory.

Dr. Moulin has served on the editorial boards of the IEEE TRANSACTIONS ON INFORMATION THEORY, the IEEE TRANSACTIONS ON IMAGE PROCESSING, and the PROCEEDINGS OF IEEE. He currently serves on the editorial boards of Foundations and Trends in Signal Processing. He was cofounding Editor-in-Chief of the IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY (2005–2008), member of the IEEE Signal Processing Society Board of Governors (2005–2007), and has served IEEE in various other capacities. He received a 1997 Career award from the National Science Foundation and an IEEE Signal Processing Society 1997 Senior Best Paper award. He is also coauthor (with Juan Liu) of a paper that received an IEEE Signal Processing Society 2002 Young Author Best Paper award. In 2003 he became IEEE Fellow and Beckman Associate of UIUC’s Center for Advanced Study. In 2007–2009 he was Sony Faculty Scholar at UIUC. He was plenary speaker for ICASSP 2006, ICIP 2011, and several other conferences. He was Distinguished Lecturer of the IEEE Signal Processing Society for 2012–2013.
