A Hierarchical Game Theoretic Framework for Cognitive Radio Networks

Yong Xiao, Guoan Bi, Senior Member, IEEE, Dusit Niyato, Member, IEEE, and Luiz A. DaSilva, Senior Member, IEEE

Abstract: We consider OFDMA-based cognitive radio (CR) networks where multiple secondary users (SUs) compete for the available sub-bands in the spectrum of multiple primary users (PUs). We focus on maximizing the payoff of both SUs and PUs by jointly optimizing transmit powers of SUs, sub-band allocations of SUs, and the prices charged by PUs. To further improve the performance of SUs, we allow SUs who share the same sub-band to cooperate with each other to send and receive signals. To help us understand the interaction among SUs and PUs, we study the proposed network model from a game theoretic perspective. More specifically, we first formulate a coalition formation game to study the sub-band allocation problem of SUs and then integrate the coalition formation game into a Stackelberg game-based hierarchical framework. We propose a simple distributed algorithm for SUs to search for the optimal sub-bands. We prove that the transmit power and sub-band allocation of SUs and the price charged by PUs are interrelated by the pricing function of PUs. This makes the joint optimization possible. More importantly, we prove that if the pricing coefficients of PUs have a fixed linear relationship, the sub-band allocation of SUs will be stable and the Stackelberg equilibrium of the hierarchical game framework will be unique and optimal. We propose a simple distributed algorithm to achieve the Stackelberg equilibrium of the hierarchical game. Our proposed algorithm does not require SUs to know the interference temperature limit of each PU, and has low communication overheads between SUs and PUs.

Index Terms: Cognitive radio, power control, sub-band allocation, price adjustment, spatial spectrum sharing, game theory, coalitional game, Stackelberg equilibrium.

Manuscript received January 5, 2012; revised May 16, 2012; accepted July 2, 2012.

Y. Xiao and L. A. DaSilva are with CTVR, Trinity College Dublin, Ireland (email: [email protected] and [email protected]). L. A. DaSilva is also with Virginia Tech, VA, USA.

G. Bi is with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore (email: [email protected]).

D. Niyato is with the School of Computer Engineering, Nanyang Technological University, Singapore (email: [email protected]).

This material is based upon works partially supported by the Science Foundation Ireland under Grant No. 10/IN.1/I3007.

I. INTRODUCTION

By allowing the mobile device to flexibly adapt its operation to the surrounding environment, cognitive radio (CR) networks have the potential to solve the spectrum over-subscription/under-utilization problem in wireless networks. In a CR network, secondary users (SUs) can access the licensed spectrum allocated to primary users (PUs) if it is unoccupied or under-utilized. The main challenge for this network is that SUs need to intelligently decide which licensed resources (e.g., time, spectrum, and space), and how much of them, can be further exploited within the tolerable interference level of PUs. For example, in temporal spectrum sharing (TSS) (also called spectrum underlay [1], dynamic/opportunistic spectrum access [2]) based CR networks, SUs are allowed to send signals in the licensed spectrum when the PUs are absent. In this network, each SU needs to continuously sense the availability of PUs in all the available sub-bands. In a TSS-based CR network, SUs are assumed to make a binary decision on the presence of the PUs. Spatial spectrum sharing (SSS) [3]–[6] (or spectrum overlay [1], dynamic spectrum sharing [7]) allows the PUs to tolerate a small increase of interference power caused by SU networks. In this system, the SUs can transmit signals at the same time as the PUs but are required to control their transmit powers to ensure that the resulting interference power at each PU is below an acceptable level [8].

In this paper, we consider SSS-based CR networks. Recent results show that if the SUs can cooperate with each other during spectrum sharing, both the transmission rate and reliability can be greatly improved [5], [9]. Furthermore, wireless technologies, such as LTE+ and IEEE 802.16, increasingly make use of OFDMA to support multiple users. This motivates the study of OFDMA-based CR networks [10], [11].

When studying a CR network, it is important to take into account the interaction between different autonomous decision makers. This motivates the use of game theory to analyze resource management in wireless networks. More specifically, a dynamic non-cooperative game theoretic model has been applied to solve the distributed power control problem for SSS-based CR networks in [6]. In [7], a non-cooperative game approach has been applied to study the spectrum sharing problem for CR networks. Coalitional game-based methods have been used to investigate rate allocation problems for multiple access channels [12] and the stability of user cooperation for wireless networks [13], [14]. The Stackelberg game has been proposed to study the interactions between users that have different levels of control over or information about the spectrum [15]–[22]. For example, in [15], the PU and SUs have been modeled as the leader and the followers, respectively, to study how the PU can use a pricing function to manage the spectrum usage by the SUs. In [20], the interactions between the source and relays in user cooperation networks have been investigated. In [17], the authors study the distributed power allocation problem for a one-PU multiple-SU network. A distributed algorithm has been proposed which exhibits higher power efficiency than the traditional iterative water-filling method. The authors of [16] introduced a heuristic algorithm to decide the transmit power and transmission frequencies of SUs, and used simulation to show the convergence of the algorithm. The pricing policy of the service providers has been studied for a one-PU multiple-SU CR network in [18]. It shows that to maximize the revenue, the service provider should charge higher prices to the users that have better channel conditions and more willingness to pay for the provided service. In [19], the authors study the interactions between the myopic users and foresighted users in an interference channel. They show that the Stackelberg equilibrium (SE) is generally difficult to compute, and hence a Lagrangian dual-based approach has been proposed to iteratively search for the SE.

In this paper, we consider an OFDMA-based CR network model in which the SUs compete with each other, in a distributed manner, for the spectrum owned by multiple PUs. We investigate three types of interactions in this CR network. The first one is the interaction among PUs. By assuming each PU can charge a certain price to the SUs for accessing its spectrum, we study the effects of different pricing competition strategies on the payoff of the PU networks. The second one is the interaction among SUs. We study how the SUs can compete for the limited number of sub-bands owned by the PUs to maximize the SUs' performance. To further improve the efficiency of the SU network, we allow multiple SUs to share the same sub-band and cooperate with one another to send and receive signals. This greatly increases the complexity of the network optimization problem because in this case we need to consider the network externality, i.e., the payoff of each SU will be affected by the actions (the sub-band choice and the transmit power) of other SUs. Previous studies often simplify this by assuming the payoff of each SU is an increasing (positive network externality [23]) or decreasing (negative network externality [24]) function of the number of cooperative members. However, in this paper, we consider a general model and do not make such assumptions. The third one is the interaction between SUs and PUs. In this paper, we assume PUs can adjust their prices to improve their payoffs. This change of prices directly affects the optimization problem of the SU networks. Hence, we also study the effects of the pricing adjustment by PUs on the performance of SU networks. We assume the PUs do not have to know any information about the SU network (e.g., the number of SUs and how the spectrum is divided among them), and cannot send specific information or instructions to any of the SUs.

The main focus of this paper is to study the optimization problem for an OFDMA-based cooperative CR network. More specifically, we study how to improve the performance of both SU and PU networks by jointly optimizing three parameters: the transmit powers of SUs, the sub-band allocations of SUs, and the prices charged by PUs. Simultaneously optimizing these three parameters is in general complex, and hence most previous works in the literature only focus on the optimization of one or two of these parameters. In this paper, we show that it is possible to simultaneously optimize all these parameters. More specifically, we establish a coalition formation game [13] to study the cooperation/competition among SUs in the sub-bands. We show that the grand coalition of our coalition formation game is always unstable under certain conditions, and hence letting the SUs form different coalitions in each sub-band in a distributed manner is a challenging problem. We then adapt the coalition formation game into a Stackelberg game-based hierarchical framework to jointly optimize the transmit powers and sub-bands selected by SUs and the pricing functions of PUs. In our framework, SUs optimize their transmit powers and compete for the sub-bands to maximize their data rates and simultaneously minimize the prices paid to PUs, while each PU adjusts the pricing function to optimize its revenue and simultaneously limit the interference caused by SUs. We derive the Nash equilibrium (NE) for the power control game of SUs and the price adjustment game of PUs. We also present the SE of the hierarchical game. We observe that the SE changes with the sub-band allocation scheme of SUs and hence focus on the sub-band allocation problem for SUs. We propose a simple distributed algorithm for SUs to search for the NE-achieving sub-band allocation scheme. More importantly, we prove that if the pricing coefficients of all PUs satisfy a fixed linear relationship, the optimal sub-band allocation scheme for SUs is unique and stable. Furthermore, a unique SE can always be found for the hierarchical game. We propose a simple distributed algorithm to achieve the SE. Compared with the previously reported results [3], [4], [9], our proposed algorithm does not require SUs to know the interference temperature limit of each PU, and the number of iterations to reach equilibrium is unrelated to the number of SUs.

The rest of this paper is organized as follows. The network model and problem formulation are discussed in Section II. Game theoretic analysis of the proposed model is presented in Section III. The joint optimization algorithm is presented in Section IV. Numerical results are presented in Section V. We summarize the work and offer concluding remarks in Section VI.

II. NETWORK MODEL AND PROBLEM FORMULATION

In this section, we first describe the network model and then formulate the game theoretic model for the OFDMA-based CR network. Finally, we provide formal mathematical descriptions of the three main problems and related games studied in this paper. We will present the solution for each of these problems in the next section.

A. Network Model

Consider a CR network model (see Figure 1) in which there are $J$ PUs, $P_1, P_2, \ldots, P_J$, and $K$ secondary source-to-destination pairs, $S_1$ to $D_1$, $S_2$ to $D_2$, $\ldots$, and $S_K$ to $D_K$. We assume that PUs and SUs do not have any a priori knowledge of each other's spectrum utilization. SUs divide the licensed spectrum into $M$ sub-bands and each SU can only access one sub-band at a time. The sub-band division scheme is decided by the SU network and is unknown to PUs.

Fig. 1. Network model for OFDMA-based CR networks. [Diagram: primary user (PU) networks $P_1, P_2, \ldots, P_J$ and secondary user (SU) networks $S_1, S_2, \ldots, S_K$ over the numbered sub-bands of the spectrum of PUs (FD-SSS).]

Let $l_{S_k} \in \{1, 2, \ldots, M\}$ denote the sub-band allocated to $S_k$. A sub-band allocation scheme for SUs is denoted as $\mathbf{l}_S = [l_{S_1}, l_{S_2}, \ldots, l_{S_K}]$. In our setting, multiple SUs can simultaneously access the same sub-band. Let the set of SUs who share the same sub-band $m$ be $\mathcal{L}_m$, i.e., $\mathcal{L}_m = \{S_k : l_{S_k} = m,\ k = 1, 2, \ldots, K\}$. Therefore, $\mathcal{L}_m = \emptyset$ if no SUs use sub-band $m$, $\mathcal{L}_m = \{S_k\}$ if $S_k$ is the only SU to access sub-band $m$, and $|\mathcal{L}_m| \ge 2$ if two or more SUs share the same sub-band $m$. Let the channel gain from $S_k$ to $P_j$ in sub-band $l$ be $h_{jk}[l]$. If it is clear from the context that a specific sub-band allocation scheme has been fixed, we drop the label $l$ and use the notation $h_{jk}$ to denote the channel gain between $S_k$ and $P_j$.
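The sets $\mathcal{L}_m$ follow mechanically from an allocation vector. The short Python sketch below illustrates this grouping; the function name `coalitions_by_subband` and the example allocation are illustrative assumptions, not part of the original model description.

```python
def coalitions_by_subband(allocation, num_subbands):
    """Group SU indices (1-based) by the sub-band each one selected.

    allocation[k-1] plays the role of l_{S_k}, an integer in 1..num_subbands.
    Returns a dict m -> list of SU indices, i.e. the set L_m for each m.
    """
    groups = {m: [] for m in range(1, num_subbands + 1)}
    for k, m in enumerate(allocation, start=1):
        if not 1 <= m <= num_subbands:
            raise ValueError(f"SU {k} selected invalid sub-band {m}")
        groups[m].append(k)
    return groups


# Hypothetical allocation for K = 4 SUs over M = 3 sub-bands.
l_S = [2, 1, 2, 2]
L = coalitions_by_subband(l_S, num_subbands=3)
```

In this example sub-band 1 has a single occupant, sub-band 2 hosts a coalition of three SUs, and sub-band 3 is empty, which matches the three cases listed above.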

In this paper, we assume that the transmission of each PU spans the entire licensed spectrum and hence each PU is affected by the transmission of all $K$ SUs. This model subsumes some of the previously studied scenarios in the literature. For example, if we let $h_{1k} \neq 0$ and $h_{jk} = 0$ for all $k \in \{1, 2, \ldots, K\}$ and $j \in \{2, 3, \ldots, J\}$, the system model under consideration is equivalent to the case of one PU (i.e., PU $P_1$) with equal utilization efficiency in every sub-band of the licensed spectrum [25]. If we let $J = K$ and $h_{jk} = 0$ for $j \neq k$, the system model under consideration becomes the sub-band competition problem discussed in [26].

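Let the transmit power of $S_k$ be $w_{S_k}$, and the transmit power vector of the $K$ SUs be $\mathbf{w}_S = [w_{S_1}, w_{S_2}, \ldots, w_{S_K}]^T$. The channel gains between SUs and PUs can be expressed in matrix form, i.e.,

$$\mathbf{H} = \begin{bmatrix} |h_{11}[l_{S_1}]|^2 & |h_{12}[l_{S_2}]|^2 & \cdots & |h_{1K}[l_{S_K}]|^2 \\ |h_{21}[l_{S_1}]|^2 & |h_{22}[l_{S_2}]|^2 & \cdots & |h_{2K}[l_{S_K}]|^2 \\ \vdots & \vdots & \ddots & \vdots \\ |h_{J1}[l_{S_1}]|^2 & |h_{J2}[l_{S_2}]|^2 & \cdots & |h_{JK}[l_{S_K}]|^2 \end{bmatrix}. \tag{1}$$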

Let the maximum interference power that can be tolerated by $P_j$ be $q_{P_j}$ for $j \in \{1, 2, \ldots, J\}$. We require that the following power constraint

$$\mathbf{H}\mathbf{w}_S \le \mathbf{q}_P \tag{2}$$

holds for $\mathbf{q}_P = [q_{P_1}, q_{P_2}, \ldots, q_{P_J}]^T$.
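As a small illustration of constraint (2), the sketch below computes the interference vector $\mathbf{H}\mathbf{w}_S$ and compares it entry by entry with $\mathbf{q}_P$. The helper names and all numerical values are hypothetical and chosen only to make the check easy to follow.

```python
def interference_at_pus(H, w):
    """Return the vector H w: total interference power seen by each PU."""
    return [sum(h_jk * w_k for h_jk, w_k in zip(row, w)) for row in H]


def satisfies_constraint(H, w, q):
    """True if every PU's interference is within its limit (H w <= q)."""
    return all(i_j <= q_j for i_j, q_j in zip(interference_at_pus(H, w), q))


# Hypothetical numbers: J = 2 PUs (rows), K = 3 SUs (columns).
H = [[0.5, 0.25, 0.0],
     [0.0, 0.5, 1.0]]      # entries stand for |h_jk[l_Sk]|^2
w_S = [2.0, 4.0, 1.0]      # transmit powers of the SUs
q_P = [2.5, 3.0]           # interference limits of the PUs

interference = interference_at_pus(H, w_S)
feasible = satisfies_constraint(H, w_S, q_P)
```

Here the second PU sits exactly at its limit, so any increase in the power of the second or third SU would violate (2).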

In our model, the secondary sources and destinations sharing the same sub-band are allowed to cooperate with each other to send and receive signals to avoid cross interference. We assume that channel gains among the sources or among the destinations are much higher than those between the sources and destinations and hence the cooperative transmitting and receiving of multiple SUs in one sub-band $m$ can be modeled as a virtual $|\mathcal{L}_m|$-input $|\mathcal{L}_m|$-output MIMO channel [13]. In addition, the SUs can also use optimal power control methods to maximize their benefits and simultaneously minimize the total price paid to PUs. Note that our model can be directly applied to other cooperative modes, e.g., transmitter or receiver cooperation. Hence, for completeness, we present a brief discussion on how to make these extensions in Appendix A.

B. Hierarchical Game Setup

Let us consider a hierarchical game in which the players of the game are the PUs (leaders), who have priority in using the spectrum, and the SUs (followers), who can access the licensed spectrum by paying a certain “price”. Price is introduced and used by the PUs to regulate, in a distributed and indirect manner, the transmit power and the sub-band allocation of the SUs, so as to achieve an optimal trade-off between spectrum utilization and the interference at the PUs.

We follow the standard setup [6], [11]–[15] to define the benefit of each SU $S_k$ to be equal to its data rate. In particular, for a given sub-band allocation scheme $\mathbf{l}_S$, the benefit to the SU $S_k$, $k = 1, \ldots, K$, is given by $r_{S_k}[l_{S_k}] = R_{S_k}(\mathbf{l}_S, \mathcal{F}_{S_k})$, where $R_{S_k}$ is the data rate of $S_k$, and $\mathcal{F}_{S_k}$ is the fairness criterion agreed to by $S_k$ when accessing sub-band $l_{S_k}$. If $S_k$ is the only SU to use the sub-band $l_{S_k}$, we let $\mathcal{F}_{S_k} = \emptyset$, and $R_{S_k}(\mathbf{l}_S, \mathcal{F}_{S_k}) = R_{S_k}(l_{S_k}) = \log(1 + |g_{kk}[l_{S_k}]|^2 w_{S_k})$ is the channel capacity between the $k$th source-to-destination pair, where $|g_{kk}|^2 = \frac{|g_{kk}[l_{S_k}]|^2}{\sigma_{S_k}^2 + \sigma_{S_k}'^2}$, $g_{kk}[l_{S_k}]$ is the channel gain between the $k$th secondary source-to-destination pair, and $\sigma_{S_k}$ and $\sigma_{S_k}'$ are the additive interference (treated as noise) to $S_k$ caused by the transmissions of the PUs and of the other SUs, respectively. If two or more SUs use the sub-band $m$, $R_{S_k}(\mathbf{l}_S, \mathcal{F}_m)$ is the transmission rate allocated to the $k$th source-to-destination pair for $S_k \in \mathcal{L}_m$. To capture the difference of the contributions to the payoff sum brought by different SUs in a coalition, in this paper, we consider the proportional fairness (PF) criterion [27], i.e., $\mathcal{F}_m = \text{PF}$. Another reason for considering this fairness criterion is that, as observed in [11], [27], [28], PF corresponds to the Nash bargaining solution [29, Chapter 8] if the minimum payoff required by each SU equals zero.

Since the main effect of SUs on the PU network is interference, let us define the price charged by each PU $P_j$ to the SUs to be proportional to the resulting interference observed by $P_j$, i.e., the cost to SU $S_k$ is given by $c_{S_k} = \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k} w_{S_k}$, where $\boldsymbol{\beta}_P = [\beta_{P_1}, \beta_{P_2}, \ldots, \beta_{P_J}]^T$, $\beta_{P_j}$ is the pricing coefficient of $P_j$, and $\mathbf{h}_{\bullet k} = [|h_{1k}|^2, |h_{2k}|^2, \ldots, |h_{Jk}|^2]^T$ is the $k$th column of $\mathbf{H}$. We define the payoff of $S_k$ to be the difference between its benefit and cost, which is given by

$$\pi_{S_k}(w_{S_k}, \boldsymbol{\beta}_P, \mathbf{l}_S, \mathcal{F}_{S_k}) = r_{S_k} - c_{S_k}. \tag{3}$$
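For an SU that occupies its sub-band alone, payoff (3) reduces to a rate term minus a linear price term. The following sketch evaluates it under two assumptions that are not stated in the text: the logarithm is natural, and the function name `su_payoff_single` and the sample numbers are purely illustrative.

```python
import math


def su_payoff_single(g_sq, w, beta, h_col):
    """Payoff (3) of an SU that is the only occupant of its sub-band.

    g_sq  : normalized gain |g_kk|^2 of the SU's own link
    w     : transmit power w_{S_k}
    beta  : pricing coefficients [beta_{P_1}, ..., beta_{P_J}]
    h_col : k-th column of H, [|h_1k|^2, ..., |h_Jk|^2]
    """
    benefit = math.log(1.0 + g_sq * w)                   # r_{S_k}; natural log assumed
    cost = sum(b * h for b, h in zip(beta, h_col)) * w   # c_{S_k} = beta^T h_{.k} w
    return benefit - cost


# Hypothetical values for one SU and J = 2 PUs.
payoff = su_payoff_single(g_sq=4.0, w=0.5, beta=[0.5, 1.0], h_col=[0.2, 0.1])
```

With these numbers the per-unit-power price is $0.5 \cdot 0.2 + 1.0 \cdot 0.1 = 0.2$, so the cost is $0.1$ and the payoff is $\log 3 - 0.1$.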

Before the transmission of each SU, each PU $P_j$ needs to define its pricing coefficient $\beta_{P_j}$ and broadcast $\beta_{P_j}$ to all the SUs. Then, each SU $S_k$ needs to decode $\mathbf{h}_{\bullet k}[l_{S_k}]$ and compare the prices of different PUs to decide in which sub-band to transmit its signals. Each SU $S_k$ can use the decoded $\beta_{P_j}$ to estimate the channel gain from $P_j$ to $S_k$ and estimate $h_{jk}[l_{S_k}]$ accordingly$^1$. To simplify our description, we assume each SU $S_k$ can always reliably obtain the price $c_{S_k}$ charged by all PUs. Note that SUs do not need to communicate with each other or know the interference temperature limit $q_{P_j}$ of each $P_j$, $j = 1, \ldots, J$.

Let us define the payoff of each PU as the sum of all prices paid by the SUs and hence we can express the payoff of $P_j$, $j = 1, \ldots, J$, as [17], [18]

$$\tilde{\pi}_{P_j}(\mathbf{w}_S, \beta_{P_j}, \mathbf{l}_S) = \beta_{P_j} \mathbf{h}_{j\bullet} \mathbf{w}_S - \phi_{P_j}, \tag{4}$$

where $\phi_{P_j}$ is the received interference caused by PU networks, and $\mathbf{h}_{j\bullet} = [|h_{j1}|^2, |h_{j2}|^2, \ldots, |h_{jK}|^2]$ is the $j$th row of the channel gain matrix $\mathbf{H}$ in (1). In this paper, we assume $\phi_{P_j}$ can be regarded as a constant and hence, to simplify our notation, can be moved to the left hand side of the above equation, i.e., we denote the payoff of $P_j$ as

$$\pi_{P_j}(\mathbf{w}_S, \beta_{P_j}, \mathbf{l}_S) = \tilde{\pi}_{P_j}(\mathbf{w}_S, \beta_{P_j}, \mathbf{l}_S) + \phi_{P_j} = \beta_{P_j} \mathbf{h}_{j\bullet} \mathbf{w}_S. \tag{5}$$

Note that the pricing coefficient $\beta_{P_j}$ of $P_j$ quantifies the willingness of $P_j$ to sell its licensed spectrum. For example, if the channel gain between SUs and PUs is small or SUs are far from $P_j$, $\beta_{P_j}$ can be set to a low value to make the licensed spectrum affordable to more SUs. However, if $P_j$ is more sensitive to interference, or the channel gain between SUs and PUs is large, or SUs are close to $P_j$, $P_j$ can set a relatively high $\beta_{P_j}$. To calculate its payoff function, each PU only needs to measure its overall received interference power. Note that $P_j$ does not need to know whether the SU is in a coalition or is the only occupant of a given sub-band, because $P_j$ applies the same pricing coefficient to all SUs.

$^1$ It has been verified in [30], [31] that the channel gain between the PUs and SUs is the same in both forward and backward directions. Hence, it is reasonable to assume each SU can estimate $h_{jk}[l_{S_k}]$ from the decoded $\beta_{P_j}$ sent by $P_j$ for $j = 1, 2, \ldots, J$.

To simplify the notation, in the rest of this paper, we will drop the term $\mathbf{l}_S$ in the payoff functions, and write $\pi_{S_k}(w_{S_k}, \boldsymbol{\beta}_P, \mathcal{F}_{S_k})$ and $\pi_{P_j}(\mathbf{w}_S, \beta_{P_j})$ instead when it is clear from the context that the sub-band allocation is predefined.

In this paper, we seek a balanced operating point for the SU networks, known as the Nash equilibrium (NE) [32, Definition 23.1]. In Section III, we will show that the optimization problems for both PU and SU networks are linked to one another via the pricing function of PUs. Each PU selfishly attempts to maximize its own payoff, resulting in competition among the PUs. Also, SUs try to optimize their transmission parameters for the given prices of PUs. In this paper, we seek an equilibrium point of this hierarchical game, from which both SUs and PUs have no incentive to deviate. This equilibrium point is known as the Stackelberg equilibrium (SE) [33, Definition 3.27], whose formal definition is given as follows.

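Definition 1. [33, Definition 3.27] For a fixed $\mathbf{l}_S$, the pricing coefficients $\boldsymbol{\beta}_P^* = [\beta_{P_1}^*, \beta_{P_2}^*, \ldots, \beta_{P_J}^*]^T$, and the transmit powers $\mathbf{w}_S^* = [w_{S_1}^*(\boldsymbol{\beta}_P^*), w_{S_2}^*(\boldsymbol{\beta}_P^*), \ldots, w_{S_K}^*(\boldsymbol{\beta}_P^*)]$, form an SE if the interference power limit (2) is satisfied, and for each $j \in \{1, \ldots, J\}$, we have

$$\beta_{P_j}^* = \arg\max_{\beta_{P_j} \ge 0} \pi_{P_j}\big(\mathbf{w}_S^*(\beta_{P_j}, \boldsymbol{\beta}_{-P_j}^*), \beta_{P_j}, \boldsymbol{\beta}_{-P_j}^* \mid \mathbf{l}_S\big), \tag{6}$$

where for any given $\boldsymbol{\beta}_P$,

$$w_{S_k}^*(\boldsymbol{\beta}_P) = \arg\max_{w_{S_k} \ge 0} \pi_{S_k}\big(w_{S_k}, \mathbf{w}_{-S_k}^*, \boldsymbol{\beta}_P, \mathcal{F}_{S_k} \mid \mathbf{l}_S\big). \tag{7}$$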
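To make the leader-follower structure of Definition 1 concrete, the following toy sketch considers a deliberately simplified instance: one PU, one SU alone in its sub-band, a natural-log rate, and fixed noise. These simplifications and all function names are assumptions for illustration, not the general setting analyzed in the paper. Under them, setting the derivative of $\log(1 + g w) - \beta h w$ to zero gives the follower's best response $w^*(\beta) = \max(0, 1/(\beta h) - 1/g)$, and the leader's revenue $\beta h w^*(\beta) = 1 - \beta h / g$ decreases in $\beta$ on the range where $w^* > 0$, so the leader would pick the lowest price that keeps the interference $h w^*$ within $q$.

```python
def su_best_response(beta, g_sq, h_sq):
    """Power maximizing log(1 + g_sq*w) - beta*h_sq*w over w >= 0 (follower)."""
    if beta <= 0:
        raise ValueError("price must be positive in this sketch")
    return max(0.0, 1.0 / (beta * h_sq) - 1.0 / g_sq)


def pu_payoff(beta, g_sq, h_sq):
    """Revenue beta * h_sq * w*(beta) collected by the PU (leader)."""
    return beta * h_sq * su_best_response(beta, g_sq, h_sq)


def toy_stackelberg_price(g_sq, h_sq, q):
    """Lowest price whose induced interference h_sq * w*(beta) equals the limit q."""
    return 1.0 / (q + h_sq / g_sq)


# Hypothetical values: own-link gain, SU-to-PU gain, interference limit.
g_sq, h_sq, q = 4.0, 0.5, 1.0
beta_star = toy_stackelberg_price(g_sq, h_sq, q)
w_star = su_best_response(beta_star, g_sq, h_sq)
```

In this toy case the interference constraint is active at the equilibrium price, and a coarse grid search over feasible prices does not find a higher PU revenue.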

C. Problem Formulation

This paper considers the following problems.

1) Sub-band Allocation/Coalition Formation Problem for SUs: We assume that each SU can only choose one sub-band. Hence, each SU can either exclusively occupy a sub-band or share the same sub-band with other SUs. The main objective of each SU is to maximize its payoff by solving the following optimization problem,

$$\max_{l_{S_k} \in \{1, 2, \ldots, M\}} \pi_{S_k}\big(\mathbf{w}, \boldsymbol{\beta}_P, l_{S_k}, \mathbf{l}_{-S_k}^*, \mathcal{F}_{S_k}\big). \tag{8}$$

In this case, the sub-band allocation problem can be modeled as a coalition formation game in which two or more SUs form a coalition to share a sub-band if this could maximize their payoff. Let us give a formal definition as follows.

Definition 2. [29, Chapter 9] A coalition $\mathcal{C}$ is a non-empty subset of the set of all players $\mathcal{K}$, i.e., $\mathcal{C} \subseteq \mathcal{K}$. We refer to the coalition of all the players as the grand coalition $\mathcal{K}$. A coalitional game is defined by the pair $(\mathcal{K}, \nu)$ where $\nu$ is called the characteristic function, which assigns a number $\nu(\mathcal{C})$ to every coalition $\mathcal{C}$ and $\nu(\emptyset) = 0$. Here $\nu(\mathcal{C})$ quantifies the worth of a coalition $\mathcal{C}$. A coalitional game is said to be super-additive if for any two disjoint coalitions $\mathcal{C}_1$ and $\mathcal{C}_2$, $\mathcal{C}_1 \cap \mathcal{C}_2 = \emptyset$ and $\mathcal{C}_1, \mathcal{C}_2 \subset \mathcal{K}$, we have

$$\nu(\mathcal{C}_1 \cup \mathcal{C}_2) \ge \nu(\mathcal{C}_1) + \nu(\mathcal{C}_2). \tag{9}$$
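Condition (9) can be checked by brute force on small games. The sketch below does so for a characteristic function stored as a dictionary keyed by `frozenset`; the three-player game and its values are hypothetical and serve only to exercise the check.

```python
from itertools import combinations


def is_superadditive(players, value):
    """Check nu(C1 | C2) >= nu(C1) + nu(C2) for all disjoint non-empty C1, C2.

    value maps every non-empty coalition (as a frozenset) to its worth nu(C).
    """
    subsets = [frozenset(c)
               for r in range(1, len(players) + 1)
               for c in combinations(players, r)]
    for c1, c2 in combinations(subsets, 2):
        if c1 & c2:
            continue  # condition (9) only concerns disjoint coalitions
        if value[c1 | c2] < value[c1] + value[c2]:
            return False
    return True


# Hypothetical three-SU game.
players = ["S1", "S2", "S3"]
nu = {
    frozenset({"S1"}): 1.0,
    frozenset({"S2"}): 1.0,
    frozenset({"S3"}): 1.0,
    frozenset({"S1", "S2"}): 3.0,
    frozenset({"S1", "S3"}): 2.0,
    frozenset({"S2", "S3"}): 2.0,
    frozenset({"S1", "S2", "S3"}): 4.0,
}
superadditive = is_superadditive(players, nu)
```

The enumeration grows exponentially with the number of players, so this style of check is only practical for small examples.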

In our setting, $S_k$ chooses not to enter into any coalition if it can maximize its payoff by exclusively occupying a sub-band $m$. However, if an SU $S_k$ can obtain the highest payoff by sharing a sub-band with other SUs according to a proportionally fair allocation of data rates, the SU $S_k$ will negotiate with other SUs in the sub-band $l_{S_k}$ to obtain its payoff. We will provide more detailed discussion on the coalition formation process and the fairness criterion in the next section.

We have the following definitions about the stability of a coalition.

Definition 3. Let us define a payoff vector for SUs to be any vector $\boldsymbol{\pi} = (\pi_{S_k})_{S_k \in \mathcal{K}}$ in $\mathbb{R}^K$ to divide the value $\nu(\mathcal{K})$. $\boldsymbol{\pi}$ is said to be group rational or efficient if $\sum_{k=1}^{K} \pi_{S_k} = \nu(\mathcal{K})$ and is said to be individually rational if $\pi_{S_k} \ge \nu(\{S_k\})$, $\forall S_k \in \mathcal{K}$. An imputation is a payoff vector satisfying both group and individual rationality.

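Definition 4. An imputation $\boldsymbol{\pi}$ is said to be unstable through a coalition $\mathcal{C}$ if $\nu(\mathcal{C}) > \sum_{S_k \in \mathcal{C}} \pi_{S_k}$. The core of $\nu(\mathcal{K})$ is defined as the set of stable imputations. That is, $\boldsymbol{\pi}$ is in the core if and only if

$$\sum_{S_k \in \mathcal{K}} \pi_{S_k} = \nu(\mathcal{K}) \quad \text{and} \quad \sum_{S_k \in \mathcal{C}} \pi_{S_k} \ge \nu(\mathcal{C}), \ \forall\, \mathcal{C} \subseteq \mathcal{K}. \tag{10}$$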
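Condition (10) translates directly into a membership test for the core. The sketch below is one possible brute-force check; the function name `in_core`, the tolerance, and the example game are assumptions made for illustration.

```python
from itertools import combinations


def in_core(players, value, payoff, tol=1e-9):
    """True if payoff is group rational and no coalition C has nu(C) > its total payoff."""
    if abs(sum(payoff[p] for p in players) - value[frozenset(players)]) > tol:
        return False  # payoffs do not sum to nu(K): not group rational
    for r in range(1, len(players) + 1):
        for c in combinations(players, r):
            if value[frozenset(c)] > sum(payoff[p] for p in c) + tol:
                return False  # coalition c makes this payoff vector unstable
    return True


# Hypothetical three-SU game and two candidate payoff vectors.
players = ["S1", "S2", "S3"]
nu = {
    frozenset({"S1"}): 1.0,
    frozenset({"S2"}): 1.0,
    frozenset({"S3"}): 1.0,
    frozenset({"S1", "S2"}): 3.0,
    frozenset({"S1", "S3"}): 2.0,
    frozenset({"S2", "S3"}): 2.0,
    frozenset({"S1", "S2", "S3"}): 4.0,
}
stable = in_core(players, nu, {"S1": 1.5, "S2": 1.5, "S3": 1.0})
blocked = in_core(players, nu, {"S1": 1.0, "S2": 1.0, "S3": 2.0})
```

The second vector is an imputation, yet it is unstable through the coalition of S1 and S2, whose worth of 3 exceeds their combined payoff of 2.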

2) Power Control Problem for SUs: In our hierarchical game framework, the PUs have priority in using the spectrum and set the price, and the SUs optimize their transmit powers based on the price set by the PUs. For a given price and fixed sub-band allocation scheme, let us model the power control game for SUs as follows. For a given pricing coefficient $\boldsymbol{\beta}_P$, each SU searches for an NE-achieving transmit power for which this SU cannot further improve its payoff by choosing a different transmit power, given the transmit power of other SUs, i.e., $S_k$ solves the following optimization problem

$$\max_{w_{S_k} \ge 0} \pi_{S_k}\big(w_{S_k}, \mathbf{w}_{-S_k}^*, \boldsymbol{\beta}_P, \mathcal{F}_{S_k} \mid \mathbf{l}_S\big). \tag{11}$$
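As a sanity check on (11), consider the simplest case of an SU that is alone in its sub-band. The short derivation below is only a sketch under assumptions not stated in the text: the logarithm is natural, the interference terms inside $|g_{kk}|^2$ are held fixed, and the per-unit-power price $c_k = \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}$ is strictly positive.

```latex
% Singleton SU: payoff (3) as a function of its own power only
\pi_{S_k}(w_{S_k}) = \log\!\left(1 + |g_{kk}|^2 w_{S_k}\right) - c_k\, w_{S_k},
\qquad c_k = \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k} > 0 .

% First- and second-order conditions
\frac{d\pi_{S_k}}{dw_{S_k}} = \frac{|g_{kk}|^2}{1 + |g_{kk}|^2 w_{S_k}} - c_k ,
\qquad
\frac{d^2\pi_{S_k}}{dw_{S_k}^2} = -\frac{|g_{kk}|^4}{\left(1 + |g_{kk}|^2 w_{S_k}\right)^2} < 0 .

% The payoff is strictly concave, so the maximizer over w >= 0 is
w_{S_k}^{*} = \max\!\left(0,\ \frac{1}{c_k} - \frac{1}{|g_{kk}|^2}\right).
```

The resulting expression has the familiar water-filling shape: a higher aggregate price $c_k$ lowers the transmit power, and the SU stays silent once $c_k \ge |g_{kk}|^2$.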

Note that the optimal transmit power of each SU is closely dependent on the sub-band allocation scheme. We will provide a more detailed discussion in the next section.

3) Price Adjustment Problem for PUs: We model the price adjustment problem as a game among PUs. The NE in this game can be obtained by allowing each PU $P_j$ to solve the following optimization problem given the price charged by other PUs,

$$\max_{\beta_{P_j} \ge 0} \pi_{P_j}\big(\mathbf{w}_S, \beta_{P_j}, \boldsymbol{\beta}_{-P_j}^* \mid \mathbf{l}_S\big), \quad \text{s.t. } \mathbf{H}\mathbf{w}_S \le \mathbf{q}_P. \tag{12}$$

Note that the payoff of each PU depends on the transmit power and the sub-band allocation of SUs and the pricing coefficient of PUs. To study the joint optimization of the transmit power of SUs and the pricing function of other PUs, we formulate a Stackelberg game in which SUs are regarded as followers who maximize their payoffs by choosing their optimal transmission parameters, and PUs are leaders who can adjust the price of the licensed spectrum to maximize their payoffs. To study the sub-band allocation scheme, we formulate a coalition formation game for SUs as discussed before. Our main objective is to find distributed methods to simultaneously approach the NEs of the games in problems 1) - 3) and achieve the SE of the hierarchical framework. We illustrate the relationship of these different games in Figure 2.

Fig. 2. The relationship of different games in the hierarchical game theoretic framework. [Diagram blocks: Stackelberg Game; Price Adjustment Game for PUs; Coalition Formation Game; Sub-band Allocation Game for SUs; Power Control Game for SUs. Labels: pricing coefficient; interference; Proposition 4; Proposition 5.]

III. GAME THEORETIC ANALYSIS

In this section, we derive the solutions of the problems in Section II-C. We will show that these problems can be connected by the pricing function of PUs. We present the main result in Theorem 1 at the end of this section. Theorem 1 states that if the pricing coefficients of PUs satisfy a fixed linear relationship, the sub-band allocation of SUs is stable and unique and the SE is also unique and optimal.

A. Distributed Sub-band Allocation Solution for SUs

In this section, we consider the sub-band allocation problem for SUs. Since we assume that the SUs in sub-band $m$ can be regarded as an $|L_m|$-input $|L_m|$-output MIMO channel, following the same line as [34] and using $X$ and $Y$ to denote the input and output of the channels, respectively, we can write the capacity sum of all SUs in sub-band $m$ as follows:

$$\sum_{S_k \in L_m} R_{S_k} = I\left(\{X_{S_k}\}_{S_k \in L_m}; \{Y_{D_k}\}_{D_k \in L_m}\right) = \sum_{S_k \in L_m} \sum_{D_l \in L_m} I\left(X_{S_k}; Y_{D_l}\right) = \sum_{S_k \in L_m} \log\left(1 + \lambda_{S_k}[m]\, w_{S_k}\right), \tag{13}$$

where $I(\cdot,\cdot)$ is the mutual information, $\lambda_{S_k}[m]$ is the $k$th nonzero eigenvalue of the matrix $G_{S_k \in L_m} G^T_{S_k \in L_m}$, and $G_{S_k \in L_m}$ is a sub-matrix of the channel gain matrix $G$ of the SU network, defined as

$$G = \begin{bmatrix} g_{11}[l_{S_1}] & g_{12}[l_{S_2}] & \cdots & g_{1K}[l_{S_K}] \\ g_{21}[l_{S_1}] & g_{22}[l_{S_2}] & \cdots & g_{2K}[l_{S_K}] \\ \vdots & \vdots & \ddots & \vdots \\ g_{K1}[l_{S_1}] & g_{K2}[l_{S_2}] & \cdots & g_{KK}[l_{S_K}] \end{bmatrix}. \tag{14}$$


Here, $g_{jk}[l_{S_k}]$ is the ratio of the channel gain between source $S_j$ and destination $D_k$ to the received interference power at $D_k$ in sub-band $l_{S_k}$. In this paper, we assume the PUs cannot know the coalitions formed by the SUs but can only charge each SU based on its resulting interference. The payoff allocated to each member equals the allocated benefit minus the price paid to the PUs. We assume each member SU in a coalition can use an optimal power control method to maximize its allocated payoff. For example, if we consider the benefit allocated to the member SU $S_k$ in a coalition $L_m$ to be equal to the rate contributed to the overall capacity sum, i.e.,

$$R_{S_k} = \log\left(1 + \lambda_{S_k}[m]\, w_{S_k}\right), \tag{15}$$

the optimal power of $S_k$ is given by $w^*_{S_k} = \left(\frac{1}{\boldsymbol{\beta}_P^T \boldsymbol{h}_{\bullet k}} - \frac{1}{\lambda_{S_k}[m]}\right)^+$. We will discuss the optimal power control method of SUs in Section III-B. Note that in $w^*_{S_k}$, the term $\boldsymbol{\beta}_P^T \boldsymbol{h}_{\bullet k}$ is defined by the price charged by PUs and the channel gains between SUs and PUs, both of which are outside the control of SUs. Therefore, the SUs in a coalition can only divide the payoff sum by adjusting the value of $\lambda_{S_k}[m]$. In other words, the payoff division among SUs in the coalition $L_m$ is equivalent to deciding the optimal value of $\lambda_{S_k}$ for each of the cooperative SUs. Specifically, finding the optimal $w_{S_k}$ to maximize the payoff sum is equivalent to deciding the optimal $\lambda_{S_k}[m]$ to maximize $\prod_{S_k \in L_m} \pi_{S_k}[m]$ for all $S_k \in L_m$. We then have the following result.

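Proposition 1. For a given coalition $L_m$, the following payoff division scheme satisfies the proportional fairness criterion. The payoff of each member SU $S_k$ in the coalition $L_m$ is given by

$$\pi_{S_k}[m] = r_{S_k}[m] - c_{S_k}[m] = \log\left(1 + \lambda_{S_k}[m]\, w_{S_k}\right) - \boldsymbol{\beta}_P^T \boldsymbol{h}_{\bullet k}\, w_{S_k}, \quad \forall S_k \in L_m. \tag{16}$$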

Proof: See Appendix B.

Note that, in a virtual MIMO system, the mutual information among the sources or among the destinations is always much higher than that between the sources and destinations, i.e., $I(X_{S_j}; Y_{S_k}) \ge \min\{I(X_{S_j}, X_{S_k}; Y_{S_k}),\, I(X_{S_j}, X_{S_k}; Y_{S_j})\}$ and $I(X_{D_j}; Y_{D_k}) \ge \min\{I(X_{S_j}; Y_{S_j}, Y_{S_k}),\, I(X_{S_k}; Y_{S_k}, Y_{S_j})\}$ $\forall S_j, S_k \in L_m$. In other words, there is a high probability that two cooperative sources $S_j$ and $S_k$ are located relatively close to each other compared to their distances to all the destinations. More specifically, if the channel gains between these two sources and all the destinations are strongly correlated to each other, the eigenvalue corresponding to the linearly dependent entries in $GG^T$ will approach zero. This will cause the low-payoff SUs to leave the coalition and search for other sub-bands. In addition, as observed in [14], in a user cooperation system, relaying among SUs may induce a cooperation cost which will further discourage the SUs from joining a coalition. In other words, the core of the sub-band allocation game is always empty if two sources (or destinations) are close to each other or the channels connecting at least two sources or destinations have similar channel gains.

Fig. 3. Timing structure of the Algorithm. [Timeline along the time axis: Sensing, Negotiating, Data Transmission.]
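The effect of correlated channel gains on the eigenvalues of $GG^T$ can be illustrated numerically. The following is a minimal sketch for a two-SU coalition; the function name and the gain values are hypothetical and chosen only for illustration, and the closed-form eigenvalue formula applies to the 2x2 case only.

```python
import math

def gram_eigenvalues_2x2(G):
    """Eigenvalues of G @ G^T for a real 2x2 matrix G, largest first."""
    (a, b), (c, d) = G
    # Entries of the symmetric matrix M = G G^T.
    m11 = a * a + b * b
    m12 = a * c + b * d
    m22 = c * c + d * d
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    # Closed-form roots of the characteristic polynomial of a 2x2 matrix.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace + disc) / 2.0, (trace - disc) / 2.0

# Hypothetical gains: rows index sources, columns index destinations.
well_separated = [[1.0, 0.1], [0.2, 0.9]]      # distinct gains to each destination
nearly_colocated = [[1.0, 0.5], [0.98, 0.51]]  # two sources with almost identical gains

lam_sep = gram_eigenvalues_2x2(well_separated)
lam_close = gram_eigenvalues_2x2(nearly_colocated)
print(lam_sep, lam_close)
```

In this sketch the smaller eigenvalue of the nearly co-located pair is close to zero, so the SU assigned that eigenvalue would obtain almost no rate from the coalition.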

In the following, we propose a simple sub-band coalition formation algorithm. Before we present the detailed description, let us briefly discuss the timing structure of the spectrum sensing process of our algorithm. We assume that the entire transmission process of the SUs is divided into three periods: sensing, negotiating, and data transmission. In the sensing period, each SU determines its payoff in all the sub-bands as if it were the only SU to use these sub-bands. If more than one SU chooses the same sub-band to maximize its payoff, they will negotiate the payoff division during the negotiating period. The timing structure is presented in Figure 3. Note that the length of the negotiating period depends on the number of SUs in the coalition. Assuming $\boldsymbol{\beta}_P$ is fixed, the detailed algorithm is presented as follows.

Algorithm 1: Sub-band Coalition Formation Algorithm

1) Sensing:
a) The SUs, after receiving the prices from PUs, sequentially send a short training message to estimate their payoff in all the sub-bands, i.e., $S_k$ knows $\pi_{S_k}(l_{S_k} = 1 \mid F_1 = \emptyset)$, $\pi_{S_k}(l_{S_k} = 2 \mid F_2 = \emptyset)$, $\ldots$, $\pi_{S_k}(l_{S_k} = M \mid F_M = \emptyset)$, where $\pi_{S_k}(l_{S_k} = m \mid F_m = \emptyset)$ is the payoff of $S_k$ when sub-band $m$ is exclusively used by $S_k$.
b) Each $S_k$ broadcasts the sub-band $l^*_{S_k}$ that would maximize its payoff, i.e.,

$$l^*_{S_k} = \arg\max_{l \in \{1,\ldots,M\}} \pi_{S_k}\left(l_{S_k} = l \mid F_{l_{S_k}} = \emptyset\right). \tag{17}$$

Let $R^* = \{l^*_{S_k} : S_k \in \{1, 2, \ldots, K\}\}$.

2) Negotiating:
a) All the active SUs need to negotiate with each other on each of the sub-bands in $R^*$ to obtain the possible payoff division schemes $\pi_{S_k}(l_{S_k} = l, l_{-S_k} \mid F_{l_{S_k}} = \emptyset)$, $\forall l \in R^*$. After the negotiation process, each SU chooses the sub-band in $R^*$ which provides the highest payoff division $\pi_{S_k}(l^*_{S_k}, l^*_{-S_k} \mid F_{l_{S_k}})$.
b) $S_k$ compares $\pi_{S_k}(l^*_{S_k}, l^*_{-S_k} \mid F_{l_{S_k}})$ with its payoffs in the sub-bands outside of $R^*$. If

$$\pi_{S_k}\left(l^*_{S_k}, l^*_{-S_k} \mid F_{l_{S_k}}\right) < \max_{l_{S_k} \in \{1,\ldots,M\} \setminus R^*} \pi_{S_k}\left(l_{S_k} \mid F_{l_{S_k}} = \emptyset\right), \tag{18}$$

$S_k$ updates its optimal sub-band as the sub-band $l^*_{S_k}$ such that

$$l^*_{S_k} = \arg\max_{l_{S_k} \in \{1,\ldots,M\} \setminus R^*} \pi_{S_k}\left(l_{S_k} \mid F_{l_{S_k}} = \emptyset\right). \tag{19}$$

If $l^*_{S_k}$ is shared by other SUs, set $R^* = R^* \cup \{l^*_{S_k}\}$ and go to Step 2-a). Step 2) is repeated until no SUs want to change their sub-bands.
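The sensing period of Algorithm 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the solo payoff is taken to be the single-user form of (16) evaluated at the water-filling power of Section III-B, sub-bands are indexed from 0, and all function names and numerical values are hypothetical. The negotiating period (Step 2) is not implemented; the sketch only identifies which sub-bands would trigger it.

```python
import math

def solo_payoff(u, v):
    """Payoff of an SU alone in a sub-band: log(1 + v*w) - u*w at w = (1/u - 1/v)^+."""
    w = max(1.0 / u - 1.0 / v, 0.0)
    return math.log(1.0 + v * w) - u * w

def sensing_step(u, v):
    """u[k][l]: price per unit power of SU k in sub-band l; v[k][l]: its direct gain there.
    Returns each SU's preferred sub-band (Step 1-b) and the set R* of preferred sub-bands."""
    best = []
    for u_k, v_k in zip(u, v):
        payoffs = [solo_payoff(ul, vl) for ul, vl in zip(u_k, v_k)]
        best.append(max(range(len(payoffs)), key=payoffs.__getitem__))
    return best, set(best)

def contested(best):
    """Sub-bands preferred by more than one SU; these would enter the negotiating period."""
    counts = {}
    for band in best:
        counts[band] = counts.get(band, 0) + 1
    return {band for band, n in counts.items() if n > 1}

# Hypothetical example: 3 SUs, 2 sub-bands.
u = [[0.5, 1.0], [0.4, 0.3], [0.2, 2.0]]
v = [[2.0, 2.0], [1.0, 3.0], [4.0, 1.0]]
best, r_star = sensing_step(u, v)
print(best, r_star, contested(best))
```

In this example the first and third SUs both prefer sub-band 0, so only that sub-band would require negotiation.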

Let us prove in the following proposition that no SUs or partitions of SUs have an incentive to deviate from the current partition when Algorithm 1 stops, and hence the partition achieved by Algorithm 1 is stable.

Proposition 2. For the given pricing function $\boldsymbol{\beta}_P$ and fairness criterion, the resulting partition of Algorithm 1 is stable.

Proof: See Appendix C.

B. Distributed Optimal Power Control Solutions for SUs

Consider the power control problem defined in (11). For now, let us assume that the sub-band allocation scheme $l_S$ has been fixed and the pricing coefficients $\boldsymbol{\beta}_P$ are known to all SUs.

If $S_k$ is the only SU who uses the sub-band $m$, the following optimal transmit power for $S_k$ can be obtained by solving (11):

$$w^*_{S_k}\left(\boldsymbol{\beta}_P \mid L_m = \{S_k\}\right) = \left(\frac{1}{u_{S_k}(\boldsymbol{\beta}_P)} - \frac{1}{v_{S_k}}\right)^+, \tag{20}$$

where $(x)^+ = \max\{x, 0\}$, $u_{S_k}(\boldsymbol{\beta}_P) = \boldsymbol{\beta}_P^T \boldsymbol{h}_{\bullet k}$ and $v_{S_k} = v_{S_k}[l_{S_k}] = |g_{kk}[l_{S_k}]|^2$. The quantity $1/u_{S_k}(\boldsymbol{\beta}_P)$ is typically called the water-level of $w_{S_k}$, and $S_k$ should not transmit if $1/v_{S_k}$ is higher than this level.

If multiple SUs share the sub-band $m$, they will form a coalition to send and receive signals. As mentioned in Section II, in our model, sources and destinations within the sub-band $m$ cooperate with each other by forming a virtual MIMO system. Our results can be regarded as the upper bound for either the source cooperation or destination cooperation case. The optimal transmit power of other cooperative modes is discussed in Appendix A.

Assume that every member SU in coalition $L_m$ tries to maximize the rate sum and simultaneously minimize its allocated cost. Using the same method as [4], we can derive the optimal transmit power for $S_k$ as follows:

$$w^*_{S_k}\left(\boldsymbol{\beta}_P \mid |L_m| \ge 2\right) = \left(\frac{1}{u_{S_k}(\boldsymbol{\beta}_P)} - \frac{1}{\lambda_{S_k}[m]}\right)^+. \tag{21}$$

To simplify our description, let us write $w^*_{S_k}(\boldsymbol{\beta}_P \mid |L_m| \ge 2)$ in the same form as (20), with

$$v_{S_k}[m] = \begin{cases} |g_{kk}[m]|^2, & \text{if } L_m = \{S_k\}, \\ \lambda_{S_k}[m], & \text{if } |L_m| \ge 2. \end{cases} \tag{22}$$

Since $w^*_{S_k}(\boldsymbol{\beta}_P)$ maximizes the payoff of $S_k$ for the given $l_S$ and $\boldsymbol{\beta}_P$, we can claim that $w^*_{S_k}(\boldsymbol{\beta}_P)$ is an NE for the power control game of SUs. It is observed in (20) that not all SUs can use the licensed spectrum because the optimal transmit powers of some SUs may be equal to zero.
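The water-filling rule shared by (20) and (21) is short enough to write out directly. The sketch below is illustrative only: the function names and the numerical values are hypothetical, `h_col` stands for the entries of $\boldsymbol{h}_{\bullet k}$, and `v` is the effective gain selected by (22).

```python
def price_per_unit_power(beta, h_col):
    """u_{S_k}(beta_P) = beta_P^T h_{.k}: total price SU k pays per unit of transmit power."""
    return sum(b * h for b, h in zip(beta, h_col))

def optimal_power(beta, h_col, v):
    """Water-filling power (1/u - 1/v)^+, where v is |g_kk|^2 for a lone SU
    or the eigenvalue lambda_{S_k}[m] inside a coalition, as in (22)."""
    u = price_per_unit_power(beta, h_col)
    if u <= 0.0 or v <= 0.0:
        raise ValueError("u and v must be positive")
    return max(1.0 / u - 1.0 / v, 0.0)

# Hypothetical values: two PUs, one SU.
beta = [0.5, 0.25]
h_col = [0.4, 0.8]                        # u = 0.5*0.4 + 0.25*0.8 = 0.4
print(optimal_power(beta, h_col, 2.0))    # water level 2.5 exceeds 1/v = 0.5
print(optimal_power(beta, h_col, 0.3))    # 1/v is above the water level, so no transmission
```

Note that the SU needs only the broadcast prices, its own gains to the PUs, and its own effective gain; no interference limit appears in the computation.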

The main difference between (20) and the results in [4] is that we do not include the power constraint (2) in the SU optimization (there is no $\boldsymbol{q}_P$ in (20)). Instead, we let the PUs set the price as well as the pricing coefficients $\boldsymbol{\beta}_P$ and influence the transmit power $w^*_{S_k}(\boldsymbol{\beta}_P)$ of each SU so as to ensure the interference power constraint $\boldsymbol{q}_P$ in (2) is satisfied (cf. Section III-C). In other words, each SU in [4] is required to know the interference temperature limit $q_{P_j}$ of each PU as well as the transmit powers of other SUs to calculate the water-level. However, (20) and (21) do not require such information. Therefore, our formulation reduces the information requirements of the SUs.

Note that SU $S_k$ cannot transmit if the licensed spectrum is "pricier" than what $S_k$ can afford. This could be due to the close location of $S_k$ to some PUs, which prevents the SUs from using licensed spectrum. We say that $S_k$ is active if $w^*_{S_k}(\boldsymbol{\beta}_P) > 0$. We have the following result.

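Proposition 3. For a given $\boldsymbol{\beta}_P$, the SU $S_k$ is active if and only if $u_{S_k}(\boldsymbol{\beta}_P) < v_{S_k}$. Furthermore, we have $\pi_{S_k}(w^*_{S_k}, \boldsymbol{\beta}_P) \ge 0$ for all $\boldsymbol{\beta}_P$.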

Proof: See Appendix D.
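For intuition, both claims can be checked directly from the payoff form in (16). The following is only a short sketch under the assumption that the payoff is $\pi_{S_k} = \log(1 + v_{S_k} w_{S_k}) - u_{S_k} w_{S_k}$ with $u_{S_k}, v_{S_k} > 0$; the shorthand $u$, $v$, $w$ drops the subscripts, and the full argument is the one in Appendix D.

```latex
\begin{align*}
\frac{\partial \pi}{\partial w} &= \frac{v}{1 + v w} - u = 0
  \;\Longrightarrow\; w^* = \frac{1}{u} - \frac{1}{v},
  \qquad w^* > 0 \iff u < v, \\
\pi(w^*) &= \log\frac{v}{u} - u\left(\frac{1}{u} - \frac{1}{v}\right)
  = \log\frac{v}{u} - \left(1 - \frac{u}{v}\right) \;\ge\; 0,
\end{align*}
% The inequality uses \log x \ge 1 - 1/x for x > 0, applied with x = v/u.
% If u \ge v, then w^* = 0 and \pi(0) = \log 1 - 0 = 0.
```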

C. Distributed Price Adjustment Solutions for PUs

In the previous section, we showed that for any given $\boldsymbol{\beta}_P$ and sub-band allocation scheme $l_S$, there exists a unique optimal transmit power for each SU. Assuming that SUs are rational and use the optimal powers given in (20) and (21), the PUs compete with each other to maximize their payoffs without violating the power constraint in (2). We have the following result about the SE of the hierarchical game.

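Proposition 4. For a given set of active SUs $L \subset \{1, 2, \ldots, K\}$ and fixed sub-band allocation scheme $l_S$, either

1) there does not exist a $\boldsymbol{\beta}_P$ such that the interference power constraint (2) is satisfied for all $j \in \{1, 2, \ldots, J\}$, or

2) there exists a $\boldsymbol{\beta}^*_P$ satisfying the following equation,

$$\max_{j \in \{1,\ldots,J\}} \sum_{k=1}^{K} \frac{|h_{jk}|^2}{q_{P_j}} \left(\frac{1}{u_{S_k}(\boldsymbol{\beta}^*_P)} - \frac{1}{v_{S_k}}\right)^+ = 1, \tag{23}$$

and $(\boldsymbol{w}^*_S, \boldsymbol{\beta}^*_P)$ is a pure strategy SE for the constrained game, where $\boldsymbol{w}^*_S = [w^*_{S_1}(\boldsymbol{\beta}^*_P), w^*_{S_2}(\boldsymbol{\beta}^*_P), \ldots, w^*_{S_K}(\boldsymbol{\beta}^*_P)]$ and $\boldsymbol{\beta}^*_P = [\beta^*_{P_1}, \beta^*_{P_2}, \ldots, \beta^*_{P_J}]$.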

Proof: See Appendix E.

It can be observed that the SE of the hierarchical game is in general not unique: there may exist infinitely many SEs, i.e., there are infinitely many $\boldsymbol{\beta}_P$ satisfying the condition in (23). To simplify our problem, in this paper, we focus on a linear pricing function for PUs. More specifically, we assume $\boldsymbol{\beta}_P = \gamma \boldsymbol{c}$, where $\boldsymbol{c}$ is a pre-defined vector which expresses the related pricing structure of PUs. The value of $\boldsymbol{c}$ can be regarded as a competition or negotiation result among PUs. Hence, in the rest of this section, let us study the effects of these different pricing structures on the payoffs of PUs. As we will prove later, for each given $\boldsymbol{c}$, the resulting SE is unique. Furthermore, the globally optimal SE of our hierarchical game can be achieved by searching for the optimal $\boldsymbol{c}$. We have the following observations about this linear pricing model.

Proposition 5. For a given $\boldsymbol{\beta}_P$, suppose that $S_k$ maximizes its payoff $\pi_{S_k}(w^*_{S_k}(\boldsymbol{\beta}_P), \boldsymbol{\beta}_P, l_S, F_{S_k})$ over a given set $R$ of sub-bands with the PF criterion, and that $\boldsymbol{\beta}_P = \gamma \boldsymbol{c}$, where $\boldsymbol{c}$ is a vector of fixed constants and $\gamma$ is a positive real number. Then the sub-band allocation scheme in $R$ that maximizes $\pi_{S_k}$ is unique and independent of $\gamma$.

Proof: See Appendix F.

Proposition 5 shows that if the pricing coefficients $\boldsymbol{\beta}_P$ are in the form of $\gamma \boldsymbol{c}$ (i.e., $\beta_{P_j}$ increases or decreases at a constant rate), then the NE of the sub-band allocation scheme is unique and does not depend on the value of $\gamma$. This allows us to decouple the sub-band allocation problem from the price and power control game. In other words, for a fixed $\boldsymbol{c}$, the SE is unique. In Figure 4, we illustrate different sub-band allocation schemes and their corresponding pricing regions for two possible cases of a linear pricing CR network with two PUs, two SUs and two sub-bands. It is observed that there is a unique sub-band allocation scheme for each pricing pair $(\beta_{P_1}, \beta_{P_2})$. Therefore, the global pricing optimization problem can be divided into multiple sub-problems, each of which corresponds to the optimization in a pricing region, i.e.,

$$\begin{aligned} \max_{\boldsymbol{\beta}_P} \sum_{j=1}^{J} \pi_{P_j}\left(\boldsymbol{w}_S, \boldsymbol{\beta}_P \mid l_S = l_S^{\Omega_l}\right) &= K - \sum_{k=1}^{K} \frac{u_{S_k}(\boldsymbol{\beta}_P)}{v_{S_k}[l_{S_k}^{\Omega_l}]} \quad \text{for } l_S^{\Omega_l} \in D \\ \text{s.t.} \quad \sum_{k=1}^{K} \frac{|h_{jk}[l_{S_k}^{\Omega_l}]|^2}{u_{S_k}(\boldsymbol{\beta}_P)} &\le q_{P_j} + \sum_{k=1}^{K} \frac{1}{v_{S_k}[l_{S_k}^{\Omega_l}]} \end{aligned} \tag{24}$$

where $D$ is the set of all the sub-band allocation schemes corresponding to the pricing regions, i.e., $|D| = 2$ in Figure 4 (a) and $|D| = 6$ in Figure 4 (b). It can be shown that the above problem is a convex optimization problem, i.e., the objective function is a linear function of $\beta_{P_j}$ and the feasible set defined by the constraint is convex when $u_{S_k}(\boldsymbol{\beta}_P) > 0$ $\forall k \in \{1, 2, \ldots, K\}$. The above problem can be solved by using the traditional convex optimization methods [35].
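The closed form of the objective in (24) can be traced back to the cost term in (16). The sketch below rests on assumptions that are not restated in this section: each PU's payoff is taken to be its revenue $\beta_{P_j} \sum_k |h_{jk}|^2 w_{S_k}$, the entries of $\boldsymbol{h}_{\bullet k}$ are taken to be $|h_{jk}|^2$, and all $K$ SUs are taken to be active, so that $w^*_{S_k} = 1/u_{S_k} - 1/v_{S_k}$.

```latex
\begin{align*}
\sum_{j=1}^{J} \pi_{P_j}
  &= \sum_{j=1}^{J} \beta_{P_j} \sum_{k=1}^{K} |h_{jk}|^2\, w^*_{S_k}
   = \sum_{k=1}^{K} \Big(\underbrace{\sum_{j=1}^{J} \beta_{P_j} |h_{jk}|^2}_{u_{S_k}(\boldsymbol{\beta}_P)}\Big)\, w^*_{S_k} \\
  &= \sum_{k=1}^{K} u_{S_k}(\boldsymbol{\beta}_P)
     \left(\frac{1}{u_{S_k}(\boldsymbol{\beta}_P)} - \frac{1}{v_{S_k}}\right)
   = K - \sum_{k=1}^{K} \frac{u_{S_k}(\boldsymbol{\beta}_P)}{v_{S_k}} .
\end{align*}
% Since u_{S_k} is linear in \beta_P, the objective is linear in each \beta_{P_j}.
```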

Fig. 4. Illustration of two possible sub-band allocation schemes for linear pricing CR networks with two PUs ($P_1$ and $P_2$), two SUs ($S_1$ and $S_2$) and two sub-bands (sub-bands 1 and 2): we use different colors to denote different pricing regions $(\beta_{P_1}, \beta_{P_2}) \in \Omega_l$, each of which corresponds to a different sub-band allocation scheme. [Panels (a) and (b); axes $\beta_{P_1}$ and $\beta_{P_2}$; regions $\Omega_1$, $\Omega_2$ in (a) and $\Omega_1$ to $\Omega_6$ in (b), each labeled with its sub-band allocation scheme.]

Let us present the main result of this paper in the following theorem.

Theorem 1. For a given $\boldsymbol{c}$ and a set $L$ of active SUs, the optimal sub-band allocation scheme of SUs is unique, stable and can be obtained by Algorithm 1. The SE of the hierarchical game given by $(\boldsymbol{w}^*_S(\boldsymbol{\beta}^*_P), \boldsymbol{\beta}^*_P)$ is unique and optimal, where $\boldsymbol{\beta}^*_P = \gamma^* \boldsymbol{c}$ and $\gamma^*$ is given by

$$\gamma^*(\boldsymbol{c}) = \max_{j \in \{1,2,\ldots,J\}} \frac{\displaystyle\sum_{k \in L} \frac{|h_{jk}|^2}{q_{P_j}\, \boldsymbol{c}^T |\boldsymbol{h}_{\bullet k}|^2}}{\displaystyle 1 + \sum_{k \in L} \frac{|h_{jk}|^2}{q_{P_j}\, v_{S_k}}}. \tag{25}$$

Proof: The first part of the theorem directly comes from Proposition 5, and the second part comes from Proposition 4.

By substituting $\gamma^*(\boldsymbol{c})$ in (25) into (24), we can convert the joint optimization problems of CR networks into the optimization problem of the linear pricing coefficients $\boldsymbol{c}$.
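As a numerical sanity check, (25) can be evaluated for a small network and substituted back into the left-hand side of (23). The sketch below is illustrative only: the function names and all numerical values are hypothetical, `H2[j][k]` stands for $|h_{jk}|^2$, and the active set is assumed to contain every SU.

```python
def gamma_star(c, H2, q, v, active):
    """gamma*(c) as in (25). H2[j][k] = |h_jk|^2, q[j] = interference limit of PU j,
    v[k] = effective gain of SU k, active = indices of the active SUs."""
    best = 0.0
    for j in range(len(q)):
        num, den = 0.0, 1.0
        for k in active:
            c_h = sum(c[i] * H2[i][k] for i in range(len(c)))  # c^T |h_{.k}|^2
            num += H2[j][k] / (q[j] * c_h)
            den += H2[j][k] / (q[j] * v[k])
        best = max(best, num / den)
    return best

def normalized_interference(gamma, c, H2, q, v):
    """Left-hand side of (23) evaluated at beta_P = gamma * c."""
    worst = 0.0
    for j in range(len(q)):
        total = 0.0
        for k in range(len(v)):
            u = gamma * sum(c[i] * H2[i][k] for i in range(len(c)))
            total += H2[j][k] / q[j] * max(1.0 / u - 1.0 / v[k], 0.0)
        worst = max(worst, total)
    return worst

# Hypothetical network: 2 PUs, 2 SUs, both SUs active.
c = [1.0, 1.0]
H2 = [[0.2, 0.1], [0.1, 0.3]]
q = [1.0, 1.0]
v = [5.0, 4.0]
g = gamma_star(c, H2, q, v, active=[0, 1])
print(g, normalized_interference(g, c, H2, q, v))
```

With these values the second PU's constraint is the binding one, and the normalized interference at $\gamma^*$ evaluates to 1 up to rounding, as (23) requires.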

It can be observed from (24) that, to calculate the globally optimal $\boldsymbol{\beta}_P$, all PUs will need to jointly derive the optimal solutions in all the pricing regions, i.e., each PU needs to know the global information. In addition, the number of pricing regions increases dramatically with the number of SUs and PUs. Finding a simple distributed algorithm to achieve the globally optimal $\boldsymbol{\beta}^*_P$ is still an open problem which will be left for our future work.

Note that, in some systems, the PUs may cheat on the prices to induce SUs to pay more than they should [36]. This problem does not exist in our model because the prices charged by PUs are proportional to the resulting interference caused by SUs. If a PU $P_j$ tries to charge a high price (a large $\beta_{P_j}$) to SUs, it will cause SUs to decrease their transmit powers in the licensed spectrum and will eventually lower the payoff that can be obtained by PUs.

IV. JOINT PRICE ADJUSTMENT, POWER CONTROL, AND SUB-BAND ALLOCATION ALGORITHM

In this section, we propose a distributed algorithm that can reach the NEs of all three problems in Section II-C as well as an SE of the hierarchical game described in Section III-C.

Before presenting the algorithm, let us derive a feasible region of the pricing coefficients $\boldsymbol{\beta}_P$ as follows. In our model, PUs do not have a priori knowledge of the exact channel gains $h_{jk}$ and $v_{S_k}$. However, we assume that PUs can estimate an approximate range of $h_{jk}$ and $v_{S_k}$ for $j \in \{1, 2, \ldots, J\}$ and $k \in \{1, 2, \ldots, K\}$. This can be done by exploiting the common knowledge about the SU network (the maximum allowable transmit powers and the general information about the channel gains of SU networks) and detecting the highest and lowest received noise levels in the licensed spectrum.

Proposition 6. Suppose $|h_{jk}|^2 \ge \underline{h} > 0$ and $v_{S_k} \le \bar{v}$ for all $j \in \{1, 2, \ldots, J\}$, $k \in \{1, 2, \ldots, K\}$. Then each PU $P_j$ only needs to adjust its pricing coefficient $\beta_{P_j}$ within the range of $0 \le \beta_{P_j} \le \bar{\beta}$, where $\bar{\beta} = \bar{v} / \underline{h}$.

Proof: See Appendix G.

The detailed description of the joint optimization algorithm is given below.

Algorithm 2: Distributed Algorithm for OFDMA-based Cooperative CR Networks

Definitions: At iteration $t$, let
- $\boldsymbol{\beta}_P(t) = \gamma(t)\boldsymbol{c}$ be the pricing coefficient vector,
- $R(t)$ be the set of unallocated sub-bands,
- $L(t)$ be the set of active SUs,
- $A(t)$ be the set of SUs that join the licensed spectrum in iteration $t$,
- $B(t)$ be the set of SUs that are not allocated any sub-band, but can afford $\boldsymbol{\beta}_P(t)$ in some sub-band in $R(t)$, and
- $R^*(t)$ be the set of the optimal sub-bands that have already been allocated to active SUs.

Note that the sets $L(t)$ and $B(t)$ are not actually maintained by any player, but are included here to facilitate the description of the algorithm. Each SU that has not been allocated a sub-band keeps track of $R(t)$.

1) Initialization:
- Set $\boldsymbol{c} = [c_1, c_2, \ldots, c_J]^T$ with $c_j \ge \bar{\beta}$ for all $j \in \{1, 2, \ldots, J\}$.
- Set $\gamma(0) = 1$ and $\epsilon > 0$ to be a small positive constant.
- Set $R(0) = \{1, 2, \ldots, M\}$, $L(0) = \emptyset$ and $B(0) = \emptyset$.
- Set $\theta$ to be a small positive number that is known to all SUs.

2) Price Adjustment:
a) At iteration $t$, PUs update $\gamma(t) = (1 - \epsilon)\gamma(t-1)$, $\boldsymbol{\beta}_P(t) = \gamma(t)\boldsymbol{c}$, and

$$B(t) = \left\{S_k \notin L(t) : \boldsymbol{\beta}_P(t)^T \boldsymbol{h}_{\bullet k}[l] < v_{S_k}[l] \text{ for some } l \in R(t)\right\}. \tag{26}$$

Each PU $P_j$ broadcasts $\beta_{P_j}$.
b) Each $S_k \in B(t)$ only needs to sense the sub-bands in $R(t)$ and negotiate with other active SUs in the sub-bands $l \in R^*(t)$ using Step 2) in Algorithm 1 to choose its sub-band $l^*_{S_k}$.
c) If $l^*_{S_k} \in R(t)$, then $R^*(t) = R^*(t) \cup \{l^*_{S_k}\}$ and all the active SUs repeat Step 2-b) to update their optimal sub-bands.
d) If $N_j > q_{P_j}$, $P_j$ broadcasts a "stop" message to all PUs. Go to Step 3).
e) If $N_j \le q_{P_j}$ $\forall j \in \{1, 2, \ldots, J\}$, no message has been broadcast by any PUs. In this case, SUs update $L(t+1) = L(t) \cup A(t)$. If $l^*_{S_k} \notin R(t)$, $R(t+1) = R(t)$; else $R(t+1) = R(t) \setminus \{l^*_{S_k}\}$ $\forall S_k \in A(t)$. Set $t = t + 1$ and go to Step 2-a).

3) Termination: The algorithm ends with solution $\gamma^* = \gamma(t-1)$, $\boldsymbol{\beta}^*_P = \boldsymbol{\beta}_P(t-1)$, $L^* = L(t-1)$ and $\boldsymbol{w}^*_S = [w^*_{S_1}(\boldsymbol{\beta}^*_P), w^*_{S_2}(\boldsymbol{\beta}^*_P), \ldots, w^*_{S_K}(\boldsymbol{\beta}^*_P)]$, where

$$w^*_{S_k}(\boldsymbol{\beta}^*_P) = \begin{cases} \alpha_{S_k}\left(\dfrac{1}{u_{S_k}(\boldsymbol{\beta}^*_P)} - \dfrac{1}{v_{S_k}}\right), & \text{if } S_k \in L^*, \\ 0, & \text{otherwise,} \end{cases} \tag{27}$$

and sub-band allocation $l^*_S = [l^*_{S_1}, l^*_{S_2}, \ldots, l^*_{S_K}]$ where, for $S_k \in L^*$, $l^*_{S_k}$ is determined in Step 2-b).
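The price-decreasing loop at the heart of Algorithm 2 can be sketched as follows. This is a simplified illustration, not the full algorithm: each SU is assumed to occupy a fixed sub-band alone (so the sensing and negotiation steps are omitted), $\alpha_{S_k}$ is taken to be 1, $N_j$ is taken to be the total interference $\sum_k |h_{jk}|^2 w_{S_k}$ received at $P_j$, and all names and numerical values are hypothetical.

```python
def interference(gamma, c, H2, v):
    """Interference N_j at each PU j when every SU uses the water-filling power
    (1/u - 1/v)^+ under beta_P = gamma * c. H2[j][k] = |h_jk|^2."""
    J, K = len(H2), len(v)
    w = []
    for k in range(K):
        u = gamma * sum(c[j] * H2[j][k] for j in range(J))
        w.append(max(1.0 / u - 1.0 / v[k], 0.0))
    return [sum(H2[j][k] * w[k] for k in range(K)) for j in range(J)]

def price_decreasing(c, H2, q, v, eps=0.01, max_iter=100000):
    """Shrink gamma by (1 - eps) per iteration until some PU's limit would be exceeded.
    Returns the last feasible gamma, i.e. gamma(t - 1), and the stopping iteration t."""
    gamma_prev = 1.0  # gamma(0) = 1
    for t in range(1, max_iter + 1):
        gamma = (1.0 - eps) * gamma_prev
        N = interference(gamma, c, H2, v)
        if any(n_j > q_j for n_j, q_j in zip(N, q)):
            return gamma_prev, t  # a PU would broadcast "stop"; keep gamma(t - 1)
        gamma_prev = gamma
    raise RuntimeError("no PU limit was reached within max_iter iterations")

# Hypothetical network: 2 PUs, 2 SUs; c is large enough that no SU is active at gamma = 1.
c = [30.0, 30.0]
H2 = [[0.2, 0.1], [0.1, 0.3]]
q = [1.0, 1.0]
v = [5.0, 4.0]
eps = 0.01
gamma_end, t_end = price_decreasing(c, H2, q, v, eps=eps)
print(gamma_end, t_end, interference(gamma_end, c, H2, v))
```

Because the loop stops one step before a limit is violated, the returned value exceeds the value of (25) for this network by less than a factor of $1/(1-\epsilon)$.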

The main idea of Algorithm 2 is to allow the PUs to gradually decrease their pricing coefficients so that the high-payoff SUs will always be the first to join the licensed spectrum. In this way, the newly active SUs, as well as the possible sub-band negotiation among SUs, will be limited in each step. The following theorem shows the equilibrium that is achieved by Algorithm 2.

Fig. 5. Illustration of the price decreasing process in Algorithm 2: we consider the case with two PUs ($P_1$ and $P_2$) and four SUs ($S_1$, $S_2$, $S_3$ and $S_4$). Dashed lines are three price decreasing schemes, (i), (ii) and (iii), of Algorithm 2, which result in three different sub-band allocation schemes. Note that if $l_{S_k} \ne l_{S_j}$ for $l_{S_k}, l_{S_j} \in \{1, \ldots, 4\}$ there is no cooperation among SUs. If $l_{S_k} = l_{S_j}$, $S_k$ and $S_j$ will start a negotiating process as described in Step 2) in Algorithm 1. [Axes: $\beta_{P_1}$ and $\beta_{P_2}$; regions $\Omega_1$, $\Omega_2$, $\Omega_3$.]

Theorem 2. Suppose that Algorithm 2 terminates with $\gamma^*$ given in (25). Then, we have the following equilibrium.

E1) The resulting $\boldsymbol{w}^*_S$ is the NE of the power control game for SUs with the given $\boldsymbol{\beta}^*_P = \boldsymbol{c}\gamma^*$ and sub-band allocation scheme $l_S$,
E2) The resulting $l^*_S$ is the stable point for the sub-band coalition formation game,
E3) The resulting $\boldsymbol{\beta}^*_P$ is the NE for the price adjustment game for PUs, and
E4) The resulting $(\boldsymbol{w}^*_S, l^*_S)$ is the SE of the power and pricing game with the given $L^*$.

Furthermore, if Algorithm 2 does not terminate with (25), the solutions $\boldsymbol{w}^*_S$, $l^*_S$, $\boldsymbol{\beta}^*_P$ and $(\boldsymbol{w}^*_S, l^*_S)$ are within a multiple of $(1 - \epsilon)$ of the equilibrium points described in E1) - E4) above.

Proof: See Appendix H.

Since we assume that PUs have no knowledge of the channel conditions, i.e., the channel gains between SUs and PUs as well as those among PUs, the price adjustment scheme which depends on $\boldsymbol{c}$ should be pre-defined. As observed previously, different price decreasing schemes of $\boldsymbol{\beta}_P$ (different values of $\boldsymbol{c}$) may result in different sub-band allocations, as well as $l^*_S$. In Figure 5, we show three price adjustment schemes using Algorithm 2 for a CR network with two PUs and four SUs that result in three corresponding sub-band allocation schemes. More specifically, the adjustment scheme (i) in Figure 5 results in $S_4$ and $S_1$ being the first and last SUs, respectively, to join the licensed spectrum. On the other hand, in the adjustment scheme (iii), $S_2$ and $S_3$ are the first and last SUs to join the licensed spectrum, respectively.

Another observation in Algorithm 2 is that there exists a tradeoff between the convergence speed and the accuracy of the final result. More specifically, a smaller value of $\epsilon$ will result in a tighter bound on the possible deviation from the SE solution. However, the number of iterations increases as $\epsilon$ decreases. The following result gives a range for the number of iterations required for Algorithm 2.

Fig. 6. Payoffs of two SUs and one PU with different distances between SUs and PUs. [Top: linear network layout with $S_1$, $S_2$ and the moving $P_1$. Middle and bottom panels: payoff of SU 1, payoff of SU 2, payoff of PU 1, and the payoffs of PU 1 obtained from SU 1 and from SU 2, versus the location of PU 1; the bottom panel also shows $\beta_{P_1}$.]

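Proposition 7. Suppose $\underline{h} \le |h_{jk}|^2 \le \bar{h}$, $\alpha_{S_k} \le \bar{\alpha}$ and $\underline{v} \le v_{S_k}[l] \le \bar{v}$ for all $j \in \{1, 2, \ldots, J\}$, $k \in \{1, 2, \ldots, K\}$, and $l \in \{1, 2, \ldots, M\}$. Let $q = \min_{j \in \{1,2,\ldots,J\}} q_{P_j}$ and $n$ be the number of iterations required by Algorithm 2. Then,

$$\frac{\log\left(\underline{h}\, \boldsymbol{c}^T \boldsymbol{1}\right) - \log(\bar{v})}{\log (1 - \epsilon)^{-1}} < n \le \frac{\log\left(\bar{h}\, \boldsymbol{c}^T \boldsymbol{1}\right) + \log\left(\dfrac{q}{K \bar{\alpha}\, \underline{h}} + \dfrac{1}{\underline{v}}\right)}{\log (1 - \epsilon)^{-1}}, \tag{28}$$

where $\boldsymbol{1}$ is a length-$J$ vector whose components are all 1's.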

Proof: See Appendix I.
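The tradeoff between $\epsilon$ and the iteration count follows from the geometric update $\gamma(t) = (1-\epsilon)\gamma(t-1)$ alone. The sketch below does not evaluate the bounds in (28); it only counts how many updates are needed for $\gamma$ to fall from $\gamma(0)$ to a given target value. The function name and the target values are hypothetical.

```python
import math

def iterations_to_reach(gamma_target, eps, gamma0=1.0):
    """Smallest n such that gamma0 * (1 - eps)**n <= gamma_target."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    if not 0.0 < gamma_target <= gamma0:
        raise ValueError("gamma_target must lie in (0, gamma0]")
    # Both logarithms are taken of ratios >= 1, matching the log(1 - eps)^{-1} denominator.
    return math.ceil(math.log(gamma0 / gamma_target) / math.log(1.0 / (1.0 - eps)))

coarse = iterations_to_reach(0.1, eps=0.1)
fine = iterations_to_reach(0.1, eps=0.01)
print(coarse, fine)
```

Reducing $\epsilon$ by a factor of ten increases the iteration count roughly tenfold, while tightening the worst-case overshoot of the final price from 10% to 1%.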

V. NUMERICAL RESULTS

In this section, simulation results are presented to verify the performance of the proposed algorithm. To study the effect of the prices imposed by PUs on the performance of different SUs, let us first consider the case where two SUs $S_1$ and $S_2$ share the spectrum of one PU $P_1$. Let us assume that the channel gain $h_{j1}$ is given by $h_{j1} = \bar{h}_{j1} / d^{\xi}_{lP_1}$ for $j \in \{1, 2\}$ and $l \in \{S_1, S_2\}$, where $\bar{h}_{j1}$ is the average channel fading coefficient, $d_{S_1P_1}$ and $d_{S_2P_1}$ are the distances between $P_1$ and $S_1$ and between $P_1$ and $S_2$, respectively, and $\xi$ is the fading exponent. Assume that both SUs are located in a linear network and the PU $P_1$ moves from the left to the right as shown at the top of Figure 6. The payoffs of both SUs under different PU locations are shown in the middle of Figure 6, where we assume the pricing coefficient $\beta_{P_1}$ of $P_1$ is fixed. It is observed that the payoff of each SU decreases when the PU is close to it. This verifies our results in Proposition 4 that the transmit power of each SU is decreased when the channel gain between this SU and the PU becomes higher. Similarly, it is observed that the payoff of the PU $P_1$ is also minimized when the PU approaches either SU because $\pi_{P_1}$ decreases with the transmit powers of SUs. At the bottom of Figure 6, we present the payoffs of the PU and SUs for a fixed interference temperature limit $q_{P_1}$ at $P_1$. Note that, in this case, the pricing coefficient $\beta_{P_1}$ of the PU slightly decreases when $P_1$ approaches either SU. This is because the optimal power control method forces the SUs to greatly decrease their transmit powers when the channel gains between SUs and the PU increase, and hence $P_1$ has to decrease its pricing coefficient to counteract the payoff reduction caused by the decreasing transmit power of SUs.

Fig. 7. Payoffs of PUs under different $\gamma$. [Axes: $\gamma$; number of active SUs; payoff of PU $P_1$.]

In Figures 7 – 10, we consider a system with one PU and 10 SUs and study the effects of the pricing coefficient decreasing process of the PU on the performance of SU and PU networks. Note that for the one PU case, we assume $\boldsymbol{c} = c_{P_1} = 1$ and $\beta_{P_1} = \gamma$. More specifically, in Figures 7 and 8, we show the payoff and the interference power of the PU and the number of active SUs under different $\gamma$. It is observed that the value of $\gamma$ controls the payoff and the interference level of $P_1$ as well as the number of active SUs. Similarly, we can observe from Figure 9 that the payoff sum of SUs also decreases with $\gamma$. Note that slight payoff jumps can be observed at some points in Figures 7 and 9. This is because when the value of $\gamma$ decreases, some new SUs will be activated to join the competition with the existing SUs in some coalitions or sub-bands. This will eventually cause some SUs to change their original sub-bands. When the sub-band allocation scheme changes, the SE of the hierarchical game will also change, which will cause changes to the SU transmit powers on these sub-bands as well as the payoffs (or payoff sum) of the PU (or SUs). However, these SE changes during the pricing decreasing process in Algorithm 2 do not violate our conclusion in Theorem 2 because we always force the pricing coefficient of the PU to decrease to the interference temperature limit in (2), which as proved in Theorem 1 is always an SE for the resulting sub-band allocation scheme. In addition, as observed in Figure 10, where we compare the number of active SUs and the number of coalitions under different $\gamma$, the number of coalitions formed by SUs is usually small and hence the SE changes are also limited during the decreasing process of $\gamma$ in Algorithm 2.

Fig. 8. The interference of $P_1$ under different $\gamma$. [Axes: $\gamma$; number of active SUs; interference power at PU $P_1$.]

Fig. 9. Payoffs of SUs under different $\gamma$. [Axes: $\gamma$; number of active SUs; payoff sum of SUs.]

Fig. 10. Coalition formation of SUs under different $\gamma$. [Axes: $\gamma$; number of active SUs; number of coalitions.]

Fig. 11. Simulation for a CR network with $J$ PUs and $K$ SUs.

Fig. 12. Number of active SUs and coalitions of SUs as a function of $\gamma$: $J = 2$, $M = K = 50$ and $\boldsymbol{c}_1 = \mathrm{norm}([1, 1])$, $\boldsymbol{c}_2 = \mathrm{norm}([1, \sqrt{2}])$, $\boldsymbol{c}_3 = \mathrm{norm}([1, 2])$, where $\mathrm{norm}(\boldsymbol{v}) = \boldsymbol{v} / \|\boldsymbol{v}\|$. $\boldsymbol{c}^*$ is the optimal $\boldsymbol{c}$ calculated using (24).

Fig. 13. Payoff of $P_1$ and $P_2$ as a function of $\gamma$: $J = 2$, $M = K = 50$ and $\boldsymbol{c}_1 = \mathrm{norm}([1, 1])$, $\boldsymbol{c}_2 = \mathrm{norm}([1, \sqrt{2}])$, $\boldsymbol{c}_3 = \mathrm{norm}([1, 2])$. $\boldsymbol{c}^*$ is the optimal $\boldsymbol{c}$ calculated using (24).

Fig. 14. Number of coalitions and active SUs as a function of the number of available sub-bands: $J = 2$, $K = 100$.

To investigate the performance of CR networks with multiple PUs, we consider a planar network in which $J$ PUs, $P_1, P_2, \ldots, P_J$, are located in the center, and all the SUs are randomly located as shown in Figure 11. We first consider the two PU case ($J = 2$) in Figures 12 – 14. Our simulation result in Figure 12 shows that different $\boldsymbol{c}$ may lead to a different number of active SUs, as well as different coalitions formed among SUs. We compare the payoffs of $P_1$ and $P_2$ under different values of $\gamma$ in Figure 13. Similar to Figure 7, for all four predefined values of $\boldsymbol{c}$, the payoffs of both PUs decrease with $\gamma$, and the value of $\boldsymbol{c}$ affects the payoffs of both PUs. Note that in Figures 12 and 13, we also present the numerical results of PUs and SUs with the optimal $\boldsymbol{c}^*$ calculated in (24). It is observed that $\boldsymbol{c}^*$ greatly improves the payoff of both PUs. However, as mentioned at the end of Section III, the optimization problem of $\boldsymbol{c}$ becomes very complex when the number of PUs and SUs is large. In addition, calculating the optimal $\boldsymbol{c}^*$ generally requires each PU to know the global information, which may not be possible for some CR networks with a large number of PUs. The number of coalitions for SUs under different numbers of available sub-bands is presented in Figure 14, which verifies our observation in Section III-A that the core of the sub-band allocation game is always empty when the secondary sources are close to each other. More specifically, it is observed that the number of coalitions decreases dramatically when the number of sub-bands approaches or exceeds the number of SUs. This also confirms our motivation for Algorithm 2, where we let the PUs choose high initial pricing coefficients in the beginning and then gradually decrease $\boldsymbol{\beta}_P$ to allow the high-payoff SUs to join the licensed spectrum first. In other words, a low initial value of $\boldsymbol{\beta}_P$ might cause many low-payoff SUs to compete and negotiate with the high-payoff ones, which would introduce more communication overhead.

Fig. 15. Number of coalitions and active SUs as a function of the number of PUs: $M = K = 100$ and $\boldsymbol{c}_1 = \mathrm{norm}([1, 1, \ldots, 1])$, $\boldsymbol{c}_2 = \mathrm{norm}([1, \sqrt{2}, \ldots, \sqrt{J}])$, $\boldsymbol{c}_3 = \mathrm{norm}([1, 2, \ldots, J])$.

In Figures 15 and 16, we present the performance of SU and PU networks under different numbers of PUs. It is observed that the number of PUs affects the coalition formation process as well as the active SUs in the CR network. In Figure 16, we observe that the average payoff of PUs decreases with the number of PUs. This is because a larger number of PUs imposes more limits on the transmit powers of SUs.

Fig. 16. Average payoff of PUs as a function of the number of PUs: M = K = 100 and c1 = norm([1, 1, . . . , 1]), c2 = norm([1, √2, . . . , √J]), c3 = norm([1, 2, . . . , J]). [Plot omitted: x-axis number of PUs; y-axis average payoff; curves for c = c1, c = c2 and c = c3.]

VI. CONCLUSION

We have considered OFDMA-based cooperative CR networks in which the licensed spectrum has been divided into many sub-bands, and each SU can access at most one sub-band. We have first addressed the problem of distributed allocation of sub-bands to SUs by using a coalition formation game-based framework. Then we have considered the joint optimization of both SU and PU networks by introducing a Stackelberg game-based hierarchical formulation. In this game, PUs control the price at which SUs can access a sub-band so that the total interference at each PU is below a predefined threshold. By fitting the coalition formation game into the hierarchical framework, we show that the optimization of sub-band allocation of SUs, transmit powers of SUs and the pricing function of PUs are linked with each other, which makes the joint optimization possible. We have proved that if the pricing coefficients of PUs follow a fixed linear relationship, the SE of the hierarchical game framework will be unique and optimal and the sub-band allocation of SUs will be stable. Our proposed hierarchical game theoretic framework can be used to study the interactions in CR networks with more complex payoff functions, e.g., the benefit of each SU can be throughput, outage probability, etc., and the benefit of each PU can be a complex polynomial function [5] of the resulting interference of SUs. As long as the pricing functions of PUs are monotone functions of the resulting interference caused by SUs, PUs can always use the price to limit SUs' interference level within the tolerable range and SUs can cooperate or compete with each other when they access the spectrum of PUs. Another possible extension of our work is to develop simple and effective methods to approach the optimal pricing coefficients of PU networks in a distributed manner.

APPENDIX A
EXTENDING TO OTHER COOPERATIVE MODES

Let us briefly describe how to extend the results of this paper to other cooperation modes, i.e., transmitter cooperation and receiver cooperation. Due to space limitations, we mainly describe the transmitter cooperation case; the receiver cooperation case can be obtained similarly. It has been observed that a CR network often operates in a low SNR scenario and hence it is necessary to apply some interference mitigation methods to avoid cross interference between SUs. In this section, we assume all the secondary sources can cooperate with each other to accomplish transmit beamforming. Following the same line as [37], [38], [39], let us introduce the encoding coefficient $\tau_{S_k}$ for each SU $S_k$. We can then write the received signal at the $k$th secondary destination as follows,

$$y_{S_k} = \tau_{S_k} g_{kk} x_{S_k} + \sum_{\substack{j \neq k \\ j \in L_m}} \tau_{S_j} g_{jk} x_{S_j} + \sigma_{S_k} + \sigma'_{S_k}. \tag{29}$$

If each SU can choose $\tau_{S_j}$ such that $\sum_{j \neq k,\, j \in L_m} \tau_{S_j} g_{jk} x_{S_j} = 0$ $\forall j \in L_m$, then we can convert the transmitter cooperation method for the CR network into a virtual MIMO channel. Then, following the same line as that in Appendix B, we can prove that the payoff of each cooperative SU in the coalition $L_m$ is given by

$$\pi_{S_k}[m] = \log\left(1 + \frac{|\tau_{S_k}[m]\, g_{kk}[m]|^2}{\sigma^2_{S_k} + \sigma'^2_{S_k}}\, w_{S_k}\right) - \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k} w_{S_k}. \tag{30}$$

By replacing $v_{S_k}[m]$ using

$$v_{S_k}[m] = \begin{cases} |g_{kk}[m]|^2, & \text{if } L_m = l_{S_k}, \\[4pt] \dfrac{|\tau_{S_k}[m]\, g_{kk}[m]|^2}{\sigma^2_{S_k} + \sigma'^2_{S_k}}, & \text{if } |L_m| \geq 2, \end{cases} \tag{31}$$

we can obtain the results, i.e., the optimal transmit powers in (20) and (21), the optimal payoff division in Proposition 1 and the sub-band allocation and joint optimization algorithms (Algorithms 1 and 2), etc., for the transmitter cooperation-based CR networks. Similarly, we can obtain the results by allowing the receivers to cooperate with each other using the receive beamforming method [37], [40].

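APPENDIX B
PROOF OF PROPOSITION 1

The optimal transmit power of $S_k$ in the coalition $L_m$ is obtained by finding the value of $w_{S_k}[m]$ that maximizes $\pi_{S_k}[m]$ in (16). Following the same line as [34], we can write the payoff sum of all the cooperative SUs as follows,

$$\sum_{S_k \in L_m} \pi_{S_k}(w_{S_k}, \boldsymbol{\beta}_P) = \sum_{S_k \in L_m} \left(R_{S_k}[m] - \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k} w_{S_k}\right) = \sum_{S_k \in L_m} \log\left(1 + w_{S_k} \lambda_{S_k}\right) - \sum_{S_k \in L_m} \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k} w_{S_k},$$

where $\lambda_{S_k}$ is the $k$th nonzero eigenvalue of $\mathbf{G}\mathbf{G}^T$.

Maximizing the payoff sum of the cooperative SUs, we can obtain the resulting $w^*_{S_k}[m]$ (given in (21)) as the optimal transmit power for a $|L_m|$-input $|L_m|$-output virtual MIMO channel with the power constraint in (2).

By substituting the optimal transmit power $w^*_{S_k}[m]$ into the capacity sum, we have

$$\sum_{S_k \in L_m} \pi_{S_k}[m] = \sum_{S_k \in L_m} \left[\log\left(\frac{v_{S_k}[m]}{u_{S_k}(\boldsymbol{\beta}_P)}\right) - 1 + \frac{u_{S_k}(\boldsymbol{\beta}_P)}{v_{S_k}[m]}\right].$$

Using the fact that $\pi_{S_k}(w_{S_k}, \boldsymbol{\beta}_P)$ increases with $v_{S_k}[m]$ if $u_{S_k}(\boldsymbol{\beta}_P) < v_{S_k}$, we can prove that, for any other point $v'_{S_k}[m]$, the following condition is satisfied,

$$\begin{aligned}
&\sum_{S_k \in L_m} \left.\frac{\partial \pi_{S_k}[m]}{\partial v_{S_k}[m]}\right|_{v_{S_k}[m] = \lambda_{S_k}[m]} \left(v'_{S_k}[m] - \lambda_{S_k}[m]\right) \\
&\quad = \sum_{S_k \in L_m} \left[\frac{1}{\lambda_{S_k}[m]} \left(1 - \frac{u_{S_k}(\boldsymbol{\beta}_P)}{\lambda_{S_k}[m]}\right) \cdot \left(v'_{S_k}[m] - \lambda_{S_k}[m]\right)\right] \\
&\quad \Rightarrow \sum_{S_k \in L_m} \frac{v'_{S_k}[m] - \lambda_{S_k}[m]}{\lambda_{S_k}[m]} \leq 0,
\end{aligned} \tag{32}$$

which directly leads to the proportional fairness criterion in [27, Equation (1)] being satisfied.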

APPENDIX C
PROOF OF PROPOSITION 2

To prove that the resulting coalitions in Algorithm 1 are stable, we need to show that, at the end of Algorithm 1, no SU has an incentive to unilaterally deviate from its selected sub-band. Suppose that the coalition in Algorithm 1 is not stable, i.e., the SU $S_k$ can find another sub-band $l'_{S_k} \neq l^*_{S_k} \in L$ or coalition which provides a higher payoff than $\pi_{S_k}(l^*_{S_k})$, i.e.,

$$\pi_{S_k}\left(l^*_{S_k}, l^*_{-S_k}, F_{l^*_{S_k}}\right) < \pi_{S_k}\left(l'_{S_k}, l^*_{-S_k}, F_{l'_{S_k}}\right) \tag{33}$$

where $l^*_{S_k}$ is the sub-band allocated by Algorithm 1. First, we can easily prove that $l'_{S_k} \in R^*$ because, even if $l'_{S_k} \notin R^*$, $l'_{S_k}$ will be included in $R^*$ by Step 2-b) in Algorithm 1. We then consider the case that $l'_{S_k}$ can be exclusively occupied by $S_k$, i.e., $F_{l'_{S_k}} = \emptyset$. In this case, (33) can be rewritten as

$$\pi_{S_k}\left(l^*_{S_k}, l^*_{-S_k}, F_{l^*_{S_k}}\right) < \pi_{S_k}\left(l'_{S_k}, l^*_{-S_k}, F_{l'_{S_k}} = \emptyset\right), \tag{34}$$

which contradicts Steps 1-b) and 2-b) in Algorithm 1. Let us now consider the case of $F_{l'_{S_k}} \neq \emptyset$, i.e., $S_k$ shares the sub-band $l'_{S_k}$ with other SUs. It is observed that, if there exists an SU $S_j \in L_{l'_{S_k}}$ whose channels are correlated with those of $S_k$, the payoff allocated by (16) in Proposition 1 to one SU will be much larger than that allocated to the other SU. In this case, joining the sub-band $l'_{S_k}$ will either cause a low payoff for $S_k$, which contradicts (33), or force $S_j$ to leave $l'_{S_k}$ ($l^*_{S_j}$ is not the equilibrium point of $S_j$), which contradicts Step 2-a). If all the channels of SUs in the sub-band $l'_{S_k}$ are almost orthogonal to each other, i.e., joining $l'_{S_k}$ will not cause existing SUs to leave, (33) can be rewritten as $\pi_{S_k}(l^*_{S_k}, F_{l^*_{S_k}} = \emptyset) < \pi_{S_k}(l'_{S_k}, F_{l'_{S_k}} \neq \emptyset)$. This contradicts Step 2-a) in Algorithm 1. In this way, when Algorithm 1 ends, each sub-band will be either exclusively occupied by one SU or shared by multiple SUs whose channels are relatively orthogonal to each other, and no SU will have an incentive to leave its optimal sub-band. This concludes the proof.


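APPENDIX D
PROOF OF PROPOSITION 3

It is clear from (20) that $w^*_{S_k} > 0$ if and only if $u_{S_k}(\boldsymbol{\beta}_P) < v_{S_k}$. If $w^*_{S_k} = 0$, then $\pi_{S_k}(w^*_{S_k}, \boldsymbol{\beta}_P) = 0$. Suppose that $u_{S_k}(\boldsymbol{\beta}_P) < v_{S_k}$. By substituting the optimal transmit power $w^*_{S_k}$ in (20) into $\pi_{S_k}(w^*_{S_k}, \boldsymbol{\beta}_P)$ in (3), we have

$$\pi_{S_k}(w^*_{S_k}, \boldsymbol{\beta}_P) = \log\left(\frac{v_{S_k}}{u_{S_k}(\boldsymbol{\beta}_P)}\right) - 1 + \frac{u_{S_k}(\boldsymbol{\beta}_P)}{v_{S_k}} \geq 0, \tag{35}$$

where the inequality comes from the fact that $1 - x \leq \exp(-x)$ for all $x$, applied with $x = \log\left(v_{S_k}/u_{S_k}(\boldsymbol{\beta}_P)\right)$.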
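The sign argument in (35) can be checked numerically. The sketch below is only an illustration: it assumes, as the form of (38) suggests, that the payoff in (3) reads log(1 + v w) − u w and that the optimal power in (20) is w∗ = (1/u − 1/v)^+. The function names and the numerical values are hypothetical and not part of the paper.

```python
import math


def optimal_power(u: float, v: float) -> float:
    """Assumed form of (20): w* = (1/u - 1/v)^+, which is zero whenever u >= v."""
    return max(0.0, 1.0 / u - 1.0 / v)


def su_payoff(w: float, u: float, v: float) -> float:
    """Assumed form of (3): rate log(1 + v*w) minus the payment u*w."""
    return math.log(1.0 + v * w) - u * w


def closed_form_payoff(u: float, v: float) -> float:
    """Right-hand side of (35); meaningful only when u < v."""
    return math.log(v / u) - 1.0 + u / v


if __name__ == "__main__":
    v = 2.0  # hypothetical channel quality v_Sk
    for u in (0.25, 0.5, 1.0, 1.9, 2.0, 3.0):  # hypothetical prices u_Sk(beta_P)
        w = optimal_power(u, v)
        print(f"u={u:4.2f}  w*={w:6.3f}  payoff={su_payoff(w, u, v):8.5f}")
```

For prices with u < v the printed payoff is positive and shrinks towards zero as u approaches v; for u ≥ v the SU stays silent and earns zero, in line with Proposition 3.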

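APPENDIX E
PROOF OF PROPOSITION 4

Let us first prove that, given a sub-band allocation, there exists a mixed strategy SE for the price and power control game. It can be shown that $\beta_{P_j}$ takes values in a non-empty compact set. In addition, the payoff function $\pi_{P_j}(\mathbf{w}^*_S(\boldsymbol{\beta}_P), \beta_{P_j})$ is continuous in this set. Therefore, there exists a mixed strategy SE [41]. Let us now focus on proving the existence of a pure strategy SE for the hierarchical game. Let $A_k = \{\boldsymbol{\beta}_P : \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k} < v_{S_k}\ \forall S_k \in L\}$, where the set $L$ of active SUs is fixed. Hence $\boldsymbol{\beta}_P \in \bigcap_{k \in L} A_k$ takes values in a convex set. Suppose there exists a $\boldsymbol{\beta}_P$ satisfying the condition in (23). Let us now show that $\pi_{P_j}(\mathbf{w}^*_S(\boldsymbol{\beta}_P), \beta_{P_j})$ is a quasi-concave function of $\beta_{P_j}$. Substituting (20) into (4), we have

$$\begin{aligned}
\pi_{P_j}(\mathbf{w}^*_S, \beta_{P_j})
&= \sum_{S_k \in L} |h_{jk}|^2 \left(\frac{\beta_{P_j}}{\beta_{P_j} |h_{jk}|^2 + \sum_{i=1,\, i \neq j}^{J} \beta_{P_i} |h_{ik}|^2} - \frac{\beta_{P_j}}{v_{S_k}}\right)^+ \\
&= \sum_{S_k \in L} |h_{jk}|^2 \left(\frac{1}{|h_{jk}|^2} - \frac{\sum_{i=1,\, i \neq j}^{J} \beta_{P_i} |h_{ik}|^2}{|h_{jk}|^2 \left(\beta_{P_j} |h_{jk}|^2 + \sum_{i=1,\, i \neq j}^{J} \beta_{P_i} |h_{ik}|^2\right)} - \frac{\beta_{P_j}}{v_{S_k}}\right)^+ .
\end{aligned} \tag{36}$$

Evaluating the second derivative of $\pi_{P_j}(\mathbf{w}^*_S, \beta_{P_j})$, we obtain

$$\frac{\partial^2 \pi_{P_j}(\mathbf{w}^*_S, \beta_{P_j})}{\partial \beta_{P_j}^2} = -\sum_{S_k \in L} \frac{2 |h_{jk}|^4 \sum_{i=1,\, i \neq j}^{J} \beta_{P_i} |h_{ik}|^2}{\left(\sum_{i=1}^{J} \beta_{P_i} |h_{ik}|^2\right)^3} < 0. \tag{37}$$

Therefore, $\pi_{P_j}(\mathbf{w}^*_S, \beta_{P_j})$ is a concave function of $\beta_{P_j}$. In other words, we can always find a $\beta^*_{P_j}$ that maximizes the value of $\pi_{P_j}(\mathbf{w}^*_S, \beta_{P_j})$ for the given $\boldsymbol{\beta}_{-P_j}$. This concludes the proof.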
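The concavity claim can be sanity-checked numerically. The following sketch compares a central finite-difference second derivative of the first line of (36) with the closed form obtained by differentiating that line twice, at a point where every SU in L is active. The shorthand a[k] = |h_jk|^2 and B[k] = sum over i ≠ j of β_Pi |h_ik|^2, the function names, and all numbers are illustrative assumptions.

```python
def pu_payoff(beta_j, a, B, v):
    """pi_Pj as in the first line of (36).

    a[k] = |h_jk|^2, B[k] = sum over i != j of beta_Pi * |h_ik|^2, v[k] = v_Sk.
    """
    return sum(
        a_k * max(0.0, beta_j / (beta_j * a_k + B_k) - beta_j / v_k)
        for a_k, B_k, v_k in zip(a, B, v)
    )


def second_derivative_closed_form(beta_j, a, B):
    """Second derivative in beta_j, valid where every SU in L is active."""
    return -sum(
        2.0 * a_k ** 2 * B_k / (beta_j * a_k + B_k) ** 3 for a_k, B_k in zip(a, B)
    )


def second_derivative_numeric(beta_j, a, B, v, step=1e-3):
    """Central finite difference of pu_payoff with respect to beta_j."""
    centre = pu_payoff(beta_j, a, B, v)
    up = pu_payoff(beta_j + step, a, B, v)
    down = pu_payoff(beta_j - step, a, B, v)
    return (up - 2.0 * centre + down) / step ** 2


if __name__ == "__main__":
    a, B, v = [1.0, 0.5], [0.3, 0.2], [4.0, 3.0]  # hypothetical two-SU example
    print(second_derivative_closed_form(1.0, a, B))
    print(second_derivative_numeric(1.0, a, B, v))
```

Both values are negative and agree to within the finite-difference error, which is what concavity in β_Pj requires on the region where the (·)^+ terms are active.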

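APPENDIX F
PROOF OF PROPOSITION 5

Let $x_k[l] = \dfrac{v_{S_k}[l]}{\boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}[l]}$. We can rewrite $\pi_{S_k}$ in (3) as follows,

$$\begin{aligned}
\pi_{S_k}\left(w^*_{S_k}(\boldsymbol{\beta}_P), \boldsymbol{\beta}_P; l_S\right)
&= \log\left(1 + |g_{S_k}[l]|^2 w^*_{S_k}(\boldsymbol{\beta}_P)\right) - \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}[l]\, w^*_{S_k}(\boldsymbol{\beta}_P) \\
&= \log\left(1 + v_{S_k}[l] \left(\frac{1}{\boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}[l]} - \frac{1}{v_{S_k}[l]}\right)^+\right) - \boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}[l] \left(\frac{1}{\boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}[l]} - \frac{1}{v_{S_k}[l]}\right)^+ \\
&= \log\left(1 + \left(x_k[l] - 1\right)^+\right) - \left(1 - \frac{1}{x_k[l]}\right)^+ .
\end{aligned} \tag{38}$$

If $x_k[l] \leq 1$, $\pi_{S_k} = 0$. If $x_k[l] > 1$, we have

$$\frac{\partial \pi_{S_k}}{\partial x_k[l]} = \frac{1}{x_k[l]} - \frac{1}{x_k^2[l]} > 0, \tag{39}$$

and therefore $\pi_{S_k}$ is an increasing function of $x_k[l]$. To maximize $\pi_{S_k}$, the SU $S_k$ will search for the sub-band $l^*_{S_k}$ that maximizes the value of $x_k[l]$. For the linear price adjustment, we have $\boldsymbol{\beta}_P = \gamma \mathbf{c}$. We can rewrite the optimal sub-band chosen by $S_k$ as follows:

$$l^*_{S_k} = \arg\max_{l \in \{1, \ldots, M\}} x_k[l] = \arg\max_{l \in \{1, \ldots, M\}} \frac{1}{\gamma} \frac{v_{S_k}[l]}{\mathbf{c}^T \mathbf{h}_{\bullet k}[l]}. \tag{40}$$

Since the factor $1/\gamma > 0$ is common to all sub-bands, it does not change the maximizer. Let us denote $y_k = \arg\max_{l \in \{1, \ldots, M\}} \dfrac{v_{S_k}[l]}{\mathbf{c}^T \mathbf{h}_{\bullet k}[l]}$, so the solution in (40) becomes $l^*_{S_k} = y_k$. It is observed that the value of $y_k$ is fixed for each SU $S_k$. In other words, $\gamma$ does not affect the ordering of the payoffs of the SU $S_k$ across the sub-bands. We hence can claim that decreasing the value of $\gamma$ cannot change the sub-band allocation scheme of all SUs if the set of active SUs and the fairness criterion are fixed. This concludes the proof.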
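The invariance argument can be illustrated with a few lines of code. In this sketch, x_k[l] = v_Sk[l] / (γ c^T h_•k[l]) is evaluated for one SU over a handful of sub-bands and several values of γ. The data layout (h[l] holds the gains |h_jk[l]|^2 towards each PU j on sub-band l), the function names and the numbers are assumptions made for illustration only.

```python
def subband_scores(v, h, c, gamma):
    """x_k[l] = v[l] / (gamma * c^T h[l]) for every sub-band l of one SU."""
    return [
        v_l / (gamma * sum(c_j * h_jl for c_j, h_jl in zip(c, h_l)))
        for v_l, h_l in zip(v, h)
    ]


def best_subband(v, h, c, gamma):
    """Index of the sub-band with the largest x_k[l]."""
    scores = subband_scores(v, h, c, gamma)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    v = [2.0, 3.0, 1.5]                       # hypothetical v_Sk[l], three sub-bands
    h = [[0.4, 0.1], [0.9, 0.8], [0.2, 0.1]]  # hypothetical gains towards two PUs
    c = [0.6, 0.8]                            # hypothetical pricing direction
    for gamma in (4.0, 1.0, 0.25, 0.01):
        print(gamma, best_subband(v, h, c, gamma))
```

Scaling γ rescales every score by the same positive factor, so the printed index is the same for every γ, which is the content of Proposition 5 for a fixed set of active SUs.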


APPENDIX G
PROOF OF PROPOSITION 6

Let us consider the upper bound of $\beta_{P_j}$. Suppose that for some $j$, $\beta_{P_j} > \beta \geq \dfrac{v_{S_k}}{|h_{jk}|^2}$ for all $k \in \{1, 2, \ldots, K\}$. We then have $\sum_{j=1}^{J} \beta_{P_j} |h_{jk}|^2 > v_{S_k}$, which implies that $u_{S_k}(\boldsymbol{\beta}_P) > v_{S_k}$. From Proposition 3, we have $w^*_{S_k}(\boldsymbol{\beta}_P) = 0$ for all $k$ and hence no SUs are active in the licensed spectrum. Therefore, we have $0 \leq \beta_{P_j} \leq \beta$ for all $j$.
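A small numerical sketch of this bound, under the assumption (consistent with the argument above) that u_Sk(β_P) is the sum over j of β_Pj |h_jk|^2 and that an SU is active exactly when u_Sk < v_Sk. The layout h[j][k] = |h_jk|^2, the function names and the numbers are hypothetical.

```python
def price_upper_bound(h_j, v):
    """Smallest beta with beta >= v_Sk / |h_jk|^2 for every SU k."""
    return max(v_k / h_jk for h_jk, v_k in zip(h_j, v))


def active_sus(beta, h, v):
    """Indices k with u_Sk = sum_j beta[j] * h[j][k] strictly below v[k]."""
    active = []
    for k, v_k in enumerate(v):
        u_k = sum(beta_j * h_j[k] for beta_j, h_j in zip(beta, h))
        if u_k < v_k:
            active.append(k)
    return active


if __name__ == "__main__":
    h = [[0.4, 0.9, 0.2], [0.1, 0.8, 0.1]]  # hypothetical |h_jk|^2, two PUs, three SUs
    v = [2.0, 3.0, 1.5]                     # hypothetical v_Sk
    bound = price_upper_bound(h[0], v)
    print(bound, active_sus([1.01 * bound, 0.0], h, v), active_sus([1.0, 0.0], h, v))
```

Pricing PU 1 just above the bound silences every SU even when the other PU charges nothing, whereas a moderate price leaves all three SUs active.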

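APPENDIX H
PROOF OF THEOREM 2

The claim E1) in Theorem 2 comes from the fact that $w^*_{S_k}(\boldsymbol{\beta}^*_P)$ is the optimal transmit power for $S_k$ with the given $l^*_{S_k}$ for all $k \in \{1, 2, \ldots, K\}$. The claim E2) is a direct result of Proposition 2. Claim E3) is proved in Proposition 4. Let us consider the claim E4) as follows. By substituting $\boldsymbol{\beta}_P = \gamma \mathbf{c}$ into (4), we can show that

$$\pi_{P_j}\left(\mathbf{w}^*_S(\boldsymbol{\beta}_P), \beta_{P_j}\right) = \beta_{P_j} \sum_{k=1}^{K} |h_{jk}|^2 \left(\frac{1}{\boldsymbol{\beta}_P^T \mathbf{h}_{\bullet k}} - \frac{1}{v_{S_k}}\right)^+ = c_j \sum_{k=1}^{K} |h_{jk}|^2 \left(\frac{1}{\mathbf{c}^T \mathbf{h}_{\bullet k}} - \frac{\gamma}{v_{S_k}}\right)^+ \tag{41}$$

is increasing as $\gamma$ decreases. Following the same line as Proposition 4, we can claim that $(\gamma^*, \mathbf{w}^*_S)$ is the SE. The proof is now complete.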
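The two forms in (41) and the monotonicity in γ can be checked with a short sketch. It evaluates the payoff of one PU both with β_P = γc plugged into the first expression and directly from the second expression, for a decreasing sequence of γ. The layout h[j][k] = |h_jk|^2, the function names and the numbers are illustrative assumptions, and the interference constraint that limits how far γ may be lowered is deliberately left out.

```python
def pu_payoff_beta(beta, j, h, v):
    """First form in (41): beta_Pj * sum_k |h_jk|^2 * (1/(beta^T h_k) - 1/v_k)^+."""
    total = 0.0
    for k, v_k in enumerate(v):
        u_k = sum(beta[i] * h[i][k] for i in range(len(beta)))
        total += h[j][k] * max(0.0, 1.0 / u_k - 1.0 / v_k)
    return beta[j] * total


def pu_payoff_gamma(gamma, c, j, h, v):
    """Second form in (41): c_j * sum_k |h_jk|^2 * (1/(c^T h_k) - gamma/v_k)^+."""
    total = 0.0
    for k, v_k in enumerate(v):
        ch_k = sum(c[i] * h[i][k] for i in range(len(c)))
        total += h[j][k] * max(0.0, 1.0 / ch_k - gamma / v_k)
    return c[j] * total


if __name__ == "__main__":
    h = [[0.4, 0.9, 0.2], [0.1, 0.8, 0.1]]  # hypothetical |h_jk|^2
    v = [2.0, 3.0, 1.5]                     # hypothetical v_Sk
    c = [0.6, 0.8]                          # hypothetical pricing direction
    for gamma in (8.0, 4.0, 2.0, 1.0, 0.5):
        beta = [gamma * c_i for c_i in c]
        print(gamma, pu_payoff_beta(beta, 0, h, v), pu_payoff_gamma(gamma, c, 0, h, v))
```

The two columns coincide, and the payoff grows as γ is lowered once at least one SU becomes active.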

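APPENDIX I
PROOF OF PROPOSITION 7

Let us first consider the lower bound of the number of iterations. From Proposition 3, for an SU $S_k$ to transmit in the sub-band $l$ with positive power, we require that

$$(1 - \epsilon)^n h\, \mathbf{c}^T \mathbf{1} \leq (1 - \epsilon)^n \mathbf{c}^T \mathbf{h}_{\bullet k}[l] < v_{S_k}[l] \leq v. \tag{42}$$

Therefore,

$$n \log(1 - \epsilon) \leq \log(v) - \log\left(h\, \mathbf{c}^T \mathbf{1}\right) \quad \text{and} \tag{43}$$

$$n \geq \frac{\log\left(h\, \mathbf{c}^T \mathbf{1}\right) - \log(v)}{\log\left((1 - \epsilon)^{-1}\right)}. \tag{44}$$

Let us consider the upper bound of the number of iterations. We can rewrite the power constraint in (2) as follows:

$$\begin{aligned}
&\frac{1}{\gamma} \sum_{k=1}^{K} \frac{|h_{jk}|^2 \alpha_{S_k}}{\mathbf{c}^T \mathbf{h}_{\bullet k}} \leq q_{P_j} + \sum_{k=1}^{K} \frac{|h_{jk}|^2 \alpha_{S_k}}{v_{S_k}} \\
&\Rightarrow \frac{1}{\gamma h} \sum_{k=1}^{K} \frac{|h_{jk}|^2 \alpha_{S_k}}{\mathbf{c}^T \mathbf{1}} \leq q + \sum_{k=1}^{K} \frac{|h_{jk}|^2 \alpha_{S_k}}{v} \\
&\Rightarrow \gamma \geq \frac{1}{h\, \mathbf{c}^T \mathbf{1}} \left(\frac{q}{\sum_{k=1}^{K} |h_{jk}|^2 \alpha_{S_k}} + \frac{1}{v}\right)^{-1} \geq \frac{1}{h\, \mathbf{c}^T \mathbf{1}} \left(\frac{q}{K h \alpha} + \frac{1}{v}\right)^{-1}.
\end{aligned} \tag{45}$$

We hence have $\gamma = (1 - \epsilon)^n \geq \dfrac{1}{h\, \mathbf{c}^T \mathbf{1}} \left(\dfrac{q}{K h \alpha} + \dfrac{1}{v}\right)^{-1}$, which leads to the following result,

$$n \leq \frac{\log\left(h\, \mathbf{c}^T \mathbf{1}\right) + \log\left(\dfrac{q}{K h \alpha} + \dfrac{1}{v}\right)}{\log\left((1 - \epsilon)^{-1}\right)}. \tag{46}$$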
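The lower bound in (44) can be compared against a direct search over the iteration index. The sketch below uses only (42) to (44): it takes the scalars h and v as given bounds, sets γ = (1 − ε)^n, and looks for the first n at which (1 − ε)^n h c^T 1 drops below v. The function names and the numbers are hypothetical, and the upper bound (46) is not exercised here.

```python
import math


def lower_bound_iterations(eps, c, h, v):
    """Right-hand side of (44): (log(h c^T 1) - log v) / log(1 / (1 - eps))."""
    return (math.log(h * sum(c)) - math.log(v)) / math.log(1.0 / (1.0 - eps))


def first_feasible_iteration(eps, c, h, v, max_iter=10_000):
    """Smallest n with (1 - eps)^n * h * c^T 1 < v, found by direct search."""
    gamma = 1.0
    for n in range(max_iter + 1):
        if gamma * h * sum(c) < v:
            return n
        gamma *= 1.0 - eps
    raise ValueError("condition (42) not met within max_iter iterations")


if __name__ == "__main__":
    eps, c, h, v = 0.1, [0.6, 0.8], 40.0, 4.0  # hypothetical step size and bounds
    print(lower_bound_iterations(eps, c, h, v))
    print(first_feasible_iteration(eps, c, h, v))
```

The first feasible iteration found by the search is never smaller than the value returned by `lower_bound_iterations`, as (44) requires.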

ACKNOWLEDGMENT

The authors would like to thank Professor Wee Peng Tay for his helpful comments.

REFERENCES

[1] E. Hossain, D. Niyato, and Z. Han, Dynamic spectrum access and management in cognitive radio networks. Cambridge University Press, 2009.

[2] Q. Zhao and B. Sadler, “A survey of dynamic spectrum access,” IEEE Signal Processing Magazine, vol. 24, no. 3, pp. 79–89, 2007.

[3] M. Gastpar, “On capacity under receive and spatial spectrum-sharing constraints,” IEEE Trans. Inf. Theory, vol. 53, no. 2, pp. 471–487, 2007.

[4] G. Amir and S. S. Elvino, “Fundamental limits of spectrum-sharing in fading environments,” IEEE Trans. Wireless Commun., vol. 6, no. 2, pp. 649–658, 2007.

[5] Y. Xiao, G. Bi, and D. Niyato, “Game theoretic analysis for spectrum sharing with multi-hop relaying,” IEEE Trans. Wireless Commun., vol. 10, pp. 1527–1537, May 2011.

[6] Y. Xiao, G. Bi, and D. Niyato, “A simple distributed power control algorithm for cognitive radio networks,” IEEE Trans. Wireless Commun., vol. 10, pp. 3594–3600, November 2011.

[7] D. Niyato and E. Hossain, “Competitive spectrum sharing in cognitive radio networks: a dynamic game approach,” IEEE Trans. Wireless Commun., vol. 7, no. 7, pp. 2651–2660, 2008.

[8] S. Haykin, “Cognitive radio: brain-empowered wireless communications,” IEEE J. Select. Areas Commun., vol. 23, pp. 201–220, 2005.

[9] Y. T. Hou, S. Yi, and H. D. Sherali, “Spectrum sharing for multi-hop networking with cognitive radios,” IEEE J. Select. Areas Commun., vol. 26, no. 1, pp. 146–155, 2008.

[10] H. Xu and B. Li, “Efficient resource allocation with flexible channel cooperation in OFDMA cognitive radio networks,” in Proc. IEEE INFOCOM, San Diego, CA, March 2010.

[11] Z. Han, Z. Ji, and K. Liu, “Fair multiuser channel allocation for OFDMA networks using Nash bargaining solutions and coalitions,” IEEE Trans. Commun., vol. 53, no. 8, pp. 1366–1376, 2005.

[12] R. La and V. Anantharam, “A game-theoretic look at the gaussian multiaccess channel,” in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 66, pp. 87–106, 2004.

[13] S. Mathur, L. Sankar, and N. Mandayam, “Coalitions in cooperative wireless networks,” IEEE J. Select. Areas Commun., vol. 26, no. 7, pp. 1104–1115, 2008.

[14] W. Saad, Z. Han, M. Debbah, and A. Hjorungnes, “A distributed coalition formation framework for fair user cooperation in wireless networks,” IEEE Trans. Wireless Commun., vol. 8, no. 9, pp. 4580–4593, 2009.

[15] Y. Xiao, G. Bi, and D. Niyato, “Distributed optimization for cognitive radio networks using Stackelberg game,” in IEEE International Conference on Communication Systems (ICCS), Singapore, 17-20 Nov. 2010.

[16] M. Bloem, T. Alpcan, and T. Basar, “A Stackelberg game for power control and channel allocation in cognitive radio networks,” in Proceedings of the 2nd international conference on performance evaluation methodologies and tools, 2007.

[17] M. Razaviyayn, M. Yao, and L. Zhi-Quan, “A Stackelberg game approach to distributed spectrum management,” in IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Dallas, TX, 2010.

[18] A. Daoud, T. Alpcan, S. Agarwal, and M. Alanyali, “A Stackelberg game for pricing uplink power in wide-band cognitive radio networks,” in 47th IEEE Conference on Decision and Control (CDC), Cancun, Mexico, 2008.

[19] Y. Su and M. Van Der Schaar, “A new perspective on multi-user power control games in interference channels,” IEEE Trans. Wireless Commun., vol. 8, no. 6, pp. 2910–2919, 2009.

[20] B. Wang, Z. Han, and K. J. R. Liu, “Distributed relay selection and power control for multiuser cooperative communication networks using Stackelberg game,” IEEE Trans. Mobile Computing, vol. 8, no. 7, pp. 975–990, 2009.

[21] Y. Wu, T. Zhang, and D. Tsang, “Joint pricing and power allocation for dynamic spectrum access networks with Stackelberg game model,” IEEE Trans. Wireless Commun., vol. 10, no. 1, pp. 12–19, 2011.

[22] Y. Xiao and L. A. Dasilva, “Dynamic pricing coalitional game for cognitive radio networks,” in IFIP Networking 2012 Workshop, Prague, Czech Republic, 25 May, 2012.

[23] G. Fagiolo, “Endogenous neighborhood formation in a local coordination model with negative network externalities,” Journal of Economic Dynamics and Control, vol. 29, no. 1, pp. 297–319, 2005.

[24] C. Wang, Y. Chen, and K. J. R. Liu, “Chinese restaurant game-part i: Theory of learning with negative network externality,” Arxiv preprint arXiv:1112.2188, 2011.

[25] B. Wang, K. J. R. Liu, and T. C. Clancy, “Evolutionary cooperative spectrum sensing game: how to collaborate?,” IEEE Trans. Commun., vol. 58, no. 3, pp. 890–900, 2010.

[26] H. Li, “Multi-agent q-learning for aloha-like spectrum access in cognitive radio systems,” EURASIP Journal on Wireless Communications and Networking, vol. 2010, 2010.

[27] F. Kelly, A. Maulloo, and D. Tan, “Rate control for communication networks: shadow prices, proportional fairness and stability,” Journal of the Operational Research Society, pp. 237–252, 1998.

[28] F. Kelly, “Charging and rate control for elastic traffic,” European Transactions on Telecommunications, vol. 8, no. 1, pp. 33–37, 1997.

[29] R. Myerson, Game theory: analysis of conflict. Harvard University Press, 1997.

[30] K. Lin, S. Gollakota, and D. Katabi, “Random access heterogeneous mimo networks,” in ACM SIGCOMM’11, 15-19 August, 2011, Toronto, Ontario, Canada.

[31] S. Gollakota, S. Perli, and D. Katabi, “Interference alignment and cancellation,” in ACM SIGCOMM’09, 17-21 August, 2009, Barcelona, Spain.

[32] M. Osborne, An introduction to game theory. Oxford University Press, New York, NY, 2004.

[33] T. Basar and G. Olsder, Dynamic noncooperative game theory (Series in Classics in Applied Mathematics). Philadelphia, PA: SIAM, 1999.

[34] E. Telatar, “Capacity of multi-antenna gaussian channels,” European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, 1999.

[35] S. P. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.

[36] X. Zhou and H. Zheng, “Trust: A general framework for truthful double spectrum auctions,” in Proc. of IEEE INFOCOM, Rio de Janeiro, Brazil, April, 2009.

[37] A. Gershman, N. Sidiropoulos, S. Shahbazpanahi, M. Bengtsson, and B. Ottersten, “Convex optimization-based beamforming,” IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 62–75, 2010.

[38] L. Choi and R. Murch, “A transmit preprocessing technique for multiuser mimo systems using a decomposition approach,” IEEE Trans. Wireless Commun., vol. 3, no. 1, pp. 20–24, 2004.

[39] D. Tse and P. Viswanath, Fundamentals of wireless communication. Cambridge University Press, 2005.

[40] K. Gomadam and S. Jafar, “Optimal distributed beamforming in relay networks with common interference,” in IEEE GLOBECOM, pp. 3868–3872, 2007.

[41] D. Fudenberg and J. Tirole, Game Theory. The MIT Press, Cambridge, MA, 1991.

Yong Xiao received his B.S. degree in electrical engineering from China University of Geosciences, Wuhan, China in 2002, M.Sc. degree in telecommunication from Hong Kong University of Science and Technology in 2006, and his Ph.D. degree in electrical and electronic engineering from Nanyang Technological University, Singapore in 2012. From August 2010 to April 2011, he was a research associate in the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. Currently, he is a research fellow at CTVR, School of Computer Science and Statistics, Trinity College Dublin, Ireland. His research interests include information theory for cooperative communication systems, machine learning and application of game theory in multi-user wireless networks.

Guoan Bi (SM'89) received the B.Sc. degree in radio communications from Dalian University of Technology, P. R. China, in 1982, and the M.Sc. degree in telecommunication systems and the Ph.D. degree in electronics systems from Essex University, U.K., in 1985 and 1988, respectively. Between 1988 and 1990, he was a research fellow at the University of Surrey, U.K. Since 1991, he has been with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His current research interests include DSP algorithms and hardware structures, and signal processing for various applications including sonar, radar, and communications.

Dusit Niyato (M'09) is currently an Assistant Professor in the Division of Computer Communications, School of Computer Engineering, Nanyang Technological University, Singapore. He received the B.E. degree from King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand, in 1999 and the Ph.D. degree in electrical and computer engineering from the University of Manitoba, Winnipeg, MB, Canada, in 2008. His current research interests include design, analysis, and optimization of wireless communication, smart grid systems, green radio communications, and mobile cloud computing.

Luiz A. DaSilva (SM) currently holds the Stokes Professorship in Telecommunications in the Department of Electronic and Electrical Engineering at Trinity College Dublin. He has also been a faculty member in the Bradley Department of Electrical and Computer Engineering at Virginia Tech since 1998. His research focuses on distributed and adaptive resource management in wireless networks, and in particular cognitive radio networks and the application of game theory to wireless networks. He is currently a principal investigator on research projects funded by the National Science Foundation in the United States, the Science Foundation Ireland, and the European Commission under Framework Programme 7. He is a co-principal investigator of CTVR, the Telecommunications Research Centre in Ireland. He has co-authored two books on wireless communications and over 110 peer-reviewed papers in leading journals and conferences on communications and networks. In 2006 he was named a College of Engineering Faculty Fellow at Virginia Tech.