Developing and securing the cloud - GBV
Developing and Securing the Cloud
Bhavani Thuraisingham
CRC Press
Taylor & Francis Group
Boca Raton London New York
CRC Press is an imprint of the
Taylor & Francis Group, an Informa business
AN AUERBACH BOOK
Contents
Preface xxvii
Acknowledgments xxxiii
Author xxxv
1 Introduction 1
1.1 About This Book 1
1.2 Supporting Technologies 3
1.2.1 From Mainframe to the Cloud 3
1.2.2 Security Technologies 3
1.2.3 Data, Information, and Knowledge Management 5
1.3 Secure Services Technologies 5
1.3.1 Secure Services Technologies 5
1.3.2 Secure Semantic Services 7
1.3.3 Specialized Secure Services 7
1.4 Cloud Computing Concepts 8
1.5 Experimental Cloud Computing Systems 9
1.6 Secure Cloud Computing 10
1.7 Experimental Secure Cloud Computing Systems 11
1.8 Experimental Cloud Computing for Security Applications 12
1.9 Toward Trustworthy Clouds 12
1.10 Building an Infrastructure, Education Program, and a Research
Program for a Secure Cloud 13
1.11 Organization of This Book 14
1.12 Next Steps 16
Reference 19
PART I SUPPORTING TECHNOLOGIES
2 From Mainframe to the Cloud 23
2.1 Overview 23
2.2 Early Computing Systems 23
2.3 Distributed Computing 25
2.4 World Wide Web 25
2.5 Cloud Computing 26
2.6 Summary and Directions 26
References 27
3 Trustworthy Systems 29
3.1 Overview 29
3.2 Secure Systems 30
3.2.1 Overview 30
3.2.2 Access Control and Other Security Concepts 30
3.2.3 Types of Secure Systems 32
3.2.4 Secure Operating Systems 32
3.2.5 Secure Database Systems 33
3.2.6 Secure Networks 35
3.2.7 Emerging Trends 35
3.2.8 Impact of the Web 37
3.2.9 Steps to Building Secure Systems 37
3.3 Dependable Systems 38
3.3.1 Overview 38
3.3.2 Trust Management 40
3.3.3 Digital Rights Management 40
3.3.4 Privacy 41
3.3.5 Integrity, Data Quality, and High Assurance 41
3.4 Security Threats and Solutions 42
3.5 Building Secure Systems from Untrusted Components 45
3.6 Summary and Directions 46
References 47
4 Data, Information, and Knowledge Management 49
4.1 Overview 49
4.2 Data Management 50
4.2.1 Data Management 50
4.2.1.1 Data Model 50
4.2.1.2 Functions 50
4.2.1.3 Data Distribution 51
4.2.1.4 Web Data Management 51
4.2.2 Complex Data Management 53
4.2.2.1 Multimedia Data Systems 53
4.2.2.2 Geospatial Data Management 54
4.3 Information Management 55
4.3.1 Data Warehousing and Data Mining 55
4.3.2 Information Retrieval 56
4.3.3 Search Engines 57
4.4 Knowledge Management 59
4.5 Activity Management 60
4.5.1 E-Business and E-Commerce 60
4.5.2 Collaboration and Workflow 62
4.5.3 Information Integration 63
4.5.4 Information Sharing 64
4.5.5 Social Networking 65
4.5.6 Supply Chain Management 66
4.6 Summary and Directions 67
References 67
Conclusion to Part I 69
PART II SECURE SERVICES TECHNOLOGIES
5 Service-Oriented Computing and Security 73
5.1 Overview 73
5.2 Service-Oriented Computing 75
5.2.1 Services Paradigm 75
5.2.1.1 SOAs and Web Services 76
5.2.1.2 SOA and Design 76
5.2.2 SOA and Web Services 77
5.2.2.1 WS Model 79
5.2.2.2 Composition of WS 80
5.2.2.3 WS Protocols 81
5.2.2.4 REST 83
5.2.3 Service-Oriented Analysis and Design 83
5.2.3.1 IBM Service-Oriented Analysis and Design 86
5.2.3.2 Service-Oriented Modeling Framework 87
5.2.3.3 UML for Services 87
5.3 Secure Service-Oriented Computing 87
5.3.1 Secure Services Paradigm 87
5.3.2 Secure SOA and WS 90
5.3.2.1 WS-Security 91
5.3.2.2 WS-*Security 93
5.3.3 Secure SOAD 96
5.3.3.1 Secure SOMA 99
5.3.3.2 Secure SOMF 99
5.3.3.3 Secure UML for Services 100
5.3.4 Access Control for WS 100
5.3.4.1 Security Assertions Markup Language 100
5.3.4.2 eXtensible Access Control Markup Language 101
5.3.5 Digital Identity Management 103
5.3.5.1 OpenID 105
5.3.5.2 Shibboleth 106
5.3.5.3 Liberty Alliance 107
5.3.6 Security Models for WS 108
5.3.6.1 Delegation Model 109
5.3.6.2 Information Flow Model 110
5.3.6.3 Multilevel Secure WS 112
5.4 Summary and Directions 112
References 113
6 Semantic Web Services and Security 117
6.1 Overview 117
6.2 Semantic Web 119
6.2.1 Layered Technology Stack 119
6.2.2 eXtensible Markup Language 120
6.2.2.1 XML Statement and Elements 120
6.2.2.2 XML Attributes 120
6.2.2.3 XML DTDs 121
6.2.2.4 XML Schemas 121
6.2.2.5 XML Namespaces 121
6.2.2.6 XML Federations/Distribution 122
6.2.2.7 XML-QL, XQuery, XPath, XSLT 122
6.2.3 Resource Description Framework 122
6.2.3.1 RDF Basics 123
6.2.3.2 RDF Container Model 123
6.2.3.3 RDF Specification 124
6.2.3.4 RDF Schemas 124
6.2.3.5 RDF Axiomatic Semantics 124
6.2.3.6 RDF Inferencing 125
6.2.3.7 RDF Query 125
6.2.3.8 SPARQL Protocol and RDF Query Language 125
6.2.4 Ontologies 125
6.2.5 Web Rules and SWRL 127
6.2.5.1 Web Rules 127
6.2.5.2 Semantic Web Rules Language 128
6.2.6 Semantic Web Services 129
6.3 Secure Semantic Web Services 130
6.3.1 Security for the Semantic Web 130
6.3.2 XML Security 132
6.3.3 RDF Security 132
6.3.4 Security and Ontologies 133
6.3.5 Secure Query and Rules Processing 134
6.3.6 Privacy and Trust for the Semantic Web 134
6.3.7 Secure Semantic Web and WS 137
6.4 Summary and Directions 138
References 139
7 Specialized Web Services and Security 141
7.1 Overview 141
7.2 Specialized Web Services 142
7.2.1 Overview 142
7.2.2 Web Services for Data Management 142
7.2.3 Web Services for Complex Data Management 143
7.2.4 Web Services for Information Management 145
7.2.5 Web Services for Knowledge Management 146
7.2.6 Web Services for Activity Management 147
7.2.6.1 E-Business 147
7.2.6.2 Collaboration and Workflow 148
7.2.6.3 Information Integration 149
7.2.6.4 Other Activities 149
7.2.7 Domain Web Services 150
7.2.7.1 Defense 150
7.2.7.2 Healthcare and Life Sciences 151
7.2.7.3 Finance 151
7.2.7.4 Telecommunication 152
7.2.8 Emerging Web Services 153
7.2.8.1 X as a Service 153
7.2.8.2 Data as a Service 153
7.2.8.3 Software as a Service 154
7.2.8.4 Other X as a Service 155
7.3 Secure Specialized Web Services 156
7.3.1 Overview 156
7.3.2 Web Services for Secure Data Management 157
7.3.3 Web Services for Secure Complex Data Management 157
7.3.3.1 Secure Geospatial Data Management 157
7.3.3.2 Secure Multimedia Data Management 161
7.3.4 Web Services for Secure Information Management 162
7.3.5 Web Services for Secure Knowledge Management 163
7.3.6 Secure Web Services for Activity Management 163
7.3.6.1 Secure E-Commerce 163
7.3.6.2 Secure Supply Chain Management 165
7.3.6.3 Secure Workflow and Collaboration 165
7.3.7 Secure Domain Web Services 170
7.3.7.1 Defense 170
7.3.7.2 Healthcare and Life Sciences 170
7.3.7.3 Finance 171
7.3.7.4 Other Domains 171
7.3.8 Emerging Secure Web Services 171
7.3.8.1 Security for X as a Service 171
7.3.8.2 Security for Amazon Web Services 172
7.3.8.3 Secure Web Services for Cloud and Grid 173
7.4 Summary and Directions 173
References 174
Conclusion to Part II 177
PART III CLOUD COMPUTING CONCEPTS
8 Cloud Computing Concepts 181
8.1 Overview 181
8.2 Preliminaries in Cloud Computing 182
8.2.1 Cloud Deployment Models 182
8.2.2 Service Models 183
8.3 Virtualization 184
8.4 Cloud Storage and Data Management 185
8.5 Summary and Directions 187
References 187
9 Cloud Computing Functions 189
9.1 Overview 189
9.2 Cloud Computing Framework 190
9.3 Cloud OSs and Hypervisors 191
9.4 Cloud Networks 192
9.5 Cloud Data and Storage Management 193
9.6 Cloud Applications 195
9.7 Cloud Policy Management, Back-Up, and Recovery 195
9.8 Summary and Directions 196
References 196
10 Cloud Data Management 199
10.1 Overview 199
10.2 Relational Data Model 200
10.3 Architectural Issues 201
10.4 DBMS Functions 204
10.4.1 Overview 204
10.4.2 Query Processing 205
10.4.3 Transaction Management 207
10.4.4 Storage Management 208
10.4.5 Metadata Management 210
10.4.6 Database Integrity 211
10.4.7 Fault Tolerance 212
10.5 Data Mining 212
10.6 Other Aspects 214
10.7 Summary and Directions 215
References 216
11 Specialized Clouds, Services, and Applications 217
11.1 Overview 217
11.2 Specialized Clouds 218
11.2.1 Mobile Clouds 218
11.2.2 Multimedia Clouds 219
11.3 Cloud Applications 220
11.4 Summary and Directions 221
References 222
12 Cloud Service Providers, Products, and Frameworks 223
12.1 Overview 223
12.2 Cloud Service Providers, Products, and Frameworks 224
12.2.1 Cloud Service Providers 224
12.2.1.1 Windows Azure 226
12.2.1.2 Google App Engine 227
12.2.2 Cloud Products 228
12.2.2.1 Oracle Enterprise Manager 228
12.2.2.2 IBM Smart Cloud 229
12.2.2.3 Hypervisor Products 230
12.2.3 Cloud Frameworks 230
12.2.3.1 Hadoop, MapReduce Framework 230
12.2.3.2 Storm 232
12.2.3.3 HIVE 232
12.3 Summary and Directions 233
References 234
Conclusion to Part III 235
PART IV EXPERIMENTAL CLOUD COMPUTING SYSTEMS
13 Experimental Cloud Query Processing System 239
13.1 Overview 239
13.2 Our Approach 241
13.3 Related Work 242
13.4 Architecture 245
13.4.1 Data Generation and Storage 246
13.4.2 File Organization 247
13.4.3 Predicate Split 247
13.4.4 Split Using Explicit-Type Information of Object 247
13.4.5 Split Using Implicit-Type Information of Object 247
13.5 MapReduce Framework 248
13.5.1 Overview 248
13.5.2 Input Files Selection 248
13.5.3 Cost Estimation for Query Processing 250
13.5.3.1 Ideal Model 251
13.5.3.2 Heuristic Model 252
13.5.4 Query Plan Generation 255
13.5.4.1 Computational Complexity of Bestplan 257
13.5.4.2 Relaxed Bestplan Problem and Approximate Solution 257
13.5.5 Breaking Ties by Summary Statistics 259
13.5.6 MapReduce Join Execution 260
13.6 Results 261
13.6.1 Data Sets, Frameworks, and Experimental Setup 262
13.6.1.1 Data Sets 262
13.6.1.2 Baseline Frameworks 262
13.6.1.3 Experimental Setup 262
13.6.2 Evaluation 262
13.7 Summary and Directions 265
References 265
14 Social Networking on the Cloud 269
14.1 Overview 269
14.2 Foundational Technologies for SNODSOC and
SNODSOC++ 271
14.2.1 SNOD 271
14.2.2 Location Extraction 271
14.2.3 Entity/Concept Extraction and Integration 272
14.2.3.1 Linguistic Extensions 273
14.2.3.2 Extralinguistic Extensions 273
14.2.3.3 Entity Integration 273
14.2.4 Ontology Construction 273
14.2.5 Cloud Query Processing 274
14.2.5.1 Preprocessing 274
14.2.5.2 Query Execution and Optimization 275
14.3 Design of SNODSOC 275
14.3.1 Overview of the Modules 275
14.3.2 SNODSOC and Trend Analysis 276
14.3.2.1 Novel Class Detection 277
14.3.2.2 Storing the Cluster Summary Information 278
14.3.3 Content-Driven Location Extraction 280
14.3.3.1 Motivation 281
14.3.3.2 Challenges: Proposed Approach 282
14.3.3.3 Using Gazetteer and Natural Language Processing 285
14.3.4 Categorization 287
14.3.5 Ontology Construction 289
14.4 Toward SNODSOC++ 290
14.4.1 Benefits of SNOD++ 291
14.5 Cloud-Based Social Network Analysis 291
14.5.1 Stream Processing 292
14.5.2 Twitter Storm for SNODSOC 293
14.6 Related Work 293
14.7 Summary and Directions 294
References 295
15 Experimental Semantic Web-Based Cloud Computing Systems 297
15.1 Overview 297
15.2 Jena-HBase: A Distributed, Scalable, and Efficient RDF
Triple Store 298
15.3 StormRider: Harnessing "Storm" for Social Networks 300
15.4 Ontology-Driven Query Expansion Using Map/Reduce Framework 303
15.4.1 BET Calculation Using MapReduce Distributed
Computing 304
15.4.1.1 Shortest Path Calculation Using Iterative
MapReduce Algorithm 305
15.4.1.2 Betweenness and Centrality Measures
Using Map/Reduce Computation 307
15.4.1.3 SSMs Using Map/Reduce Algorithm 307
15.5 Summary and Directions 307
References 308
Conclusion to Part IV 311
PART V SECURE CLOUD COMPUTING CONCEPTS
16 Secure Cloud Computing Concepts 315
16.1 Overview 315
16.2 Secure Cloud Computing and Governance 316
16.3 Security Architecture 318
16.4 Identity Management and Access Control 320
16.4.1 Cloud Identity Administration 320
16.5 Cloud Storage and Data Security 322
16.6 Privacy, Compliance, and Forensics for the Cloud 323
16.6.1 Privacy 323
16.6.2 Regulations and Compliance 324
16.6.3 Cloud Forensics 324
16.7 Cryptographic Solutions 324
16.8 Network Security 326
16.9 Business Continuity Planning 326
16.10 Operations Management 327
16.11 Physical Security 328
16.12 Summary and Directions 328
References 329
17 Secure Cloud Computing Functions 331
17.1 Overview 331
17.2 Secure Cloud Computing Framework 332
17.3 Secure Cloud OSs and Hypervisors 333
17.4 Secure Cloud Networks 335
17.5 Secure Cloud Storage Management 335
17.6 Secure Cloud Data Management 336
17.7 Cloud Security and Integrity Management 336
17.8 Secure Cloud Applications 337
17.9 Summary and Directions 337
References 338
18 Secure Cloud Data Management 339
18.1 Overview 339
18.2 Secure Data Management 340
18.2.1 Access Control 340
18.2.2 Inference Problem 340
18.2.3 Secure Distributed/Heterogeneous Data
Management 342
18.2.4 Secure Object Data Systems 343
18.2.5 Data Warehousing, Data Mining, Security, and Privacy 343
18.2.6 Secure Information Management 345
18.2.7 Secure Knowledge Management 346
18.3 Impact of the Cloud 346
18.3.1 Discretionary Security 346
18.3.2 Inference Problem 347
18.3.3 Secure Distributed and Heterogeneous Data
Management 347
18.3.4 Secure Object Systems 348
18.3.5 Data Warehousing, Data Mining, Security, and Privacy 348
18.3.6 Secure Information Management 349
18.3.7 Secure Knowledge Management 349
18.4 Summary and Directions 349
References 350
19 Secure Cloud Computing Guidelines 351
19.1 Overview 351
19.2 The Guidelines 352
19.3 Summary and Directions 356
References 357
20 Security as a Service 359
20.1 Overview 359
20.2 Data Mining Services for Cyber Security Applications 360
20.2.1 Overview 360
20.2.2 Cyber Terrorism, Insider Threats, and External Attacks 361
20.2.3 Malicious Intrusions 362
20.2.4 Credit Card Fraud and Identity Theft 362
20.2.5 Attacks on Critical Infrastructures 363
20.2.6 Data Mining Services for Cyber Security 363
20.3 Current Research on Security as a Service 365
20.4 Other Services for Cyber Security Applications 366
20.5 Summary and Directions 367
References 367
21 Secure Cloud Computing Products 369
21.1 Overview 369
21.2 Overview of the Products 370
21.3 Summary and Directions 373
References 373
Conclusion to Part V 375
PART VI EXPERIMENTAL SECURE CLOUD COMPUTING
SYSTEMS
22 Secure Cloud Query Processing with Relational Data 379
22.1 Overview 379
22.2 Related Work 381
22.3 System Architecture 382
22.3.1 The Web Application Layer 382
22.3.2 The ZQL Parser Layer 382
22.3.3 The XACML Policy Layer 384
22.3.3.1 XACML Policy Builder 384
22.3.3.2 XACML Policy Evaluator 384
22.3.3.3 The Basic Query Rewriting Layer 384
22.3.3.4 The Hive Layer 385
22.3.3.5 HDFS Layer 385
22.4 Implementation Details and Results 386
22.4.1 Implementation Setup 386
22.4.2 Experimental Datasets 386
22.4.3 Implementation Results 387
22.5 Summary and Directions 388
References 389
23 Secure Cloud Query Processing with Semantic Web Data 391
23.1 Overview 391
23.2 Background 393
23.2.1 Related Work 393
23.2.1.1 Hadoop and MapReduce 394
23.3 Access Control 394
23.3.1 Model 394
23.3.2 AT Assignment 396
23.3.2.1 Final Output of an Agent's ATs 397
23.3.2.2 Security Level Defaults 397
23.3.3 Conflicts 397
23.4 System Architecture 399
23.4.1 Overview of the Architecture 399
23.4.1.1 Data Generation and Storage 400
23.4.1.2 Example Data 400
23.5 Policy Enforcement 401
23.5.1 Query Rewriting 401
23.5.2 Embedded Enforcement 402
23.5.3 Postprocessing Enforcement 403
23.6 Experimental Setup and Results 404
23.7 Summary and Directions 404
References 405
24 Secure Cloud-Based Information Integration 407
24.1 Overview 407
24.2 Integrating Blackbook with Amazon S3 408
24.3 Experiments 414
24.4 Summary and Directions 414
References 415
Conclusion to Part VI 417
PART VII EXPERIMENTAL CLOUD SYSTEMS FOR
SECURITY APPLICATIONS
25 Cloud-Based Malware Detection for Evolving Data Streams 421
25.1 Overview 421
25.2 Malware Detection 422
25.2.1 Malware Detection as a Data Stream
Classification Problem 422
25.2.2 Cloud Computing for Malware Detection 424
25.2.3 Our Contributions 425
25.3 Related Work 425
25.4 Design and Implementation of the System 428
25.4.1 Ensemble Construction and Updating 428
25.4.2 Error Reduction Analysis 429
25.4.3 Empirical Error Reduction and Time Complexity 430
25.4.4 Hadoop/MapReduce Framework 430
25.5 Malicious Code Detection 432
25.5.1 Overview 432
25.5.2 Nondistributed Feature Extraction and Selection 433
25.5.3 Distributed Feature Extraction and Selection 433
25.6 Experiments 435
25.6.1 Data Sets 435
25.6.1.1 Synthetic Dataset 435
25.6.1.2 Botnet Dataset 436
25.6.1.3 Malware Dataset 436
25.6.2 Baseline Methods 437
25.6.2.1 Hadoop Distributed System Setup 438
25.7 Discussion 438
25.8 Summary and Directions 439
References 440
26 Cloud-Based Data Mining for Insider Threat Detection 443
26.1 Overview 443
26.2 Challenges, Related Work, and Our Approach 444
26.3 Data Mining for Insider Threat Detection 445
26.3.1 Our Solution Architecture 445
26.3.2 Feature Extraction and Compact Representation 447
26.3.2.1 Subspace Clustering 448
26.3.3 RDF Repository Architecture 449
26.3.4 Data Storage 450
26.3.4.1 File Organization 451
26.3.4.2 Predicate Split 451
26.3.4.3 Predicate Object Split 451
26.3.5 Answering Queries Using Hadoop MapReduce 451
26.3.6 Data Mining Applications 452
26.4 Comprehensive Framework 453
26.5 Summary and Directions 455
References 455
27 Cloud-Centric Assured Information Sharing 457
27.1 Overview 457
27.2 System Design 460
27.2.1 Design of CAISS 460
27.2.1.1 Enhanced Policy Engine 460
27.2.1.2 Enhanced SPARQL Query Processor 461
27.2.1.3 Integration Framework 462
27.2.2 Design of CAISS++ 463
27.2.2.1 Limitations of CAISS 463
27.2.2.2 Design of CAISS++ 464
27.2.2.3 Centralized CAISS++ 465
27.2.2.4 Decentralized CAISS++ 467
27.2.2.5 Hybrid CAISS++ 469
27.2.2.6 Naming Conventions 473
27.2.2.7 Vertically Partitioned Layout 473
27.2.2.8 Hybrid Layout 474
27.2.2.9 Distributed Processing of SPARQL 475
27.2.2.10 Framework Integration 476
27.2.2.11 Policy Specification and Enforcement 476
27.2.3 Formal Policy Analysis 476
27.2.4 Implementation Approach 478
27.3 Related Work 478
27.3.1 Our Related Research 478
27.3.1.1 Secure Data Storage and Retrieval in the
Cloud 479
27.3.1.2 Secure SPARQL Query Processing on the
Cloud 479
27.3.1.3 RDF Policy Engine 480
27.3.1.4 AIS Prototypes 481
27.3.1.5 Formal Policy Analysis 482
27.3.2 Overall Related Research 482
27.3.2.1 Secure Data Storage and Retrieval in the
Cloud 482
27.3.2.2 SPARQL Query Processor 482
27.3.2.3 RDF-Based Policy Engine 483
27.3.2.4 Hadoop Storage Architecture 483
27.3.2.5 Distributed Reasoning 484
27.3.2.6 Access Control and Policy Ontology-Modeling 484
27.3.3 Commercial Developments 484
27.3.3.1 RDF Processing Engines 484
27.3.3.2 Semantic Web-Based Security Policy Engines 485
27.3.3.3 Cloud 485
27.4 Summary and Directions 485
References 485
28 Design and Implementation of a Semantic Cloud-Based Assured
Information Sharing System 489
28.1 Overview 489
28.2 Architecture 490
28.2.1 Overview 490
28.2.2 Framework Configuration 490
28.2.3 Modules in our Architecture 491
28.2.3.1 User Interface Layer 492
28.2.3.2 Policy Engines 494
28.2.3.3 Data Layer 500
28.2.4 Features of our Policy Engine Framework 500
28.2.4.1 Policy Reciprocity 500
28.2.4.2 Conditional Policies 501
28.2.4.3 Policy Symmetry 501
28.2.4.4 Develop and Scale Policies 501
28.2.4.5 Justification of Resources 502
28.2.4.6 Policy Specification and Enforcement 503
28.3 Summary and Directions 503
References 503
Conclusion to Part VII 505
PART VIII TOWARD A TRUSTWORTHY CLOUD
29 Trust Management and the Cloud 509
29.1 Overview 509
29.2 Trust Management 510
29.2.1 Trust Management and Negotiation 510
29.2.2 Trust and Risk Management 512
29.2.3 Reputation-Based Systems 513
29.3 Trust and Cloud Services 514
29.3.1 Trust Management as a Cloud Service 514
29.3.2 Trust Management for Cloud Services 516
29.4 Summary and Directions 517
References 518
30 Privacy and Cloud Services 519
30.1 Overview 519
30.2 Privacy Management 520
30.2.1 Privacy Issues 520
30.2.2 Privacy Problem through Inference 521
30.2.3 Platform for Privacy Preferences 523
30.2.4 Privacy Preserving Cloud Mining 523
30.3 Privacy Management and the Cloud 524
30.3.1 Cloud Services for Privacy Management 524
30.3.2 Privacy for Cloud Services and Semantic Cloud Services 525
30.4 Summary and Directions 527
References 527
31 Integrity Management, Data Provenance, and Cloud Services 529
31.1 Overview 529
31.2 Integrity, Data Quality, and Provenance 530
31.2.1 Aspects of Integrity 530
31.2.2 Inferencing, Data Quality, and Data Provenance 531
31.3 Integrity Management and Cloud Services 533
31.3.1 Cloud Services for Integrity Management 533
31.3.2 Integrity for the Cloud and Semantic Cloud Services 535
31.4 Summary and Directions 536
References 537
Conclusion to Part VIII 539
PART IX BUILDING AN INFRASTRUCTURE, AN EDUCATION INITIATIVE, AND A RESEARCH
PROGRAM FOR A SECURE CLOUD
32 An Infrastructure for a Secure Cloud 543
32.1 Overview 543
32.2 Description of the Research Infrastructure 545
32.2.1 Background 545
32.2.1.1 The Need for Our Infrastructure 545
32.2.1.2 Hadoop for Cloud Computing 545
32.2.1.3 Inadequacies of Hadoop 546
32.2.2 Infrastructure Development 547
32.2.3 Hardware Component of the Infrastructure 548
32.2.3.1 Cluster Part of the Hardware Component 548
32.2.3.2 Secure Coprocessor Part of the Hardware
Component 550
32.2.4 Software Component of the Infrastructure 551
32.2.4.1 Component Part to Store, Query, and Mine
Semantic Web Data 551
32.2.4.2 Integrating SUN XACML Implementation into HDFS with IRMs 553
32.2.4.3 Component Part for Strong Authentication 555
32.2.5 Data Component of the Infrastructure 555
32.3 Integrating the Cloud with Existing Infrastructures 556
32.4 Sample Projects Utilizing the Cloud Infrastructure 557
32.5 Education and Performance 558
32.5.1 Education Enhancement 558
32.5.2 Performance 558
32.6 Summary and Directions 559
References 559
33 An Education Program for a Secure Cloud 563
33.1 Overview 563
33.2 IA Education at UTD 565
33.2.1 Overview of UTD CS 565
33.2.2 Course Offerings in IA 566
33.2.3 Our Educational Programs in IA 567
33.2.4 Equipment and Facilities for IA Education and
Research 568
33.3 Assured Cloud Computing Education Program 569
33.3.1 Organization of the Capacity-Building Activities 569
33.3.2 Curriculum Development Activities 570
33.3.2.1 Capstone Course 570
33.3.2.2 Component Insertion into Existing Courses 573
33.3.3 Course Programming Projects 575
33.3.3.1 Fine-Grained Access Control for Secure
Storage 575
33.3.3.2 Flexible Authentication 576
33.3.3.3 Secure Virtual Machine Management 576
33.3.3.4 Secure Co-Processor for Cloud 576
33.3.3.5 Scalable Techniques for Malicious Code
Detection 576
33.3.4 Instructional Cloud Computing Facility 577
33.4 Evaluation Plan 578
33.5 Summary and Directions 578
References 579
34 A Research Initiative for a Secure Cloud 581
34.1 Overview 581
34.2 Research Contributions 582
34.2.1 Overview 582
34.2.2 Secure Cloud Data and Information Management 583
34.2.2.1 Data Intensive Secure Query Processing in
the Cloud 583
34.2.2.2 Secure Data Processing in a Hybrid Cloud 583
34.2.2.3 Secure Information Integration in the Cloud 585
34.2.2.4 Secure Social Networking in the Cloud 585
34.2.3 Cloud-Based Security Applications 586
34.2.3.1 Cloud-Based Malware Detection for
Evolving Data Streams 586
34.2.3.2 Cloud-Based Insider Threat Detection 587
34.2.3.3 Cloud-Based Assured Information Sharing 587
34.2.4 Security Models for the Cloud 588
34.2.4.1 A Fine-Grained Model for Information Flow
Control in Service Cloud 588
34.2.4.2 CloudMask: Fine-Grained Attribute-Based
Access Control 589
34.2.4.3 Delegated Access Control in the Storage as a
Service Model 590
34.2.4.4 Attribute-Based Group Key Management Scheme 590
34.2.4.5 Privacy-Preserving Access Control in the
Cloud 591
34.2.5 Toward Building Secure Social Networks in the Cloud 592
34.2.5.1 Secure Social Networking 592
34.2.5.2 Trustworthiness of Data 593
34.2.5.3 Text Mining and Analysis 593
34.3 Summary and Directions 594
References 594
35 Summary and Directions 597
35.1 About This Chapter 597
35.2 Summary of This Book 597
35.3 Directions for Cloud Computing and Secure Cloud Computing 600
35.3.1 Secure Services 600
35.3.2 Cloud Computing 601
35.3.3 Secure Cloud Computing 601
35.4 Our Goals on Securing the Cloud 601
35.5 Where Do We Go from Here? 602
Conclusion to Part IX 605
Appendix A: Data Management Systems—Developments and Trends 607
Appendix B: Data Mining Techniques 623
Appendix C: Access Control in Database Systems 643
Appendix D: Assured Information Sharing Life Cycle 661
Index 667