Fine-grained Scalability of Digital Library Services in...
Fine-grained Scalability of Digital Library Services in the Cloud
Lebeko Poulo, Lighton Phiri and Hussein Suleman
Digital Libraries Laboratory, Department of Computer Science
University of Cape Town
Research Overview
- Digital Libraries (DLs) and Digital Library Systems (DLSes)
- Research objectives
  - Develop techniques for building scalable digital information management systems based on efficient and on-demand use of generic grid-based technologies
  - Explore the use of existing cloud computing resources
- Research questions
  - Can a typical DL architecture be layered over an on-demand paradigm such as cloud computing?
  - Is there linear scalability with increasing data and service capacity needs?
How Quickly Does Data Scale?
- Extent of data scalability
  - Data growth rates estimated at 40% per year
  - By 2020, data volumes will have grown to 44 times the 2009 size
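As a rough cross-check of the two figures on this slide, a 40% annual rate can be compounded over the 11 years from 2009 to 2020. This is a back-of-envelope sketch: treating the growth as a constant compound rate is an assumption, not something the slide states.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Total growth multiple after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def implied_annual_rate(total_factor: float, years: int) -> float:
    """Constant annual rate that would produce total_factor over the given years."""
    return total_factor ** (1.0 / years) - 1.0


years = 2020 - 2009  # 11 compounding periods (assumed)
print(f"{growth_factor(0.40, years):.1f}x")     # prints 40.5x
print(f"{implied_annual_rate(44.0, years):.1%}")  # roughly 41% per year
```

Under that assumption the two estimates are of the same order: 40% per year compounds to about 40 times, and 44 times corresponds to roughly 41% per year.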
Scaling Digital Library Systems
- Key criteria for design/implementation of DLSes
  - Scalability
  - Preservation
- The promise of cloud computing proven many times
- Feasibility of migrating and hosting DLs evident
- Investigation of deep integration of DL services with cloud services required
  - Investigate efficacy of DL cloud adoption
  - Verify extent of unlimited scale
  - Maximise potential for cloud-service-level scalability
Prototype DLS - Design
- RQ #1: Can a typical DL architecture be layered over an on-demand paradigm?
- Prior work on potential architectural designs for utility clouds
  - Emulation of parallel programming architectures
  - Utility computing offers flexibility of multiple architectural models
  - Potential architectures for scalable utility services
- Two architectural patterns adopted as basis for design of prototype architecture
  - Proxy architectures
  - Some aspects of Client-side architecture
Prototype DLS - Architecture
[Architecture diagram. Labels: Browse Module, Search Module, OAI-PMH Harvester Module; Amazon EC2 (Instances A, B, C and D); Amazon S3 (Buckets); Amazon EBS; Amazon SimpleDB (Domains); REST API]
Prototype DLS - Services
[Diagram labels: Web User Interface; Browse Module, Search Module, OAI-PMH Harvester Module]
- Two typical DL services, accessible via publicly available Light-weight process Web interface
  - Browse module: enable access through gradual refinement
  - Search module: enable access through search queries
- OAI-PMH endpoint used to ingest data into collections
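A minimal sketch of how an OAI-PMH harvester of this kind might page through a repository. The `ListRecords` verb and the `resumptionToken` element come from the OAI-PMH 2.0 protocol; the base URL, the `oai_dc` metadata prefix, the function names and the sample response are illustrative assumptions, since the slides do not show the prototype's harvester code.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url: str, token: Optional[str] = None,
                     metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request; a resumption token replaces the other arguments."""
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text: str) -> Tuple[List[str], Optional[str]]:
    """Return (record identifiers, next resumption token or None) for one response."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical response, trimmed to the elements the sketch reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_page(SAMPLE)
next_url = list_records_url("http://example.org/oai", token)
print(ids, token)
print(next_url)
```

A harvester would repeat the request with each returned token until a response carries no token, storing each page of records as it goes.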
Prototype DLS - Application Server
[Diagram labels: Amazon EC2 (Instances A, B, C and D); REST API]
- Amazon Elastic Compute Cloud (EC2) to provide sizeable computing capacity
  - 32-bit Ubuntu Amazon Machine Images (AMIs)
  - Glassfish 3.1
  - Prototype DLS
Prototype DLS - Data Storage
[Diagram labels: Amazon S3 (Buckets); Amazon EBS; Amazon SimpleDB (Domains); REST API]
- Amazon Simple Storage Service (S3) for storage and retrieval of large numbers of data objects
- Amazon SimpleDB for querying stored structured data
- Amazon Elastic Block Store (EBS) to enable storage persistence of EC2 instances
Evaluation - Experimental Design
- RQ #2: Is there linear scalability with increasing capacity needs?
- Goals
  - Evaluate potential scalability advantages associated with cloud-based DLs
- Evaluation aspects
  - Data/service scalability and load testing
- Workload
  - Number of user requests, number of users and collection sizes
- Metrics
  - Response time
- Factors
  - EC2 instances, users, requests, collection size
Evaluation - Experimental Setup
- Test dataset: NDLTD and NETD portals
  - Ingested using OAI-PMH harvester module
- Execution environment
  - All experimental tests conducted on EC2 cloud infrastructure
  - EC2 instance of type t1.micro used for server-side processing
  - 32-bit Ubuntu Amazon Machine Image (AMI) configuration
- Apache JMeter used to simulate user requests
  - All measurement results based on five-run averages
Experiment #1 - Service Scalability
- Determine the time taken for browse and search service requests
- Assess impact due to variation of multiple server front-ends
- Methodology
  - JMeter used to simulate 50 users for each Web service, ten times
  - Web services hosted on four identical EC2 instances
  - Experiments repeated at least five times for each service criterion
  - Comparative analysis (browsing categories for browse service) by partitioning requests into blocks of 50
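The analysis described in the methodology (500 requests per run, averaged over repeated runs, then compared in blocks of 50) can be sketched as follows. The function names and the generated timings are hypothetical placeholders, not measurements from the study.

```python
from statistics import mean


def average_runs(runs):
    """Element-wise mean across repeated runs of equal length."""
    return [mean(values) for values in zip(*runs)]


def block_means(response_times_ms, block_size=50):
    """Mean response time for each consecutive block of requests."""
    return [mean(response_times_ms[i:i + block_size])
            for i in range(0, len(response_times_ms), block_size)]


# Hypothetical data: 5 runs x 500 requests (50 users x 10 requests each).
runs = [[800 + (i % 7) + r for i in range(500)] for r in range(5)]

averaged = average_runs(runs)   # one averaged series of 500 response times
blocks = block_means(averaged)  # ten values, for blocks 1-50 through 451-500
print(len(blocks))              # prints 10
```

Averaging across runs first and blocking second keeps each block aligned to the same request positions in every run.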
Experiment #1 - Browse Service
[Figure: response time (ms) against number of requests, in blocks of 50 from 1-50 to 451-500. Series: browsing by title, browsing by date, browsing by author]
Experiment #1 - Browse Service (2)
[Figure: response time (ms) against timestamp. Series: browse by author, browse by date, browse by title]
Experiment #1 - Browse Service (3)
[Figure: time per block (ms) against number of requests, in blocks of 50 from 1-50 to 451-500. Series: 1 instance, 2 instances, 3 instances, 4 instances]
Experiment #2 - Data Scalability
- Determine service performance for varying collection sizes for fixed number of servers
- Ascertain if application can cope with increasing data volumes in DL collections
- Methodology
  - JMeter set up to simulate 50 users accessing a Web service ten times
  - Fixed number of identical servers with collection sizes of 4k, 8k, 16k and 32k records
  - Experiments repeated at least five times for each service
  - Comparative analysis by partitioning requests into blocks of 50
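One way to put a number on "linear scalability" across the four collection sizes is an ordinary least-squares fit of mean response time against record count. This is a sketch of a possible analysis, not the method used in the study, and the response times below are hypothetical placeholders rather than measured results.

```python
def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


collection_sizes = [4000, 8000, 16000, 32000]  # record counts from the slide
mean_times_ms = [900.0, 905.0, 915.0, 935.0]   # hypothetical values, NOT measured

slope, intercept, r2 = linear_fit(collection_sizes, mean_times_ms)
print(f"{slope * 1000:.2f} ms per 1000 records, R^2 = {r2:.3f}")
```

A slope near zero would indicate that response time is largely insensitive to collection size, while an R^2 close to 1 with a positive slope would indicate linear growth.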
Experiment #2 - Browse Service
[Figure: response time (ms) against number of requests, in blocks of 50 from 1-50 to 451-500. Series: collection sizes of 4000, 8000, 16000 and 32000 records]
Experiment #3 - Load Testing
- Determine volume of requests application could process for increasing concurrent users
- Methodology
  - JMeter set up to simulate a varying number of users accessing a Web service
  - Fixed number of identical servers used
  - Initially simulate five users, each accessing a Web service ten times
  - Subsequent simulation of 20, 50, 100, 250 and 500 users
  - Experiments repeated at least five times for each service
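The study used Apache JMeter for this. Purely as an illustration of the same idea (concurrent users, each issuing ten sequential requests, with per-request timing), here is a small thread-based sketch. `fake_request` is a hypothetical stand-in for an HTTP call to a browse or search service, and the sketch stops at 50 users to keep the run short.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fake_request() -> None:
    """Hypothetical stand-in for one HTTP request to a DL service."""
    time.sleep(0.001)


def simulate_user(requests_per_user: int = 10):
    """Issue requests one after another; return each response time in ms."""
    timings = []
    for _ in range(requests_per_user):
        start = time.perf_counter()
        fake_request()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def run_load(users: int, requests_per_user: int = 10):
    """Run all simulated users concurrently and collect every response time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(simulate_user, requests_per_user)
                   for _ in range(users)]
        return [t for f in futures for t in f.result()]


for users in (5, 20, 50):  # the slide continues with 100, 250 and 500 users
    timings = run_load(users)
    print(users, len(timings), f"{mean(timings):.2f} ms")
```

Replacing `fake_request` with a real HTTP call would turn this into a crude load generator, though a dedicated tool such as JMeter handles ramp-up, reporting and connection reuse far more carefully.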
Experiment #3 - All Services
[Figure: response time (ms) against number of users (5, 20, 50, 100, 250, 500). Series: search operation, browse operation]
Conclusion
- Key findings
  - Redesign of application architectural components to conform to cloud service architecture
  - Results indicate that response times are not significantly affected by request complexity, collection size or request sequencing
  - Noticeable time taken to connect to AWS (ramp-up time)
- Study limitations
  - Single EC2 instance type (t1.micro) used
  - Cloud service vendor
  - Experimental dataset size
  - Query optimisation
  - Synthetic load used
Bibliography
- Hussein Suleman (2009). Utility-based High Performance Digital Library Systems.
- Pradeep Teregowda et al. (2010). Cloud Computing: A Digital Libraries Perspective.
- Pradeep Teregowda et al. (2010). CiteSeerx: A Cloud Perspective.
- Byung Chul Tak et al. (2011). To Move or Not to Move: The Economics of Cloud Computing.
- Jinesh Varia (2011). Architecting for The Cloud: Best Practices.