NoSQL continued
CMSC 461
Michael Wilson

MongoDB
- MongoDB is another NoSQL solution
- Provides a bit more structure than a solution like Accumulo
- Data is stored as BSON (Binary JSON)
  - Binary-encoded JSON, extends JSON
- Allows storage of large amounts of data

SQL vs. MongoDB
- SQL has databases, tables, rows, columns
- MongoDB has databases, collections, documents, fields
- Both have primary keys, indexes
- Collection structures are not enforced heavily
  - Inserts implicitly define the structure of each document; no schema is declared up front

Interacting with MongoDB
- Multiple databases within MongoDB
- Switch databases
  - use newDb
  - New databases will be stored after an insert
- Create collection
  - db.createCollection("collectionName")
  - Not necessary, collections are implicitly created on insert

BSON
- MongoDB uses BSON very heavily
- Binary JSON
  - Like JSON with a binary serialization method
  - Has extensions so that it can represent data types that JSON cannot
- Used to represent documents, provide input to queries

Selects/queries
- In MongoDB, querying typically consists of providing an appropriately crafted BSON document
- SELECT * FROM collectionName
  - db.collectionName.find()
- SELECT * FROM collectionName WHERE field = value
  - db.collectionName.find( {field: value} )
- SELECT * FROM collectionName WHERE field > 5
  - db.collectionName.find( {field: {$gt: 5} } )
- Other functions that take a query argument have queries that are formatted this way
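
The query documents above can be read as small predicates. The sketch below emulates that idea over a plain in-memory array, so it runs without a MongoDB server. `matches` and `find` are hypothetical helpers written for this illustration, the sample documents are made up, and only equality plus the `$gt`, `$gte`, `$lt`, `$lte` comparison operators are handled; it is not how MongoDB itself evaluates queries.

```javascript
// Comparison operators supported by this toy matcher.
const operators = {
  $gt: (actual, expected) => actual > expected,
  $gte: (actual, expected) => actual >= expected,
  $lt: (actual, expected) => actual < expected,
  $lte: (actual, expected) => actual <= expected,
};

// Returns true when every field condition in the query holds for the document.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const actual = doc[field];
    const isOperatorDoc =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    if (!isOperatorDoc) {
      // {field: value} means plain equality.
      return actual === condition;
    }
    // {field: {$gt: 5}} means every listed operator must pass.
    return Object.entries(condition).every(([op, expected]) => {
      const test = operators[op];
      if (!test) throw new Error(`unsupported operator: ${op}`);
      return test(actual, expected);
    });
  });
}

// An empty query matches everything, like find() with no arguments.
function find(collection, query = {}) {
  return collection.filter((doc) => matches(doc, query));
}

// Made-up sample data standing in for db.collectionName.
const collectionName = [
  { name: "a", field: 3 },
  { name: "b", field: 5 },
  { name: "c", field: 9 },
];

console.log(find(collectionName));                       // SELECT * FROM collectionName
console.log(find(collectionName, { field: 5 }));          // ... WHERE field = 5
console.log(find(collectionName, { field: { $gt: 5 } })); // ... WHERE field > 5
```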

Interacting with MongoDB
- Insert
  - db.collectionName.insert( {documentBSON} )
- Update
  - db.collectionName.update( {queryBSON}, {updateBSON}, {optionBSON} )
  - updateBSON
    - Set field to 5: {$set: {field: 5}}
    - Increment field by 1: {$inc: {field: 1}}
  - optionBSON
    - Options that determine whether or not to create new documents (upsert), update more than one document (multi), and which write concern to use
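
The following sketch shows what the three arguments to update do, again against an in-memory array rather than a real server. `update`, `applyUpdate`, and `matchesEquality` are hypothetical helpers for illustration; only `$set`, `$inc`, and the `upsert` and `multi` options are modeled, queries are equality-only, and write concerns are left out entirely.

```javascript
// Equality-only matcher, enough for this illustration.
function matchesEquality(doc, query) {
  return Object.entries(query).every(([field, value]) => doc[field] === value);
}

// Applies the $set and $inc parts of an update document in place.
function applyUpdate(doc, updateDoc) {
  for (const [field, value] of Object.entries(updateDoc.$set ?? {})) {
    doc[field] = value;
  }
  for (const [field, amount] of Object.entries(updateDoc.$inc ?? {})) {
    // A missing field is treated as 0 before incrementing.
    doc[field] = (doc[field] ?? 0) + amount;
  }
  return doc;
}

// Returns the number of documents written.
function update(collection, query, updateDoc, options = {}) {
  const { upsert = false, multi = false } = options;
  const targets = collection.filter((doc) => matchesEquality(doc, query));
  if (targets.length === 0) {
    if (!upsert) return 0;
    // upsert: create a new document seeded from the query fields.
    collection.push(applyUpdate({ ...query }, updateDoc));
    return 1;
  }
  // Without multi, only the first matching document is changed.
  const chosen = multi ? targets : targets.slice(0, 1);
  chosen.forEach((doc) => applyUpdate(doc, updateDoc));
  return chosen.length;
}

// Made-up sample data.
const counters = [
  { type: "a", count: 1 },
  { type: "a", count: 2 },
  { type: "b", count: 0 },
];

update(counters, { type: "a" }, { $inc: { count: 1 } });                // first match only
update(counters, { type: "a" }, { $set: { count: 5 } }, { multi: true }); // every match
update(counters, { type: "z" }, { $inc: { count: 1 } }, { upsert: true }); // creates a document
console.log(counters);
```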

Interacting with MongoDB
- Delete
  - db.collectionName.remove( {queryBSON} )

Apache Hive
- Also runs on Hadoop, uses HDFS as a data store
- Queryable like SQL
  - Using an SQL-inspired language, HiveQL

Hive data organization
- Databases
- Tables
- Partitions
  - Tables are broken down into partitions
  - Partition keys allow data to be stored into separate data files on HDFS
  - Can query on particular partitions
- Buckets
  - Can bucket by column to sample data
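
A rough sketch of how partitions and buckets decide where a row lands. Hive lays partitions out as `key=value` subdirectories under the table's directory and assigns a bucket by hashing the bucketing column modulo the number of buckets. The table path, column names, and helper functions below are made up for illustration, and the Java-style string hash is a stand-in, not necessarily the hash Hive applies to every column type.

```javascript
// Builds the directory a row would be stored under, one key=value
// segment per partition column. tableDir and the columns are hypothetical.
function partitionPath(tableDir, partitionValues) {
  const segments = Object.entries(partitionValues).map(
    ([column, value]) => `${column}=${value}`
  );
  return [tableDir, ...segments].join("/");
}

// Picks a bucket by hashing the bucketing column's value.
// Uses a Java-style 32-bit string hash as a stand-in hash function.
function bucketFor(value, numBuckets) {
  let hash = 0;
  for (const ch of String(value)) {
    hash = (Math.imul(hash, 31) + ch.codePointAt(0)) | 0;
  }
  // Clear the sign bit so the modulo result is never negative.
  return (hash & 0x7fffffff) % numBuckets;
}

const row = { userId: "ann", url: "/docs", dt: "2015-04-01", country: "US" };

console.log(partitionPath("/warehouse/page_views", { dt: row.dt, country: row.country }));
console.log(bucketFor(row.userId, 4));
```

Because every row with the same bucketing value lands in the same bucket, reading a single bucket gives a sample of roughly 1/N of the table without scanning the rest.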

Purpose of Hive
- Provide analytics, query large volumes of data
- NOT to be used for real-time queries like Postgres or Oracle
  - Hive queries are slow: latency is high even for small queries
  - Partitions and buckets can help reduce this amount of time
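
Why partitions help can be seen by counting how many rows a query has to touch. The sketch below models a table as a map from partition value to rows; the table, its `dt` partition key, and `countViews` are all hypothetical, and real Hive reads files on HDFS rather than in-memory arrays.

```javascript
// Hypothetical table, already split by its partition key (dt).
const pageViews = new Map([
  ["2015-04-01", [{ user: "ann", url: "/" }, { user: "bob", url: "/docs" }]],
  ["2015-04-02", [{ user: "ann", url: "/docs" }]],
  ["2015-04-03", [
    { user: "cat", url: "/" },
    { user: "bob", url: "/" },
    { user: "ann", url: "/faq" },
  ]],
]);

// Counts views of a URL. When dt is given, only that partition is read;
// otherwise every partition has to be scanned.
function countViews(table, url, dt) {
  const partitions = dt === undefined ? [...table.values()] : [table.get(dt) ?? []];
  let scanned = 0;
  let matched = 0;
  for (const rows of partitions) {
    for (const row of rows) {
      scanned += 1;
      if (row.url === url) matched += 1;
    }
  }
  return { scanned, matched };
}

console.log(countViews(pageViews, "/"));               // full scan
console.log(countViews(pageViews, "/", "2015-04-03")); // one partition only
```

The second call answers its question after reading half as many rows in this toy data; on a table with years of daily partitions the saving is far larger.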

Hive queries
- Hive queries actually generate MapReduce jobs
- MapReduce jobs take a while to set up and run
- MapReduce jobs can be run manually, but for structured data and analytics, Hive can be used
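
As a conceptual sketch of what "generates MapReduce jobs" means, consider a HiveQL query such as `SELECT country, COUNT(*) FROM page_views GROUP BY country`. One plausible way to express it is a map step that emits `(country, 1)` pairs, a shuffle that groups pairs by key, and a reduce step that sums each group. The code below runs those three phases in memory; the table, column names, and helper functions are made up, and the actual plan Hive produces may differ.

```javascript
// Map phase: each input row produces zero or more [key, value] pairs.
function mapPhase(rows, mapper) {
  return rows.flatMap(mapper);
}

// Shuffle: collect all values that share a key.
function shuffle(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reduce phase: collapse each key's values into one result.
function reducePhase(groups, reducer) {
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reducer(key, values);
  }
  return result;
}

// Hypothetical rows of a page_views table.
const rows = [
  { user: "ann", country: "US" },
  { user: "bob", country: "DE" },
  { user: "cat", country: "US" },
];

// SELECT country, COUNT(*) FROM page_views GROUP BY country
const pairs = mapPhase(rows, (row) => [[row.country, 1]]);
const counts = reducePhase(shuffle(pairs), (key, values) =>
  values.reduce((sum, value) => sum + value, 0)
);
console.log(counts);
```

Writing the mapper and reducer by hand is what "running MapReduce jobs manually" amounts to; HiveQL lets the same result be requested in one declarative line.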