MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
Transcript of MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
![Page 1: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/1.jpg)
ETL for Pros – Getting Data Into MongoDB The Right Way
André Spiegel, PhD Principal Consulting Engineer
![Page 2: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/2.jpg)
#MDBW16
Remember this?
![Page 3: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/3.jpg)
Sound familiar?
At some point, most applications need to batch-load large amounts of data
• billions of documents
• huge initial load
• daily updates
![Page 4: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/4.jpg)
Sound familiar?
Using MongoDB properly means complex documents
{"_id":"admin.mongo_dba","user":"mongo_dba","db":"admin","roles":[{"role":"root","db":"admin"},{"role":"restore","db":"admin"}]}
[{"$sort":{"st":1}},{"$group":{"_id":"$st","start":{"$first":"$ts"},"end":{"$last":"$ts"}}}]
![Page 5: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/5.jpg)
Sound familiar?
How do I create these documents from relational tables?
![Page 6: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/6.jpg)
Sound familiar?
How do I do it fast?
Image: Julian Lim
![Page 7: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/7.jpg)
• I've done this for a few years
• I've seen people do it
• We all make the same mistakes
• Let's understand them and come up with something better
![Page 8: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/8.jpg)
Case Study
![Page 9: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/9.jpg)
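ORDERS

| ID | FIRST_NAME | LAST_NAME | SHIPPING_ADDRESS |
|----|------------|-----------|------------------|
| 1 | James | Bond | Nassau, Bahamas, US |
| 2 | Ernst | Blofeldt | Caracas, Venezuela |

ITEMS

| ID | ORDER_ID | QTY | DESCRIPTION | PRICE |
|----|----------|-----|-------------|-------|
| 1 | 1 | 1 | Aston Martin | 120,000 |
| 2 | 1 | 1 | Dinner Jacket | 4,000 |
| 3 | 1 | 3 | Champagne Veuve-Cliquot | 200 |
| 4 | 2 | 100 | Cat Food | 1 |
| 5 | 2 | 1 | Launch Pad | 1,000,000 |

TRACKING

| ORDER_ID | TIMESTAMP | STATUS |
|----------|-----------|--------|
| 1 | 1985-04-30 09:48:00 | ORDERED |
| 2 | 1985-04-23 01:30:22 | ORDERED |
| 2 | 1985-04-25 08:30:00 | SHIPPED |
| 2 | 1985-05-14 21:37:00 | DELIVERED |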
![Page 12: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/12.jpg)
{
  "first_name" : "James",
  "last_name" : "Bond",
  "address" : "Nassau, Bahamas, US",
  "items" : [
    { "qty" : 1, "description" : "Aston Martin", "price" : 120000 },
    { "qty" : 1, "description" : "Dinner Jacket", "price" : 4000 },
    { "qty" : 3, "description" : "Champagne Veuve-Cliquot", "price" : 200 }
  ],
  "tracking" : [
    { "timestamp" : "1985-04-30 09:48:00", "status" : "ORDERED" }
  ]
}
![Page 16: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/16.jpg)
How do I get from relational to JSON?
ETL Tools: Talend, Pentaho, Informatica, ...
• Gretchen's Question: How do you handle arrays?
![Page 17: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/17.jpg)
How do I get from relational to JSON?
WYOC (Write Your Own Code)
• More challenging, but you've got ultimate control
![Page 18: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/18.jpg)
Orders of Magnitude
• Any operation in the CPU is on the order of nanoseconds: 0.000 000 001 s
  • typically tens of nanoseconds per high-level operation
• Any roundtrip to the database is on the order of milliseconds: 0.001 s
  • typically just under 1 millisecond at the minimum
  • mostly due to network protocol stack latency
  • faster networks don't help
  • in-memory storage does not help
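To make the gap concrete, here is a small back-of-the-envelope calculation in Python. It only uses the two figures quoted on the slide (tens of nanoseconds per high-level operation, roughly one millisecond per roundtrip); the function names and the 3 million roundtrip count are illustrative assumptions, and real latencies will differ from system to system.

```python
# Figures quoted on the slide; actual values depend on hardware and network.
CPU_OP_SECONDS = 10e-9      # "tens of nanoseconds per high-level operation"
ROUNDTRIP_SECONDS = 1e-3    # "just under 1 millisecond at the minimum"


def ops_per_roundtrip(roundtrip=ROUNDTRIP_SECONDS, cpu_op=CPU_OP_SECONDS):
    """How many high-level CPU operations fit into one database roundtrip."""
    return round(roundtrip / cpu_op)


def roundtrip_minutes(n_roundtrips, roundtrip=ROUNDTRIP_SECONDS):
    """Time spent waiting if every roundtrip is issued one after the other."""
    return n_roundtrips * roundtrip / 60


print(ops_per_roundtrip())           # CPU operations per roundtrip
print(roundtrip_minutes(3_000_000))  # minutes of waiting for 3 million roundtrips
```

Under these assumptions a single roundtrip costs about as much as a hundred thousand in-process operations, which is why the number of roundtrips tends to dominate the load time.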
![Page 19: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/19.jpg)
A Gallery of Mistakes
![Page 21: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/21.jpg)
Mistake #1 – Nested queries
for x in SELECT * FROM ORDERS
    doc = { "first_name" : x.first_name, "last_name" : x.last_name, "address" : x.address, "items" : [], "tracking" : [] }
    for y in SELECT * FROM ITEMS WHERE ORDER_ID = x.order_id
        doc.items.push (y)
    for z in SELECT * FROM TRACKING WHERE ORDER_ID = x.order_id
        doc.tracking.push (z)
    mongodb.insert (doc)
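The pseudocode above can be sketched as runnable Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the relational source, a plain list stands in for the MongoDB collection, and `make_source` and `nested_queries` are hypothetical helper names. The point is the query count: one outer query plus two more per order.

```python
import sqlite3


def make_source():
    """In-memory SQLite stand-in for the relational source (sample rows from the slides)."""
    db = sqlite3.connect(":memory:")
    db.row_factory = sqlite3.Row  # read columns by name
    db.executescript("""
        CREATE TABLE orders (id INTEGER, first_name TEXT, last_name TEXT, shipping_address TEXT);
        CREATE TABLE items (id INTEGER, order_id INTEGER, qty INTEGER, description TEXT, price INTEGER);
        CREATE TABLE tracking (order_id INTEGER, timestamp TEXT, status TEXT);
        INSERT INTO orders VALUES
            (1, 'James', 'Bond', 'Nassau, Bahamas, US'),
            (2, 'Ernst', 'Blofeldt', 'Caracas, Venezuela');
        INSERT INTO items VALUES
            (1, 1, 1, 'Aston Martin', 120000),
            (2, 1, 1, 'Dinner Jacket', 4000),
            (3, 1, 3, 'Champagne Veuve-Cliquot', 200),
            (4, 2, 100, 'Cat Food', 1),
            (5, 2, 1, 'Launch Pad', 1000000);
        INSERT INTO tracking VALUES
            (1, '1985-04-30 09:48:00', 'ORDERED'),
            (2, '1985-04-23 01:30:22', 'ORDERED'),
            (2, '1985-04-25 08:30:00', 'SHIPPED'),
            (2, '1985-05-14 21:37:00', 'DELIVERED');
    """)
    return db


def nested_queries(db, target):
    """Build one document per order; returns the number of SQL queries issued."""
    queries = 1  # the outer SELECT on orders
    for x in db.execute("SELECT * FROM orders ORDER BY id"):
        doc = {"first_name": x["first_name"], "last_name": x["last_name"],
               "address": x["shipping_address"], "items": [], "tracking": []}
        for y in db.execute("SELECT qty, description, price FROM items "
                            "WHERE order_id = ? ORDER BY id", (x["id"],)):
            doc["items"].append(dict(y))
        for z in db.execute("SELECT timestamp, status FROM tracking "
                            "WHERE order_id = ? ORDER BY timestamp", (x["id"],)):
            doc["tracking"].append(dict(z))
        queries += 2
        target.append(doc)  # stands in for one mongodb.insert(doc) roundtrip
    return queries


target = []
print(nested_queries(make_source(), target), "queries for", len(target), "orders")
```

With N orders this issues 2N + 1 queries against the source, plus N inserts on the target side.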
![Page 28: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/28.jpg)
Results
[Bar chart, time in minutes: Nested Queries 14.5]
Test setup:
• 1 million orders
• 10 million line items
• 3 million tracking states
• MySQL (local) to MongoDB (local)
• Python
![Page 29: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/29.jpg)
Mistake #2 – Build documents in the database
for x in SELECT * FROM ORDERS
    doc = { "_id" : x.order_id, "first_name" : x.first_name, "last_name" : x.last_name, "address" : x.address, "items" : [], "tracking" : [] }
    mongodb.insert (doc)
for y in SELECT * FROM ITEMS
    mongodb.update ({"_id" : y.order_id}, {"$push" : {"items" : y}})
for z in SELECT * FROM TRACKING
    mongodb.update ({"_id" : z.order_id}, {"$push" : {"tracking" : z}})
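A minimal Python sketch of this pattern follows. `FakeCollection` is a hypothetical in-memory stand-in for a MongoDB collection (it is not a PyMongo class) whose only job is to count how many roundtrips the approach would make; the source rows are the sample data from the slides, held in plain lists.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a MongoDB collection; counts roundtrips."""

    def __init__(self):
        self.docs = {}
        self.roundtrips = 0

    def insert(self, doc):
        self.roundtrips += 1
        self.docs[doc["_id"]] = doc

    def push(self, _id, field, value):
        """Mimics update({"_id": _id}, {"$push": {field: value}})."""
        self.roundtrips += 1
        self.docs[_id][field].append(value)


# Sample rows from the slides: (id, first_name, last_name, shipping_address) etc.
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
ITEMS = [(1, 1, 1, "Aston Martin", 120000), (2, 1, 1, "Dinner Jacket", 4000),
         (3, 1, 3, "Champagne Veuve-Cliquot", 200), (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"), (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"), (2, "1985-05-14 21:37:00", "DELIVERED")]


def build_in_db(coll):
    """Insert skeleton documents, then grow them with one update per child row."""
    for order_id, first, last, address in ORDERS:
        coll.insert({"_id": order_id, "first_name": first, "last_name": last,
                     "address": address, "items": [], "tracking": []})
    for _item_id, order_id, qty, description, price in ITEMS:
        coll.push(order_id, "items",
                  {"qty": qty, "description": description, "price": price})
    for order_id, timestamp, status in TRACKING:
        coll.push(order_id, "tracking", {"timestamp": timestamp, "status": status})


coll = FakeCollection()
build_in_db(coll)
print(coll.roundtrips, "roundtrips for", len(coll.docs), "documents")
```

Every child row becomes its own write, so the roundtrip count grows with the total number of rows across all three tables rather than with the number of documents.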
![Page 36: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/36.jpg)
Results
[Bar chart, time in minutes: Nested Queries 14.5, Build in DB 95.9]
![Page 37: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/37.jpg)
Mistake #3 – Load it all into memory
db_items = SELECT * FROM ITEMS
db_tracking = SELECT * FROM TRACKING
for x in SELECT * FROM ORDERS
    doc = { "first_name" : x.first_name, "last_name" : x.last_name, "address" : x.address, "items" : [], "tracking" : [] }
    doc.items.pushAll (db_items.getAll(x.order_id))
    doc.tracking.pushAll (db_tracking.getAll(x.order_id))
    mongodb.insert (doc)
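In Python, the `getAll` lookups in this pseudocode would most naturally be dictionaries keyed by order id. The sketch below assumes that reading; the lists hold the sample rows from the slides, and `load_all_into_memory` is a hypothetical name. Note that both child tables are held in memory in full before the first document is built.

```python
from collections import defaultdict

# Sample rows from the slides.
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
ITEMS = [(1, 1, 1, "Aston Martin", 120000), (2, 1, 1, "Dinner Jacket", 4000),
         (3, 1, 3, "Champagne Veuve-Cliquot", 200), (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"), (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"), (2, "1985-05-14 21:37:00", "DELIVERED")]


def load_all_into_memory():
    """Index every child row by order id, then assemble documents from the indexes."""
    items_by_order = defaultdict(list)
    for _item_id, order_id, qty, description, price in ITEMS:
        items_by_order[order_id].append(
            {"qty": qty, "description": description, "price": price})

    tracking_by_order = defaultdict(list)
    for order_id, timestamp, status in TRACKING:
        tracking_by_order[order_id].append({"timestamp": timestamp, "status": status})

    docs = []
    for order_id, first, last, address in ORDERS:
        docs.append({"first_name": first, "last_name": last, "address": address,
                     "items": items_by_order.get(order_id, []),
                     "tracking": tracking_by_order.get(order_id, [])})
    return docs  # each entry would become one mongodb.insert(doc)


for doc in load_all_into_memory():
    print(doc["last_name"], len(doc["items"]), "items,", len(doc["tracking"]), "tracking states")
```

The lookups are fast because they never leave the process, but the memory footprint is proportional to the size of the ITEMS and TRACKING tables combined.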
![Page 43: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/43.jpg)
Results
[Bar chart, time in minutes: Nested Queries 14.5, Build in DB 95.9, Lookup from Memory 8.5]
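One way to read these numbers is to count the roundtrips each approach makes for the test data set (1 million orders, 10 million line items, 3 million tracking states) and divide the measured time by that count. The sketch below does this under simplifying assumptions that are not from the talk: every query and every write counts as exactly one roundtrip, writes are not batched, and all of the measured time is attributed to roundtrips.

```python
# Data set sizes from the test setup.
ORDERS, ITEMS, TRACKING = 1_000_000, 10_000_000, 3_000_000

# Roundtrips per approach: queries against the source plus writes to MongoDB.
roundtrips = {
    "Nested Queries": 1 + 2 * ORDERS + ORDERS,      # outer query, 2 queries + 1 insert per order
    "Build in DB": 3 + ORDERS + ITEMS + TRACKING,   # 3 queries, 1 insert or update per row
    "Lookup from Memory": 3 + ORDERS,               # 3 queries, 1 insert per order
}

# Measured times from the results chart, in minutes.
measured_minutes = {"Nested Queries": 14.5, "Build in DB": 95.9, "Lookup from Memory": 8.5}


def ms_per_roundtrip(name):
    """Measured time divided by the estimated roundtrip count, in milliseconds."""
    return measured_minutes[name] * 60 * 1000 / roundtrips[name]


for name in roundtrips:
    print(f"{name}: {roundtrips[name]:,} roundtrips, "
          f"{ms_per_roundtrip(name):.2f} ms each (estimate)")
```

By this rough estimate each roundtrip costs a few tenths of a millisecond on the local setup, and the ranking of the three approaches follows the roundtrip count.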
![Page 44: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/44.jpg)
Getting it Right: Co-Iteration
![Page 45: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/45.jpg)
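ORDERS

| ID | FIRST_NAME | LAST_NAME | SHIPPING_ADDRESS |
|----|------------|-----------|------------------|
| 1 | James | Bond | Nassau, Bahamas, US |
| 2 | Ernst | Blofeldt | Caracas, Venezuela |

ITEMS

| ID | ORDER_ID | QTY | DESCRIPTION | PRICE |
|----|----------|-----|-------------|-------|
| 1 | 1 | 1 | Aston Martin | 120,000 |
| 2 | 1 | 1 | Dinner Jacket | 4,000 |
| 3 | 1 | 3 | Champagne Veuve-Cliquot | 200 |
| 4 | 2 | 100 | Cat Food | 1 |
| 5 | 2 | 1 | Launch Pad | 1,000,000 |

TRACKING

| ORDER_ID | TIMESTAMP | STATUS |
|----------|-----------|--------|
| 1 | 1985-04-30 09:48:00 | ORDERED |
| 2 | 1985-04-23 01:30:22 | ORDERED |
| 2 | 1985-04-25 08:30:00 | SHIPPED |
| 2 | 1985-05-14 21:37:00 | DELIVERED |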
{ "first_name" : "James", "last_name" : "Bond", "address" : "Nassau, Bahamas, US"}
{ "first_name" : "James", "last_name" : "Bond", "address" : "Nassau, Bahamas, US", "items" : [ { ..., "description" : "Aston Martin", ... } ]}
{ "first_name" : "James", "last_name" : "Bond", "address" : "Nassau, Bahamas, US", "items" : [ { ..., "description" : "Aston Martin", ... }, { ..., "description" : "Dinner Jacket", ... } ]}
{ "first_name" : "James", "last_name" : "Bond", "address" : "Nassau, Bahamas, US", "items" : [ { ..., "description" : "Aston Martin", ... }, { ..., "description" : "Dinner Jacket", ... }, { ..., "description" : "Champagne...", ... } ]}
{ "first_name" : "James", "last_name" : "Bond", "address" : "Nassau, Bahamas, US", "items" : [ { ..., "description" : "Aston Martin", ... }, { ..., "description" : "Dinner Jacket", ... }, { ..., "description" : "Champagne...", ... } ], "tracking" : [ { ... "1985-04-30 09:48:00", ... "ORDERED" } ]}
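The build-up shown so far suggests the following reading of co-iteration: each table is read exactly once, sorted by order id, and the three result sets are advanced in lock-step so that each document is completed before the next order starts. The Python sketch below implements that reading; it is an interpretation of the slides, not the speaker's code. An in-memory SQLite database stands in for the source, and `make_source`, `take_group`, and `co_iterate` are hypothetical names.

```python
import sqlite3


def make_source():
    """In-memory SQLite stand-in for the relational source (sample rows from the slides)."""
    db = sqlite3.connect(":memory:")
    db.row_factory = sqlite3.Row
    db.executescript("""
        CREATE TABLE orders (id INTEGER, first_name TEXT, last_name TEXT, shipping_address TEXT);
        CREATE TABLE items (id INTEGER, order_id INTEGER, qty INTEGER, description TEXT, price INTEGER);
        CREATE TABLE tracking (order_id INTEGER, timestamp TEXT, status TEXT);
        INSERT INTO orders VALUES
            (1, 'James', 'Bond', 'Nassau, Bahamas, US'),
            (2, 'Ernst', 'Blofeldt', 'Caracas, Venezuela');
        INSERT INTO items VALUES
            (1, 1, 1, 'Aston Martin', 120000), (2, 1, 1, 'Dinner Jacket', 4000),
            (3, 1, 3, 'Champagne Veuve-Cliquot', 200), (4, 2, 100, 'Cat Food', 1),
            (5, 2, 1, 'Launch Pad', 1000000);
        INSERT INTO tracking VALUES
            (1, '1985-04-30 09:48:00', 'ORDERED'), (2, '1985-04-23 01:30:22', 'ORDERED'),
            (2, '1985-04-25 08:30:00', 'SHIPPED'), (2, '1985-05-14 21:37:00', 'DELIVERED');
    """)
    return db


def take_group(cursor, row, order_id):
    """Collect the rows for order_id from a cursor sorted by order_id.

    `row` is the current lookahead row. Returns (rows_for_order, next_lookahead).
    """
    group = []
    while row is not None and row["order_id"] <= order_id:
        if row["order_id"] == order_id:  # rows with a smaller id have no parent; skip them
            group.append(row)
        row = cursor.fetchone()
    return group, row


def co_iterate(db):
    """Yield one finished document per order using three sorted, single-pass cursors."""
    orders = db.execute("SELECT * FROM orders ORDER BY id")
    items = db.execute("SELECT * FROM items ORDER BY order_id, id")
    tracking = db.execute("SELECT * FROM tracking ORDER BY order_id, timestamp")
    item, track = items.fetchone(), tracking.fetchone()
    for x in orders:
        item_rows, item = take_group(items, item, x["id"])
        track_rows, track = take_group(tracking, track, x["id"])
        yield {
            "first_name": x["first_name"], "last_name": x["last_name"],
            "address": x["shipping_address"],
            "items": [{"qty": r["qty"], "description": r["description"],
                       "price": r["price"]} for r in item_rows],
            "tracking": [{"timestamp": r["timestamp"], "status": r["status"]}
                         for r in track_rows],
        }


for doc in co_iterate(make_source()):
    print(doc["last_name"], len(doc["items"]), "items,", len(doc["tracking"]), "tracking states")
```

Only three queries are issued regardless of the number of orders, and only the rows for the current order are held in memory at any time. Each yielded document would then be written to MongoDB, ideally in batches.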
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela"}
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... } ]}
![Page 58: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/58.jpg)
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... }, { ..., "description" : "Launch Pad", ... } ]}
![Page 59: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/59.jpg)
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... }, { ..., "description" : "Launch Pad", ... } ]}
![Page 60: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/60.jpg)
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... }, { ..., "description" : "Launch Pad", ... } ], "tracking" : [ { ... "1985-04-23 01:30:22", ... "ORDERED" } ]}
![Page 61: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/61.jpg)
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... }, { ..., "description" : "Launch Pad", ... } ], "tracking" : [ { ... "1985-04-23 01:30:22", ... "ORDERED" }, { ... "1985-04-25 08:30:00", ... "SHIPPED" } ]}
![Page 62: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/62.jpg)
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... }, { ..., "description" : "Launch Pad", ... } ], "tracking" : [ { ... "1985-04-23 01:30:22", ... "ORDERED" }, { ... "1985-04-25 08:30:00", ... "SHIPPED" }, { ... "1985-05-14 21:37:00", ... "DELIVERED" } ]}
![Page 63: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/63.jpg)
{ "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela", "items" : [ { ..., "description" : "Cat Food", ... }, { ..., "description" : "Launch Pad", ... } ], "tracking" : [ { ... "1985-04-23 01:30:22", ... "ORDERED" }, { ... "1985-04-25 08:30:00", ... "SHIPPED" }, { ... "1985-05-14 21:37:00", ... "DELIVERED" } ]}
![Page 64: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/64.jpg)
Done!
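The walkthrough on the preceding slides can be sketched in a few lines of Python. This is a minimal illustration, not the code from the talk: it assumes all three inputs are already sorted by order ID, and the helper `SortedCursor`, the function `co_iterate`, and the field names `qty`, `price`, `timestamp`, and `status` are hypothetical (the slides elide those fields with `...`).

```python
# Sample rows copied from the slides; every list is sorted by order ID.
ORDERS = [
    (1, "James", "Bond", "Nassau, Bahamas, US"),
    (2, "Ernst", "Blofeldt", "Caracas, Venezuela"),
]
ITEMS = [  # (ID, ORDER_ID, QTY, DESCRIPTION, PRICE)
    (1, 1, 1, "Aston Martin", 120000),
    (2, 1, 1, "Dinner Jacket", 4000),
    (3, 1, 3, "Champagne Veuve-Cliquot", 200),
    (4, 2, 100, "Cat Food", 1),
    (5, 2, 1, "Launch Pad", 1000000),
]
TRACKING = [  # (ORDER_ID, TIMESTAMP, STATUS)
    (1, "1985-04-30 09:48:00", "ORDERED"),
    (2, "1985-04-23 01:30:22", "ORDERED"),
    (2, "1985-04-25 08:30:00", "SHIPPED"),
    (2, "1985-05-14 21:37:00", "DELIVERED"),
]


class SortedCursor:
    """Forward-only reader over rows that are already sorted by order ID."""

    def __init__(self, rows, key_index):
        self._rows = iter(rows)
        self._key_index = key_index
        self._head = next(self._rows, None)  # one-row lookahead

    def take(self, order_id):
        """Return all rows for order_id and advance past them.

        Rows with a smaller key (orphans) are skipped so the cursor never stalls.
        """
        matched = []
        while self._head is not None and self._head[self._key_index] <= order_id:
            if self._head[self._key_index] == order_id:
                matched.append(self._head)
            self._head = next(self._rows, None)
        return matched


def co_iterate(orders, items, tracking):
    """Yield one nested document per order in a single pass over all inputs."""
    item_cursor = SortedCursor(items, key_index=1)
    tracking_cursor = SortedCursor(tracking, key_index=0)
    for order_id, first_name, last_name, address in orders:
        yield {
            "first_name": first_name,
            "last_name": last_name,
            "address": address,
            # Field names inside the sub-documents are assumptions.
            "items": [
                {"qty": qty, "description": description, "price": price}
                for _, _, qty, description, price in item_cursor.take(order_id)
            ],
            "tracking": [
                {"timestamp": timestamp, "status": status}
                for _, timestamp, status in tracking_cursor.take(order_id)
            ],
        }


docs = list(co_iterate(ORDERS, ITEMS, TRACKING))
```

Each cursor only ever moves forward, so every table is read exactly once and only the rows for the current order are held in memory at any time.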
![Page 65: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/65.jpg)
Results

| Approach | Time (min) |
|---|---|
| Nested Queries | 14.5 |
| Build in DB | 95.9 |
| Lookup from Memory | 8.5 |
| Co-Iteration | 8.1 |
![Page 66: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/66.jpg)
Did you just explain to me what a JOIN is?
• Yes. Although not as straightforward as you might think.
• No. Co-Iteration works from multiple data sources.
| NAME | ITEM | TRACKING |
|---|---|---|
| James Bond | Aston Martin | ORDERED |
| James Bond | Aston Martin | SHIPPED |
| James Bond | Dinner Jacket | ORDERED |
| James Bond | Dinner Jacket | SHIPPED |
| James Bond | Champagne | ORDERED |
| James Bond | Champagne | SHIPPED |
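Why a plain JOIN is "not as straightforward as you might think" can be seen by reproducing the rows above. The sketch below is only an illustration with a hypothetical helper `join_rows`: joining one order against two independent child tables pairs every item with every tracking status, so the loader receives a cross product that it would have to de-duplicate again before building the nested document.

```python
from itertools import product


def join_rows(name, items, statuses):
    """Mimic joining one order with two independent child tables.

    Every item is paired with every tracking status.
    """
    return [(name, item, status) for item, status in product(items, statuses)]


items = ["Aston Martin", "Dinner Jacket", "Champagne"]
statuses = ["ORDERED", "SHIPPED"]

rows = join_rows("James Bond", items, statuses)

child_rows_read = len(items) + len(statuses)  # what co-iteration reads
joined_rows_read = len(rows)                  # what the JOIN returns
```

With three items and two tracking entries, co-iteration reads 3 + 2 child rows, while the join returns 3 × 2 rows; the gap widens as either child list grows.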
![Page 67: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/67.jpg)
Oh, and one more thing...
![Page 68: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/68.jpg)
Threading and Batching
[Figure: throughput as a function of batch size and number of threads]
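A minimal sketch of how batching and threading might be combined is shown below. It is an assumption-laden illustration rather than the loader from the talk: `batches`, `load`, and `fake_write` are hypothetical names, and `fake_write` stands in for whatever bulk-write call the driver provides (in PyMongo that would typically be `Collection.insert_many`).

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(docs, batch_size):
    """Split any iterable of documents into lists of at most batch_size."""
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load(docs, write_batch, batch_size=1000, threads=4):
    """Send batches to write_batch from a pool of worker threads.

    Returns the total number of documents written. Note that Executor.map
    submits every batch up front; a loader for very large inputs would need
    to bound the number of pending batches instead.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(write_batch, batches(docs, batch_size)))


# Stand-in sink so the sketch runs without a database (hypothetical).
stored = []
stored_lock = threading.Lock()


def fake_write(batch):
    with stored_lock:
        stored.extend(batch)
    return len(batch)


documents = ({"n": n} for n in range(2500))
written = load(documents, fake_write, batch_size=1000, threads=4)
```

Batch size and thread count are left as parameters because the best values depend on the deployment and are usually found by measuring throughput.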
![Page 69: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/69.jpg)
Results

| Approach | Simple | Batch = 1000 |
|---|---|---|
| Nested Queries | 14.5 | 9.1 |
| Build in DB | 95.9 | 36.2 |
| Lookup from Memory | 8.5 | 4 |
| Co-Iteration | 8.1 | 3.9 |
![Page 70: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/70.jpg)
Summary
• Common Mistakes to Watch Out For
  • Nested Queries
  • Building Documents in the Database
  • Loading Everything into Memory
• The Co-Iteration Pattern
  • Open All Tables at Once
  • Perform a Single Pass over Them
  • Build Documents as You Go Along
• Don't Forget Batching and Threading
![Page 71: MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way](https://reader034.fdocuments.us/reader034/viewer/2022042706/587064ca1a28ab48378b4b89/html5/thumbnails/71.jpg)
Thank you.
github.com/drmirror/etlpro