Post on 29-Jul-2015
Using MongoDB for Materials Discovery
Michael Kocher and Dan Gunter, Lawrence Berkeley National Lab
Energy Mission at LBNL
• Li-ion Batteries
• Photovoltaic (Solar Cells)
• Thermoelectrics
• Biofuels
• New Computational Tools
• Cutting-edge Spectroscopic Tools (Advanced Light Source)
http://carboncycle2.lbl.gov/
Current Material Design model is Slow
18 years, on average, from new materials discovery to commercialization
Bringing New Materials to the Market: Eagar, T.W. Technology Review Feb 1995, 98, 42.
Materials Genome Initiative: A Renaissance of American Manufacturing
“To help businesses discover, develop, and deploy new materials twice as fast, we're launching what we call the
Materials Genome Initiative. The invention of silicon circuits and lithium-ion batteries made computers and iPods
and iPads possible -- but it took years to get those technologies from the drawing board to the marketplace.
We can do it faster.”
- President Obama at Carnegie Mellon University 6/24/2011
What is a Material?
[Figure: crystal structures of NaCl, silicon, and LiCoO2, with the Li, O, and Co atoms labeled]
What can we Compute using quantum mechanics?
No empirical parameters!
• volume
• density
• total energy
• formation energy
• metallic?
• etc.
MIT and LBNL collaboration
“The Google of Material Science Data”
MaterialsProject.org
Inverting the Problem
[Diagram: Structures 1 to 6 and their detailed properties (materials.bson) feed a machine learning algorithm, which proposes new materials]
Prof. Gerbrand Ceder (DOI: 10.1103/PhysRevLett.91.135503)
What about Na, V, P, O?
How often can you substitute Mg for Ca?
Materials Project: A Play in Three Acts
I. Data generation using HTC
II. Data storage
III. Data analysis/logging
Act I: Managing Calculations
• Centralized distributed model is the only way to go
• Hub is at LBNL
• Store the state in db
• Overview of running many MPI jobs at many different HPC centers
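The "store the state in db" idea can be sketched as a small job state machine. This is a minimal illustration, not the Materials Project code: the list of dicts is an in-memory stand-in for the master queue collection, and the field names (`crystal_id`, `state`, `host`) and state labels are assumptions made for the example.

```python
# In-memory stand-in for the master queue collection; each dict plays the
# role of one job document. Field names and states are hypothetical.
master_queue = [
    {"_id": 1, "crystal_id": "c-001", "state": "READY", "host": None},
    {"_id": 2, "crystal_id": "c-002", "state": "READY", "host": None},
]


def claim_next_job(queue, host):
    """Mark the first READY job as RUNNING on `host` and return it.

    Against a real MongoDB collection this step would need to be a single
    atomic find-and-modify so that two managers never claim the same job.
    """
    for job in queue:
        if job["state"] == "READY":
            job["state"] = "RUNNING"
            job["host"] = host
            return job
    return None  # nothing left to run


def finish_job(queue, job_id, succeeded):
    """Record the outcome of a job so the hub always holds the current state."""
    for job in queue:
        if job["_id"] == job_id:
            job["state"] = "DONE" if succeeded else "FAILED"
            return job
    raise KeyError(job_id)


first = claim_next_job(master_queue, "hopper")
second = claim_next_job(master_queue, "lr1")
finish_job(master_queue, first["_id"], succeeded=True)
```

Because every transition is written back to the central queue, a manager at any site only needs to know how to reach the hub, not what the other sites are doing.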
[Diagram: the MasterQueue (master_queue.bson) in MongoDB, "The Brain", feeds manager.x processes on the HPC systems: Franklin, Hopper, and Carver at NERSC (Oakland); lr1 and lr2 on Lawrencium (Berkeley). builder.x pulls a crystal, creates a new engine, and adds it to the queue.]
[Diagram: example deployment with manager.x processes on Franklin, Hopper, Carver, lr1, lr2, CathodeO1 (MIT), and DLX]
Centralized Logging and Management
NERSC (Oakland), LBNL, Kentucky
query = {'elements': {'$all': ['Li', 'O']}, 'nelectrons': {'$lte': 200}}
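To make the meaning of the query above concrete, here is a small pure-Python sketch of how `$all` and `$lte` select documents. The `matches` helper and the toy documents (including their `nelectrons` values) are made up for illustration; with PyMongo the same dictionary would simply be passed to `collection.find(query)`.

```python
query = {"elements": {"$all": ["Li", "O"]}, "nelectrons": {"$lte": 200}}


def matches(doc, query):
    """Return True if `doc` satisfies `query` (only $all and $lte are handled)."""
    for field, condition in query.items():
        value = doc.get(field)
        for op, operand in condition.items():
            if op == "$all":
                # every listed item must be present in the array field
                if value is None or not all(item in value for item in operand):
                    return False
            elif op == "$lte":
                if value is None or not value <= operand:
                    return False
            else:
                raise ValueError("operator not covered by this sketch: " + op)
    return True


# Toy documents; values are illustrative only.
docs = [
    {"formula": "Li2O", "elements": ["Li", "O"], "nelectrons": 14},
    {"formula": "NaCl", "elements": ["Na", "Cl"], "nelectrons": 28},
    {"formula": "LiCoO2", "elements": ["Li", "Co", "O"], "nelectrons": 46},
]

hits = [d["formula"] for d in docs if matches(d, query)]
```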
Act II: Core Data Storage
Very Complex Documents
Powerful Querying
Every crystal that has (Li or Na or K), (Mn), (O or S or F or Si), plus one other element except (Zn or Ni or Fe or Cu or Co)
{
  "lattice.volume": { "$lt": 500 },
  "elements": { "$all": ["Mn"], "$size": 4, "$nin": ["Zn", "Ni", "Fe", "Cu", "Co"] },
  "atoms": { "$elemMatch": { "oxidation_state": 3, "symbol": "Mn" } },
  "$where": "match_all(this.element_names, ['Li', 'Na', 'K'], ['Mn'], ['O', 'S', 'F', 'Si'])"
}
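The slide does not show the body of `match_all`, so the following is only a guess at its behavior, written in Python rather than the server-side JavaScript that `$where` runs: it returns true when the crystal contains at least one element from each group.

```python
def match_all(element_names, *groups):
    """Hypothetical re-implementation: True if every group shares at least
    one element with `element_names`."""
    names = set(element_names)
    # an empty intersection is falsy, so all() fails on any unmatched group
    return all(names.intersection(group) for group in groups)


groups = (["Li", "Na", "K"], ["Mn"], ["O", "S", "F", "Si"])

with_mn = match_all(["Li", "Mn", "O", "P"], *groups)
without_mn = match_all(["Li", "Fe", "O", "P"], *groups)
```

In the full query this check works together with `$size: 4` and `$nin`, which restrict the fourth element; the sketch covers only the "one from each group" part.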
pre-MongoDB :(
((SELECT structure.structureid
  FROM structure NATURAL INNER JOIN database NATURAL INNER JOIN databaseentry
  WHERE structureid IN (
    (SELECT structure.structureid
       FROM structure NATURAL INNER JOIN elemententry
       WHERE elemententry.symbol='Li'
     INTERSECT
     SELECT structure.structureid
       FROM structure NATURAL INNER JOIN elemententry
       WHERE elemententry.symbol='O')
    INTERSECT
    SELECT structure.structureid
      FROM structure NATURAL INNER JOIN database NATURAL INNER JOIN databaseentry
      WHERE database.title='ICSD'))
 EXCEPT
 (SELECT structure.structureid
    FROM structure
    WHERE structure.entryid IN (SELECT duplicateentry.entryid FROM duplicateentry)))
EXCEPT
(SELECT structure.structureid
   FROM structure
   WHERE structure.entryid IN (SELECT entryid FROM removals))
Search for materials with Li and O, excluding duplicates
Map/Reduce
[Diagram: a map/reduce (MR) job turns the calculations in tasks.bson (Calculation 12, 13, 14, 15; one marked ✓) into entries in materials.bson]
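The slide gives only the shape of this step: many calculation tasks go in, one material entry comes out. The sketch below is one plausible reading, in plain Python, of a map/reduce that groups tasks by material and keeps the lowest-energy successful calculation. The field names, the `state` values, the energies, and the selection rule are all assumptions for illustration.

```python
from collections import defaultdict

# Toy task documents; values are illustrative only.
tasks = [
    {"task_id": 12, "material_id": "m-1", "state": "successful", "energy": -5.10},
    {"task_id": 13, "material_id": "m-1", "state": "successful", "energy": -5.25},
    {"task_id": 14, "material_id": "m-2", "state": "failed", "energy": None},
    {"task_id": 15, "material_id": "m-2", "state": "successful", "energy": -3.40},
]


def map_task(task):
    """Map step: emit (material_id, task) for successful calculations only."""
    if task["state"] == "successful":
        yield task["material_id"], task


def reduce_tasks(material_id, values):
    """Reduce step: keep the lowest-energy calculation for each material."""
    best = min(values, key=lambda t: t["energy"])
    return {
        "material_id": material_id,
        "task_id": best["task_id"],
        "energy": best["energy"],
        "ntasks": len(values),
    }


def map_reduce(tasks):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for task in tasks:
        for key, value in map_task(task):
            grouped[key].append(value)
    return [reduce_tasks(key, values) for key, values in sorted(grouped.items())]


materials = map_reduce(tasks)
```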
Every App uses MongoDB
by G. Hautier
Structure Predictor: structure_predictors.bson, candidate_materials.bson
Diffraction Pattern: diffraction_patterns.bson
Act III: Analytics and Logging
Rich Error Analysis
[Plot: experimental vs. calculated values]
Integrated logging just makes sense
• Semi-structured data easily stored
• Can correlate with all other data
• Automation Layer: Failed tasks
• Web/App Layer
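As a rough illustration of the points above, log entries can carry different fields from one entry to the next and still be correlated with task data through a shared key. The documents, field names, and messages below are invented for the example and stand in for MongoDB collections.

```python
from collections import Counter

# Semi-structured log entries: each one may carry its own extra fields.
log_entries = [
    {"task_id": 12, "host": "hopper", "level": "INFO", "msg": "job finished"},
    {"task_id": 14, "host": "lr1", "level": "ERROR", "msg": "walltime exceeded",
     "walltime_s": 86400},
    {"task_id": 16, "host": "lr1", "level": "ERROR", "msg": "did not converge",
     "nsteps": 200},
]

# Task data keyed by the same task_id (hypothetical).
tasks = {
    12: {"formula": "Li2O"},
    14: {"formula": "NaCl"},
    16: {"formula": "LiCoO2"},
}

# Automation-layer view: which tasks failed, and where?
failed = [entry for entry in log_entries if entry["level"] == "ERROR"]
errors_per_host = Counter(entry["host"] for entry in failed)

# Correlate the log with the task data through task_id.
failed_formulas = sorted(tasks[entry["task_id"]]["formula"] for entry in failed)
```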
Conclusions
• MongoDB is a very versatile tool
• Used in several different cases
• Elegant query syntax
• Very useful for scientific data storage
• A lot of exciting future ideas
Acknowledgements
Thanks!
MaterialsProject.org