Running Neo4j in Production: Tips, Tricks and Optimizations
![Page 1: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/1.jpg)
Running Neo4j in Production
Tips, Tricks and Optimizations
![Page 2: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/2.jpg)
This Talk...
● How we scaled our prod graph
● Challenges faced doing this
● Various lessons we learned and techniques we used
● Some stuff I’m looking forward to in Neo4j
![Page 3: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/3.jpg)
SNAP Interactive
● Presented by David Fox (Big Data Engineer)
● Social dating app AYI (Are You Interested?)
● Friends and interests
![Page 4: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/4.jpg)
How We Use Neo4j
● Model the friend data of our millions of users
● Indicate connections everywhere on app
● 1.1+ billion nodes
● 8.5+ billion relationships
● 450 GB+ store
● 3 instance cluster
![Page 5: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/5.jpg)
Importing lots of data
● Find the right tool
o First try normal Cypher
o No good? Bring out the big guns - Java Batch Inserter
● Java Batch Inserter
o Sort relationships (GNU sort)
o Try to keep index lookups to in-memory lookups only
  Giant HashMap!
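The preparation step described above can be sketched as follows. This is a minimal illustration in Python, not the team's import code: it builds an in-memory map from external user IDs to node IDs, resolves each relationship through that map, and sorts the result by start node. The function names, the sample data, and the sequential integers standing in for inserter-assigned node IDs are all assumptions, and the sort here is in memory rather than GNU sort on files.

```python
from typing import Dict, Iterable, List, Tuple


def build_id_map(external_ids: Iterable[str]) -> Dict[str, int]:
    """Assign a sequential node ID to each distinct external ID (first seen wins)."""
    id_map: Dict[str, int] = {}
    for ext_id in external_ids:
        if ext_id not in id_map:
            id_map[ext_id] = len(id_map)
    return id_map


def resolve_and_sort(
    edges: Iterable[Tuple[str, str]], id_map: Dict[str, int]
) -> List[Tuple[int, int]]:
    """Translate external-ID pairs to node-ID pairs, sorted by start node, then end node."""
    resolved = [(id_map[a], id_map[b]) for a, b in edges]  # dictionary lookups only, no disk index
    resolved.sort()
    return resolved


# Hypothetical sample data.
users = ["u42", "u7", "u19"]
friendships = [("u19", "u42"), ("u42", "u7"), ("u7", "u19")]

id_map = build_id_map(users)
sorted_edges = resolve_and_sort(friendships, id_map)
```

Every relationship is resolved with a dictionary lookup, so the import never has to consult an on-disk index for a node it has already created.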
![Page 6: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/6.jpg)
But wait!!!
● Cypher CSV import
o 2.1 M01
o Supposed to be good for importing large data sets
o Anyone tried it?
![Page 7: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/7.jpg)
Read Querying
● Always try Cypher first
o Performance is being improved
● How can you tell if performance is where you need it to be?
o Time queries (cold vs. warm cache)
o Load testing!
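One way to time a query cold versus warm is sketched below. It is a minimal harness under stated assumptions: `run_query` is any zero-argument callable that executes the query, and `fake_query` is a hypothetical stand-in that is slow only on its first call. Against a real server, the first run is only truly cold if the instance was freshly restarted.

```python
import statistics
import time
from typing import Callable, Tuple


def time_cold_and_warm(
    run_query: Callable[[], object], warm_runs: int = 5
) -> Tuple[float, float]:
    """Return (seconds for the first run, median seconds of the following runs)."""
    start = time.perf_counter()
    run_query()
    cold = time.perf_counter() - start

    warm_samples = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        run_query()
        warm_samples.append(time.perf_counter() - start)
    return cold, statistics.median(warm_samples)


# Hypothetical stand-in for a real query: slow once, then served from a "cache".
_cache = {}


def fake_query() -> int:
    if "result" not in _cache:
        time.sleep(0.05)
        _cache["result"] = 42
    return _cache["result"]


cold_s, warm_s = time_cold_and_warm(fake_query)
```

Reporting the median of the warm runs rather than the mean keeps one stray pause from skewing the comparison.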
![Page 8: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/8.jpg)
Read Querying cont.
● Dark querying
o Great for benchmarking a system where Neo4j functionality is being injected
o Mitigates risk
o Provides results that are very close to real world patterns
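Dark querying as described above could look roughly like this: each real request is served by the existing code path, while the new Neo4j-backed path runs on the side and only its timing and errors are recorded. This is a sketch under assumptions. `with_dark_query`, `existing_lookup`, and `neo4j_lookup` are hypothetical names, and a production version would likely run the dark call asynchronously so it adds no latency to the user's request.

```python
import time
from typing import Callable, Dict, List


def with_dark_query(
    live: Callable[[str], object],
    dark: Callable[[str], object],
    log: List[Dict[str, object]],
) -> Callable[[str], object]:
    """Wrap `live` so every call also exercises `dark`, whose result is discarded."""

    def handler(user_id: str) -> object:
        result = live(user_id)  # users only ever see this result
        start = time.perf_counter()
        try:
            dark(user_id)
            error = None
        except Exception as exc:  # a failing dark query must never break the page
            error = repr(exc)
        log.append(
            {"user_id": user_id, "seconds": time.perf_counter() - start, "error": error}
        )
        return result

    return handler


# Hypothetical code paths for illustration.
def existing_lookup(user_id: str) -> list:
    return ["from-existing-store"]


def neo4j_lookup(user_id: str) -> list:
    raise RuntimeError("graph unavailable")


dark_log: List[Dict[str, object]] = []
handler = with_dark_query(existing_lookup, neo4j_lookup, dark_log)
response = handler("u42")
```

Because the dark path sees the same requests as production, the recorded timings reflect real traffic patterns, and a failure in it costs nothing user-facing.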
![Page 9: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/9.jpg)
Read Querying cont.
● Reads too slow? Try these things.
o Write high-throughput business-critical queries in Java
  unmanaged extension
  faster
  hard limits
o Cache shard
  country, age, gender, etc.
  you hit warm cache more often
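Cache sharding can be sketched as a routing function that always sends the same user segment to the same instance, so that instance's cache stays warm for that segment. The code below is an illustration only: the instance names, the ten-year age bands, and the choice of a SHA-256 hash over the segment key are assumptions, not the routing the talk's team used.

```python
import hashlib
from typing import Sequence


def pick_instance(instances: Sequence[str], country: str, gender: str, age: int) -> str:
    """Map a user segment to one instance so that segment's reads share a cache."""
    age_band = age // 10 * 10  # e.g. 27 -> 20, so nearby ages share a band
    key = f"{country}|{gender}|{age_band}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


# Hypothetical three-instance cluster.
cluster = ["neo4j-1", "neo4j-2", "neo4j-3"]
first = pick_instance(cluster, "US", "f", 27)
second = pick_instance(cluster, "US", "f", 23)  # same segment, same instance
```

A stable hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would route the same segment differently after every restart.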
![Page 10: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/10.jpg)
Read Querying cont.
● Warm the cache!
o Touch all the nodes
o Touch all the relationships
![Page 11: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/11.jpg)
Writing
● Decide which writes need to be synchronous and which can be asynchronous
● Queue up asynchronous writes (routine updates, non-vital to immediate user experience)
o Try to evenly distribute them
o How do we do this? Baserunner!
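The split between synchronous and asynchronous writes can be sketched with a small dispatcher: synchronous writes are applied immediately, and everything else waits in a queue that is drained a bounded number of writes at a time. This is a hypothetical illustration, not Baserunner itself. `WriteDispatcher`, `apply_write`, and the per-tick cap are assumed names and choices.

```python
import queue
from typing import Callable


class WriteDispatcher:
    """Apply urgent writes at once; queue the rest and drain them at a capped rate."""

    def __init__(self, apply_write: Callable[[str], None]) -> None:
        self._apply = apply_write
        self._pending: "queue.Queue[str]" = queue.Queue()

    def submit(self, write: str, synchronous: bool) -> None:
        if synchronous:
            self._apply(write)  # the user is waiting on this one
        else:
            self._pending.put(write)  # routine update, can wait

    def drain(self, max_writes: int) -> int:
        """Apply at most `max_writes` queued writes; return how many were applied."""
        applied = 0
        while applied < max_writes:
            try:
                write = self._pending.get_nowait()
            except queue.Empty:
                break
            self._apply(write)
            applied += 1
        return applied


applied_writes = []
dispatcher = WriteDispatcher(applied_writes.append)
dispatcher.submit("profile-edit", synchronous=True)
dispatcher.submit("friend-refresh-a", synchronous=False)
dispatcher.submit("friend-refresh-b", synchronous=False)
first_tick = dispatcher.drain(max_writes=1)
```

Calling `drain` on a fixed schedule with a fixed cap is one simple way to spread queued writes evenly over time instead of letting them arrive in bursts.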
![Page 12: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/12.jpg)
Baserunner
● Written by SNAP developer
● Walks userbase randomly instead of sequentially
o This avoids pockets of heavily increased write queries
o Allows us to do high-velocity updating of our data
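The idea of walking the user base in random rather than sequential order can be sketched as below. Baserunner's actual implementation is not shown in the talk, so this is only an assumed minimal version: `random_walk_batches`, the batch size, and the seed are hypothetical, and an in-memory shuffle may not be practical for a very large user base.

```python
import random
from typing import Iterator, List, Sequence


def random_walk_batches(
    user_ids: Sequence[int], batch_size: int, seed: int = 0
) -> Iterator[List[int]]:
    """Yield every user ID exactly once, in shuffled order, in fixed-size batches."""
    order = list(user_ids)
    random.Random(seed).shuffle(order)  # seeded so a run can be reproduced
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


batches = list(random_walk_batches(range(10), batch_size=4))
visited = sorted(user_id for batch in batches for user_id in batch)
```

Each user is still updated exactly once per pass, but neighbouring IDs, which often belong to users created around the same time, no longer land in the same burst of writes.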
![Page 13: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/13.jpg)
Tuning the JVM
● For a really high-throughput environment, G1 GC has been very helpful
o Good at adapting itself
o We experienced fewer system-stopping pauses than with CMS
o Try CMS first but remember G1 as an option
![Page 14: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/14.jpg)
Hardware is Important
● Lots of memory
● Working set too big for memory?
o SSDs are helpful
o Optimization techniques discussed become much more important
![Page 15: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/15.jpg)
Not Everything is Your Fault!
● Like any software, Neo4j has bugs
● Developers are receptive
● File reports on GitHub when you find issues
![Page 16: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/16.jpg)
Some stuff to look forward to...
● Relationship grouping (2.1 M01)
o helps mitigate the super node/dense node problem
● Ronja (rewrite of the Cypher query language, 2.1?)
● More flexible label index searching (after 2.1)
![Page 17: Running Neo4j in Production: Tips, Tricks and Optimizations](https://reader036.fdocuments.us/reader036/viewer/2022080213/55a065e51a28ab3f728b478c/html5/thumbnails/17.jpg)
Questions?