Post on 20-Jan-2021
Improve Query Performance with the Query Log Analyzer
Kees Vegter, Field Engineer, kees@neo4j.com
Query Log Analyzer
Query Log
dbms.logs.query.enabled=true
# If the execution of query takes more time than this threshold,
# the query is logged. If set to zero then all queries are logged.
dbms.logs.query.threshold=100ms
dbms.logs.query.parameter_logging_enabled=true
dbms.logs.query.time_logging_enabled=true
dbms.logs.query.allocation_logging_enabled=true
dbms.logs.query.page_logging_enabled=true
dbms.track_query_cpu_time=true
dbms.track_query_allocation=true
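As a quick sanity check before collecting a log, the settings above can be verified with a small script. This is only a sketch: the helper names `parse_conf` and `missing_query_log_settings` are hypothetical, and it assumes the configuration is available as plain `key=value` text.

```python
# Settings from the slide that should all be "true" for a complete query log.
QUERY_LOG_KEYS = [
    "dbms.logs.query.enabled",
    "dbms.logs.query.parameter_logging_enabled",
    "dbms.logs.query.time_logging_enabled",
    "dbms.logs.query.allocation_logging_enabled",
    "dbms.logs.query.page_logging_enabled",
    "dbms.track_query_cpu_time",
    "dbms.track_query_allocation",
]


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_query_log_settings(text):
    """Return the query-log settings that are absent or not set to true."""
    settings = parse_conf(text)
    return [k for k in QUERY_LOG_KEYS if settings.get(k, "").lower() != "true"]


example = "dbms.logs.query.enabled=true\n# a comment\ndbms.logs.query.threshold=100ms\n"
print(missing_query_log_settings(example))
```

The threshold is deliberately not checked here, because a suitable value depends on the workload.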
Query Log Analyzer: Query Analysis
Query Log Analyzer: Query Log: Filter
Query Log Analyzer: Query Log: Highlight
Query Log Analyzer: Query Timeline
Cypher Query Processing
Cypher Planning: Query String > Parse > Logical Plan > Physical Execution Plan (uses db-statistics)
Cypher Execution: Execute Physical Plan in Cypher Runtime
Query Plan Cache: Query String > Physical Execution Plan (from cache) > Execute Physical Plan in Cypher Runtime
Use query parameters!
Use repeatable statements!
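Why parameters and repeatable statements matter for the plan cache can be illustrated with a toy model. The class below is a simplification that assumes the cache is keyed on the exact query text. It is not Neo4j's implementation, and `ToyPlanCache` is a made-up name.

```python
class ToyPlanCache:
    """Toy plan cache keyed on the exact query text (illustration only)."""

    def __init__(self):
        self.plans = {}
        self.hits = 0
        self.misses = 0

    def plan_for(self, query_text):
        if query_text in self.plans:
            self.hits += 1          # plan reused, no planning needed
        else:
            self.misses += 1        # new text: parse and plan, then cache
            self.plans[query_text] = "plan for: " + query_text
        return self.plans[query_text]


names = ["John Smith", "Jane Doe", "John Smith", "Ann Lee"]

# Literal values baked into the text: every new name is a new query string.
literal_cache = ToyPlanCache()
for name in names:
    literal_cache.plan_for(
        'MATCH (ah:AccountHolder) WHERE ah.fullName = "%s" RETURN ah' % name
    )

# Parameterised text: the value travels separately, the text never changes.
param_cache = ToyPlanCache()
for name in names:
    param_cache.plan_for(
        "MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName RETURN ah"
    )

print(literal_cache.misses, literal_cache.hits)  # 3 1
print(param_cache.misses, param_cache.hits)      # 1 3
```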
Query Log Analyzer
Cypher Execution
Cypher Planning
Query Load
MATCH p=(ah:AccountHolder {fullName: $accountName})-[:HAS_BANKACCOUNT]->(ba)-[:SEND*2..16]->()
WITH p, [x IN nodes(p) WHERE x:BankAccount] AS mts
UNWIND mts AS mt
MATCH p2=(mt)-[:FROM]->()-[:IN_COUNTRY]->()
RETURN p, p2 SKIP 0 LIMIT 1000
Cypher Planning
● Parameter Usage
  ○ Check the tool header
  ○ Check for parameter usage in your queries
● Planning time
With parameters: 1775 queries analysed, 302 distinct queries found.
MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName
...
RETURN ah
Without parameters: 1775 queries analysed, 1775 distinct queries found.
MATCH (ah:AccountHolder) WHERE ah.fullName = "John Smith"
...
RETURN ah
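The gap between the two "distinct queries" counts above can be estimated for any list of query strings by replacing literal values with a placeholder and counting again. The following is a rough sketch, not what the Query Log Analyzer itself does. The regular expressions are simplistic assumptions and will not cover every Cypher literal.

```python
import re

# Quoted strings (single or double quotes, with backslash escapes).
_STRING = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")
# Stand-alone numbers; digits inside identifiers or ranges like *2..16 are left alone.
_NUMBER = re.compile(r"(?<![\w$.])\d+(?:\.\d+)?(?![\w.])")


def normalise(query):
    """Replace literal values by $param and collapse whitespace."""
    query = _STRING.sub("$param", query)
    query = _NUMBER.sub("$param", query)
    return " ".join(query.split())


def count_distinct(queries):
    """Return (distinct as written, distinct after replacing literals)."""
    return len(set(queries)), len({normalise(q) for q in queries})


queries = [
    'MATCH (ah:AccountHolder) WHERE ah.fullName = "John Smith" RETURN ah',
    'MATCH (ah:AccountHolder) WHERE ah.fullName = "Jane Doe" RETURN ah',
    "MATCH (b:Book {id: 1}) RETURN b",
    "MATCH (b:Book {id: 2}) RETURN b",
]
print(count_distinct(queries))  # (4, 2)
```

A large difference between the two numbers suggests that many queries are candidates for parameters.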
Cypher Execution
● Page Cache (data cache)
● Waiting for Locks
● Memory Footprint
Cypher Execution
24 % : read from Cache
76 % : read from Disk
● Locking
● Concurrent Load
● Big Result Sets
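The cache versus disk percentages on this slide can be derived from the page hits and page faults that the query log records when page logging is enabled. A minimal sketch, assuming a page fault means the page had to be read from disk; the function name is hypothetical.

```python
def page_cache_hit_ratio(page_hits, page_faults):
    """Percentage of page accesses served from the page cache.

    Returns None when the query touched no pages at all.
    """
    total = page_hits + page_faults
    if total == 0:
        return None
    return 100.0 * page_hits / total


# 24 hits and 76 faults reproduce the split shown on the slide.
print(page_cache_hit_ratio(24, 76))  # 24.0
```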
Query Load
Query Tuning Tips
Query Tuning
Query Tuning: Use Explain and Profile
Things to check:
● Index usage
● Eager
● NodeByLabelScan
● AllNodesScan
Query Tuning: Avoid Cartesian Products
…
OPTIONAL MATCH
OPTIONAL MATCH
OPTIONAL MATCH
...

MATCH (a), (b), (c)
RETURN a, b, c

…
UNWIND arrA AS a
UNWIND arrB AS b
UNWIND arrC AS c
...

Use WITH and COLLECT and DISTINCT to reduce the intermediate results.
Use Pattern Comprehension when applicable:
MATCH (a)
RETURN {
  a: a,
  blist: [ (a)-->(b) | {b: b, clist: [ (b)-->(c) | c ]} ],
  dlist: [ (a)-->(d) | {d: d, elist: [ (d)-->(e) | e ]} ],
  flist: [ (a)-->(f) | f ]
}
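How quickly a Cartesian product grows can be shown with plain lists. The sketch below mimics three consecutive UNWINDs with `itertools.product`. The list contents and the helper name `unwind_all` are made up for illustration.

```python
from itertools import product


def unwind_all(*lists):
    """Rows produced by one UNWIND per list, back to back: every combination."""
    return list(product(*lists))


arr_a = ["a1", "a2", "a3"]
arr_b = ["b1", "b2", "b3", "b4"]
arr_c = ["c1", "c2"]

rows = unwind_all(arr_a, arr_b, arr_c)
print(len(rows))  # 3 * 4 * 2 = 24 rows

# Keeping the lists collected leaves a single row that carries all nine values.
collected_row = {"a": arr_a, "b": arr_b, "c": arr_c}
print(sum(len(v) for v in collected_row.values()))  # 9 values in 1 row
```

The row count is the product of the list sizes, which is why collecting early or using pattern comprehension keeps the intermediate result small.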
Query Tuning: Reduce the query working set as soon as possible
● Can I move a DISTINCT to an earlier point in the query?
● Can I move a LIMIT to an earlier point in the query?
● Can I use COLLECT at points in the query to reduce the number of rows to be processed?
Query Tuning: Query Execution
● Try to send ‘repeatable’ statements
MERGE (author1:Author {id: 1})
MERGE (author2:Author {id: 2})
...
MERGE (book1:Book {title: "title 1"})
MERGE (book2:Book {title: "title-2"})
...
MERGE (author1)-[:WROTE]->(book1)
MERGE (author2)-[:WROTE]->(book2)
...

MERGE (author:Author {id: $authorId })
MERGE (book:Book {title: $bookTitle })
MERGE (author)-[:WROTE]->(book)
Query Tuning: Query Execution
● Reduce the number of statements you send to Neo4j by using 'batch' statements
UNWIND $inputList AS row
MERGE (author:Author {id: row.authorId })
MERGE (book:Book {title: row.bookTitle })
MERGE (author)-[:WROTE]->(book)
For every 100 entries in the list of authors and books, send one statement to Neo4j:
{ inputList : [ { authorId : 1, bookTitle : "title1" } , { authorId : 2, bookTitle : "title2" } ,...] }

MERGE (author:Author {id: $authorId })
MERGE (book:Book {title: $bookTitle })
MERGE (author)-[:WROTE]->(book)
For every entry in the list of authors and books, send one statement to Neo4j:
{ authorId : 1, bookTitle : "title1" }
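On the client side, the batch approach comes down to cutting the input list into chunks and sending each chunk as the `inputList` parameter. A minimal sketch: the `batches` helper and the generated sample data are assumptions, and the actual driver call is left out.

```python
BATCH_STATEMENT = """
UNWIND $inputList AS row
MERGE (author:Author {id: row.authorId})
MERGE (book:Book {title: row.bookTitle})
MERGE (author)-[:WROTE]->(book)
"""


def batches(entries, batch_size=100):
    """Yield one parameter map per batch of at most batch_size entries."""
    for start in range(0, len(entries), batch_size):
        yield {"inputList": entries[start:start + batch_size]}


# Made-up sample data: 250 author/book pairs.
entries = [{"authorId": i, "bookTitle": "title%d" % i} for i in range(1, 251)]

params = list(batches(entries))
print(len(params))                         # 3 statements instead of 250
print([len(p["inputList"]) for p in params])  # [100, 100, 50]

# Each parameter map would then be sent together with BATCH_STATEMENT,
# for example through a driver session; that part is not shown here.
```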
Query Tuning: Query Execution
● Use apoc.periodic.iterate with the config parameter iterateList : true !
CALL apoc.periodic.iterate(
  'CALL apoc.load.jdbc("mydb","SELECT authorId, bookTitle FROM AuthorBooks") YIELD row RETURN row',
  'MERGE (author:Author {id: row.authorId }) MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book)',
  {batchSize : 100, iterateList: true })
● Kettle also uses this 'batch' approach
Tool Usage
● The Query Log Analyzer is meant to be used during development and testing!
● When you have only a command prompt available on a Neo4j server, you can also use the following tool to do a quick analysis of the query.log file:
https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/
This tool will list the top 10 most expensive queries based upon planning, CPU and waiting time.
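The idea of ranking log entries by planning, CPU or waiting time can be sketched in a few lines. This is not the script from the linked article. It assumes each log entry contains a fragment of the form `<elapsed> ms: (planning: P, cpu: C, waiting: W)`, and the sample lines are invented, so the pattern should be checked against the actual query.log before use.

```python
import re

# ASSUMED timing fragment in each query.log entry.
TIMING = re.compile(r"(\d+) ms: \(planning: (\d+), cpu: (\d+), waiting: (\d+)\)")
FIELDS = ("elapsed", "planning", "cpu", "waiting")


def top_queries(lines, key="elapsed", n=10):
    """Return the n log lines with the highest value for the chosen timing."""
    entries = []
    for line in lines:
        match = TIMING.search(line)
        if match:
            timing = dict(zip(FIELDS, map(int, match.groups())))
            entries.append((timing[key], line.rstrip("\n")))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return entries[:n]


# Invented sample lines, for illustration only.
sample = [
    "INFO 120 ms: (planning: 100, cpu: 15, waiting: 0) - MATCH (a) RETURN a",
    "INFO 900 ms: (planning: 2, cpu: 300, waiting: 550) - MATCH (b) RETURN b",
    "a line without timing information",
]
print(top_queries(sample, key="planning", n=1)[0][0])  # 100
print(top_queries(sample, key="waiting", n=1)[0][0])   # 550
```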
Next Version
● Supports Neo4j version 4 (multi db)
● List Current queries
● List Query Stats (version 3.5.4 and higher)
● Explain Plan
Still under development
Multi db support
preview, still under development
Current Queries
preview, still under development
Query Stats
preview, still under development
Explain Plan
preview, still under development
Useful links
Introducing the Query Log Analyzer
https://medium.com/neo4j/meet-the-query-log-analyzer-30b3eb4b1d6
Cypher Query Optimisations
https://medium.com/neo4j/cypher-query-optimisations-fe0539ce2e5c
Script to get the top 10 most expensive queries from the command line
https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/
Hunger Games Questions for "Improve Query Performance with Query Log Analyzer"
1. Easy: What does Avg Waiting stand for?
   a. Waiting to execute query
   b. Waiting to execute query + waiting for locks
   c. Waiting for locks
2. Medium: What is the correct order of steps in Cypher query processing?
   a. Query Text > Logical Plan > Parse > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
   b. Query Text > Parse > Logical Plan > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
   c. Cache > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
3. Hard: What is the name of the config parameter in apoc.periodic.iterate that makes batch updates possible?
Answer here: r.neo4j.com/hunger-games
Q & A
Query Log Analyzer install
https://install.graphapp.io/