Yannis Papakonstantinou: SQL++ Query Language for Semi-Structured Data
SQL++ Query Language
SQL++ by Kian Win Ong, Yannis Papakonstantinou, Romain Vernoux
Use of SQL++ in virtual database and live analytics systems by Yupeng Fu, Michalis Petropoulos, Gaurav Saxena, Alok Singh and others…
with the support of National Science Foundation, Google, Informatica
you will have it all: JSON + SQL declarative analytics + top scalable performance

Outline
• What is SQL++?
• How we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
• The N1QL connection
  • The SQL++ industrial query language twin

SQL++ semi-structured data model: superset of JSON and SQL
Declarative query language: aggressively backwards-compatible with SQL
Formal Semantics: adjusting tuple calculus to the richness of JSON
Automatic Query Optimization & Execution: FORWARD virtual database query plans based on SQL++ algebra
Automatic Incremental View Maintenance

SQL++ Data Model (also JSON++ data model)
A superset of SQL and JSON
• JSON + bags + enriched values + …
• SQL + arrays + nesting + heterogeneity + …

  {
    location: 'Alpine',
    readings: [
      {
        time: timestamp('2014-03-12T20:00:00'),
        ozone: 0.035,
        no2: 0.0050
      },
      {
        time: timestamp('2014-03-12T22:00:00'),
        ozone: 'm',
        co: 0.4
      } ]
  }

• Collection nested inside a tuple
• Heterogeneous collection elements
• Permissive schema or schema-less
• Top-level value may not be a collection

The SQL++ Extensions to SQL
1. Extended SQL syntax/semantics for heterogeneity, complex values
  • Nesting and Unnesting for creating/accessing nested collections
  • Element variables (as opposed to tuple variables)
  • Multiple forms of navigation
    • corresponding to the multiple complex types
2. Queries that input/output any JSON type (not only collections of tuples)
3. Virtual database (unifying database) features
  • Establish universal access to SQL and NoSQL databases
  • Are important to survey of SQL and NoSQL databases

FORWARD Virtual Database: SQL++ based Virtual Database
[Architecture diagram. Labels: Client; Federated SQL++ Queries / SQL++ Results; SQL++ Query Processor; SQL++ Virtual Views of Sources (one SQL++ Virtual View per source); SQL Wrapper, NewSQL Wrapper, NoSQL Wrapper, SQL-on-Hadoop Wrapper, Java Objects Wrapper; Native queries / Native results; SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects.]
SQL++ & FORWARD predecessor:
• OEM semi-structured model
• Virtual database work since 90s (>7000 citations)
  => Enosys XML virtual database
  => Sold via BEA Aqualogic (2001)

FORWARD Integration Database: SQL++ Virtual and/or Materialized Views
[Architecture diagram. Labels: Client; SQL++ Queries / SQL++ Results; SQL++ Query Processor; SQL++ Integrated Views (two SQL++ Integrated Views and one SQL++ Added Value View); SQL++ View Definitions; five SQL++ Virtual Views; SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects.]

Use Case 1: Hide the ++ source features behind SQL views
BI Tool / Report Writer and FORWARD Virtual Database exchange: SQL query, SQL result
FORWARD Virtual Database and Couchbase exchange: Couchbase query, Couchbase results

Couchbase:
  measurements: {{
    { sid: 2, temp: [70.1, 70.2] },
    { sid: 1, temp: [71.0] }
  }}

SQL View:
  measurements:
  sid  temp
  2    70.1
  2    70.2
  1    71.0

or its SQL++ equivalent:
  measurements: {{
    { sid: 2, temp: 70.1 },
    { sid: 2, temp: 70.2 },
    { sid: 1, temp: 71.0 }
  }}
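
The flattening behind Use Case 1 can be sketched in a few lines of Python. This is only an illustration of the unnesting idea, not FORWARD's implementation: the function name flatten_measurements is hypothetical, and the data is the Couchbase example from the slide.

```python
from typing import Any

# Nested source collection, copied from the slide's Couchbase example.
measurements = [
    {"sid": 2, "temp": [70.1, 70.2]},
    {"sid": 1, "temp": [71.0]},
]


def flatten_measurements(docs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Unnest each document's temp array into one flat (sid, temp) row per reading."""
    rows = []
    for doc in docs:
        for reading in doc["temp"]:
            rows.append({"sid": doc["sid"], "temp": reading})
    return rows


flat_view = flatten_measurements(measurements)
for row in flat_view:
    print(row["sid"], row["temp"])
```

Each output row has only scalar attributes, which is what lets a SQL-only BI tool consume the view.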

Use Case 2: Capabilities and Semantics-Aware Pushdown
"Sensors that recorded a temperature below 50"

MongoDB (exposed through the MongoDB Wrapper as an identical Virtual View in the MongoDB Virtual Database):
  measurements: {{
    { sid: 2, temp: 70.1 },
    { sid: 2, temp: 49.2 },
    { sid: 1, temp: null }
  }}

SQL++ query (SQL++ semantics for m.temp < 50):
  SELECT DISTINCT m.sid
  FROM   measurements AS m
  WHERE  m.temp < 50

MongoDB query (MongoDB semantics for temp $lt 50):
  ...
  { $match: {temp: {$lt: 50}}}, { $match: {temp: {$ne: null}}} ...

The MongoDB Virtual Database receives the SQL++ query and returns the SQL++ result; the wrapper sends the MongoDB query to MongoDB and receives the MongoDB results.

Push-Down Challenges: Limited capabilities & semantic variations
• Sources like SQL and MongoDB cannot execute the entirety of SQL++
• How to efficiently push down computation?
• How to simulate incompatible semantics/missing features?
Issues automatically handled by the query processor.
Plenty of query rewriting problems, including novel ones on semi-structured operators

SQL++ captures Semantics Variations
• Semantics of "less-than" comparison are different across sources: <sql, <mongodb, etc.
• Config Parameters to capture these variations

  @lt { MongoDB } ( x < y )

  @lt {
    complex:       boolean,
    type_mismatch: false,
    null_lt_null:  false,
    null_lt_value: boolean,
    ...
  } ( x < y )

Semantics Variations
In NoSQL, NewSQL, SQL-on-Hadoop variation of semantics for:
• Paths
• Equality
• Comparisons
And all the operators that use them:
• Selections
• Grouping
• Ordering
• Set operations
Each of these features has a set of config parameters in SQL++

Use Case 3: Integrated Query Processing
"Sensors in a given area that recorded a low temperature?"

PostgreSQL:
  sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

MongoDB:
  measurements: [
    { sid: 2, temp: 70.1 },
    { sid: 2, temp: 49.2 },
    { sid: 1, temp: null } ]

SQL Virtual Database (through the SQL Wrapper):
  sensors: {{
    { id: 1, lat: 32.8, lng: -117.1 },
    { id: 2, lat: 32.7, lng: -117.2 } }}

MongoDB Virtual Database (through the MongoDB Wrapper):
  measurements: [
    { sid: 2, temp: 70.1 },
    { sid: 2, temp: 49.2 },
    { sid: 1, temp: null } ]

SQL++ query submitted to the SQL++ Query Processor:
  SELECT s.lat, s.lng, m.temp
  FROM   sensors AS s
  JOIN   measurements AS m
         ON s.id = m.sid
  WHERE  (s.lat > 32.6 AND s.lat < 32.9
         AND s.lng < -117.0 AND s.lng > -117.3)
         AND m.temp < 50

The SQL++ Query Processor sends SQL++ to each virtual database; the SQL Wrapper sends SQL to PostgreSQL and the MongoDB Wrapper sends a MongoDB query to MongoDB:
  db.measurements.aggregate([
    { $match: {temp: {$lt: 50}} },
    { $match: {temp: {$ne: null}} } ])
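
As a rough illustration of Use Case 3, the Python sketch below mimics what an integrated plan has to compute: a filter on each source, then a join in the middleware. It is not the FORWARD query processor. The function names are hypothetical, the two lists stand in for the PostgreSQL and MongoDB sources, and the longitude bounds are assumed to mean -117.3 < lng < -117.0.

```python
# In-memory stand-ins for the two sources shown on the slide.
sensors = [
    {"id": 1, "lat": 32.8, "lng": -117.1},
    {"id": 2, "lat": 32.7, "lng": -117.2},
]
measurements = [
    {"sid": 2, "temp": 70.1},
    {"sid": 2, "temp": 49.2},
    {"sid": 1, "temp": None},
]


def sensors_in_area(rows, lat_bounds, lng_bounds):
    """Selection that could be pushed to the SQL source (strict bounds)."""
    (lat_lo, lat_hi), (lng_lo, lng_hi) = lat_bounds, lng_bounds
    return [r for r in rows
            if lat_lo < r["lat"] < lat_hi and lng_lo < r["lng"] < lng_hi]


def low_temperatures(docs, threshold):
    """Selection that could be pushed to the document source.

    Null temperatures are excluded explicitly, mirroring the extra
    null-excluding $match stage in the slide's MongoDB query.
    """
    return [d for d in docs if d["temp"] is not None and d["temp"] < threshold]


def join_on_sensor(sensor_rows, measurement_docs):
    """Join performed in the middleware on s.id = m.sid."""
    by_id = {s["id"]: s for s in sensor_rows}
    return [
        {"lat": by_id[m["sid"]]["lat"], "lng": by_id[m["sid"]]["lng"], "temp": m["temp"]}
        for m in measurement_docs
        if m["sid"] in by_id
    ]


result = join_on_sensor(
    sensors_in_area(sensors, (32.6, 32.9), (-117.3, -117.0)),
    low_temperatures(measurements, 50),
)
print(result)
```

Pushing both selections to the sources before joining keeps the data shipped to the middleware small, which is the point of capability-aware pushdown.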

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer
• The Incremental View Maintenance functionality:
  • SQL++ (Materialized) View Definition
• Stream of inserts, deletes, updates on base data
• The Incremental View Maintenance module
  • Automatically and efficiently updates the materialized view to reflect the stream of changes
• SQL++ can also enable automatic Incremental View Maintenance!
  • With attention to replication of data in views
  • Opportunities by keys
E.g., Couchbase has a JSON web log showing
  {user, list of displayed products}
and produces materialized views
  {product category, count, [{product, count}]}
[Diagram. Labels: DB Before, Stream, DB After; View Before, IVM Stream, View After.]
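
A minimal Python sketch of the incremental idea for the web-log example follows. It is an assumption-laden illustration, not the FORWARD IVM module: the class CategoryCountView, the CATEGORY_OF lookup, and the product names are all hypothetical, and an update is modeled as a delete followed by an insert.

```python
from collections import defaultdict

# Hypothetical product-to-category lookup; the slides do not give one.
CATEGORY_OF = {"laptop": "computers", "mouse": "computers", "novel": "books"}


class CategoryCountView:
    """Materialized view {category, count, [{product, count}]} kept up to date
    by applying each change from the stream instead of recomputing."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def apply(self, op, log_entry):
        """Apply one 'insert' or 'delete' of a web-log entry."""
        if op not in ("insert", "delete"):
            raise ValueError(f"unsupported operation: {op!r}")
        delta = 1 if op == "insert" else -1
        for product in log_entry["products"]:
            per_product = self.counts[CATEGORY_OF[product]]
            per_product[product] += delta
            if per_product[product] == 0:
                del per_product[product]

    def snapshot(self):
        """Current view contents, sorted for deterministic output."""
        return [
            {
                "category": category,
                "count": sum(per_product.values()),
                "products": [{"product": p, "count": c}
                             for p, c in sorted(per_product.items())],
            }
            for category, per_product in sorted(self.counts.items())
            if per_product
        ]


view = CategoryCountView()
view.apply("insert", {"user": "u1", "products": ["laptop", "mouse"]})
view.apply("insert", {"user": "u2", "products": ["laptop", "novel"]})
print(view.snapshot())
```

Each change touches only the counters of the products it mentions, so the cost of maintenance depends on the size of the change rather than the size of the log.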

Custom dashboards, interactive pages & apps
• The data models of visualization components (e.g. Google Maps) can be nicely captured with JSON models
• The pages are SQL++ (JSON) views!
  • Mashups of the components views
• SQL++ feeds and incrementally updates the page views
Use case
• From data to visualization with just SQL++ & markup
  • Ajax/Javascript visuals with no Ajax/Javascript mess
  • How to easily connect to today's JS libraries
• Custom Ajax visualizations & interfaces for IT personnel

(part of) the Google Map model
  <% unit google.maps.Maps %>
  {
    markers: [ {
      position: {
        latitude:  number,
        longitude: number
      }
      ...
    } ]
  }
  <% end unit %>

How to read the SQL++ syntax and semantics
Environment ⊢ Query → Result

Environment:
  Γ = ⟨ readings : [ 1.3, 0.7, 0.3, 0.8 ] ⟩

Query, clause by clause, with the bindings each clause outputs:
  FROM readings AS r
    B    = {{ ⟨ r : 1.3 ⟩, ⟨ r : 0.7 ⟩, ⟨ r : 0.3 ⟩, ⟨ r : 0.8 ⟩ }}
  WHERE r < 1.0
    B'   = {{ ⟨ r : 0.7 ⟩, ⟨ r : 0.3 ⟩, ⟨ r : 0.8 ⟩ }}
  ORDER BY r DESC
    B''  = [ ⟨ r : 0.8 ⟩, ⟨ r : 0.7 ⟩, ⟨ r : 0.3 ⟩ ]
  LIMIT 2
    B''' = [ ⟨ r : 0.8 ⟩, ⟨ r : 0.7 ⟩ ]
  SELECT r AS co

Result:
  [ { co: 0.8 }, { co: 0.7 } ]
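
The clause-by-clause reading can be mimicked in Python, with each clause as a function from a collection of binding tuples to a collection of binding tuples. This is a toy sketch of the idea only; the helper names are hypothetical and Python dicts and lists stand in for binding tuples, bags and arrays.

```python
# Environment: variable name -> value, as on the slide.
env = {"readings": [1.3, 0.7, 0.3, 0.8]}


def from_clause(environment, collection, var):
    """FROM collection AS var: one binding tuple per element."""
    return [{var: value} for value in environment[collection]]


def where_clause(bindings, predicate):
    """WHERE: keep the bindings that satisfy the predicate."""
    return [b for b in bindings if predicate(b)]


def order_by_desc(bindings, key):
    """ORDER BY ... DESC: turn the bag of bindings into an ordered list."""
    return sorted(bindings, key=key, reverse=True)


def limit_clause(bindings, n):
    """LIMIT n: keep the first n bindings."""
    return bindings[:n]


def select_clause(bindings, var, alias):
    """SELECT var AS alias: build one output tuple per binding."""
    return [{alias: b[var]} for b in bindings]


b0 = from_clause(env, "readings", "r")
b1 = where_clause(b0, lambda b: b["r"] < 1.0)
b2 = order_by_desc(b1, key=lambda b: b["r"])
b3 = limit_clause(b2, 2)
result = select_clause(b3, "r", "co")
print(result)
```

Note that r is bound to a scalar, not a tuple: this is the "element variable" idea, since FROM ranges over the elements of any collection.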

Formal specification paper:
http://arxiv.org/abs/1405.3631
• Data model (Sections 3, 3.1)
• Element variables (Section 4.1)
• SELECT-FROM-WHERE queries as element variable generators

Environment ⊢ Query → Result

Environment:
  Γ0 = ⟨ sensors : {{ {sensor: 1},
                      {sensor: 2} }},
         logs    : {{ {sensor: 1, co: 0.4},
                      {sensor: 1, co: 0.2},
                      {sensor: 2, co: 0.3} }} ⟩

Query:
  FROM sensors AS s
    b1 = ⟨ s : {sensor: 1} ⟩
    b2 = ⟨ s : {sensor: 2} ⟩
  SELECT TUPLE s.sensor AS sensor,
               ( SELECT TUPLE l.co AS co
                 FROM   logs AS l
                 WHERE  l.sensor = s.sensor
               ) AS readings

Result:
  {{ { sensor: 1, readings: {{ {co: 0.4}, {co: 0.2} }} },
     { sensor: 2, readings: {{ {co: 0.3} }} } }}

Nested query evaluated for binding b1:
  Environment:
    Γ1 = ⟨ s       : {sensor: 1},
           sensors : {{ {sensor: 1},
                        {sensor: 2} }},
           logs    : {{ {sensor: 1, co: 0.4},
                        {sensor: 1, co: 0.2},
                        {sensor: 2, co: 0.3} }} ⟩
  FROM logs AS l
    b'1 = ⟨ l : {sensor: 1, co: 0.4} ⟩
    b'2 = ⟨ l : {sensor: 1, co: 0.2} ⟩
    b'3 = ⟨ l : {sensor: 2, co: 0.3} ⟩
  WHERE l.sensor = s.sensor
    b''1 = ⟨ l : {sensor: 1, co: 0.4} ⟩
    b''2 = ⟨ l : {sensor: 1, co: 0.2} ⟩
  SELECT TUPLE l.co AS co
  Result:
    {{ {co: 0.4}, {co: 0.2} }}
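
The nested-query evaluation can be sketched in Python as well: the inner query is simply evaluated in an environment extended with the outer binding. This is an illustrative toy under that reading of the slide, with hypothetical function names; Python lists stand in for bags.

```python
# Γ0 from the slide.
env0 = {
    "sensors": [{"sensor": 1}, {"sensor": 2}],
    "logs": [
        {"sensor": 1, "co": 0.4},
        {"sensor": 1, "co": 0.2},
        {"sensor": 2, "co": 0.3},
    ],
}


def inner_query(env):
    """SELECT TUPLE l.co AS co FROM logs AS l WHERE l.sensor = s.sensor.

    env must already bind s; that binding comes from the outer FROM clause.
    """
    bindings = [{"l": log} for log in env["logs"]]
    bindings = [b for b in bindings if b["l"]["sensor"] == env["s"]["sensor"]]
    return [{"co": b["l"]["co"]} for b in bindings]


def outer_query(env):
    """FROM sensors AS s SELECT TUPLE s.sensor AS sensor, (inner) AS readings."""
    result = []
    for binding in [{"s": sensor} for sensor in env["sensors"]]:
        extended_env = {**env, **binding}  # corresponds to Γ1 for the first binding
        result.append({
            "sensor": binding["s"]["sensor"],
            "readings": inner_query(extended_env),
        })
    return result


nested = outer_query(env0)
print(nested)
```

Because the subquery is just another expression evaluated per binding, the same mechanism produces nested collections in the SELECT clause without any special construct.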

Surveyed Databases
[Figure: surveyed systems grouped as SQL-on-Hadoop, SQL & NewSQL, NoSQL and Others. Legible labels: Pig, Jaql, CQL, N1QL, AQL, MongoDB driver.]

SQL++ Removes Superficial Differences
• SQL++ covers SQL, N1QL and QL research prototypes (e.g., UCI's ASTERIX)
• Removing the current "Tower of Babel" effect
• Providing formal syntax and semantics

SQL:
  SELECT AVG(temp) AS tavg
  FROM readings
  GROUP BY sid

MongoDB:
  db.readings.aggregate(
    {$group: {_id: "$sid",
              tavg: {$avg: "$temp"}}})

Jaql:
  readings -> group by sid = $.sid into { tavg: avg($.temp) };

Pig:
  a = LOAD 'readings' AS (sid:int, temp:float);
  b = GROUP a BY sid;
  c = FOREACH b GENERATE AVG(temp);
  DUMP c;

JSONiq:
  for $r in collection("readings") group by $r.sid return { tavg: avg($r.temp) }

Surveyed features
15 feature matrices (1-11 dimensions each) classifying:
• Data values
• Schemas
• Access and construct nested data
• Missing information
• Equality semantics
• Ordering semantics
• Aggregation
• Joins
• Set operators
• Extensibility

Outline
• What is SQL++?
• How we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
  • Methodology
  • Example 1: data model (data values)
  • Example 2: query language (SELECT clause)
  • Example 3: semantics (path)
  • Example 4: semantics (equality function)
• The N1QL connection
  • The SQL++ industrial query language twin

Methodology
For each feature:
1. A formal definition of the feature in SQL++
2. A SQL++ example
3. A feature matrix that classifies each dimension of the feature
4. A discussion of the results, partial support and unexpected behaviors
All the results are empirically validated

Example: Data values
1. SQL++ example:

  {
    location: 'Alpine',
    readings: [
      {
        time: timestamp('2014-03-12T20:00:00'),
        ozone: 0.035,
        no2: 0.0050
      },
      {
        time: timestamp('2014-03-12T22:00:00'),
        ozone: 'm',
        co: 0.4
      } ]
  }

2. SQL++ BNF for values: [figure]
3. Feature matrix:

  Language   Composability (top-level values)  Heterogeneity  Arrays   Bags     Sets     Maps     Tuples  Primitives
  Hive       Bag of tuples                     No             Yes      No       No       Partial  Yes     Yes
  Jaql       Any Value                         Yes            Yes      No       No       No       Yes     Yes
  Pig        Bag of tuples                     Partial        No       Partial  No       Partial  Yes     Yes
  CQL        Bag of tuples                     No             Partial  No       Partial  Partial  No      Yes
  JSONiq     Any Value                         Yes            Yes      No       No       No       Yes     Yes
  MongoDB    Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
  N1QL       Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
  SQL        Bag of tuples                     No             No       No       No       No       No      Yes
  AQL        Any Value                         Yes            Yes      Yes      No       No       Yes     Yes
  BigQuery   Bag of tuples                     No             No       No       No       No       Yes     Yes
  MongoJDBC  Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
  SQL++      Any Value                         Yes            Yes      Yes      Partial  Yes      Yes     Yes

4. Discussion of the results:
• Column-by-column comparison
• Partial support (65k scalar elements)
• Identify clusters (who supports JSON?)

Example: SELECT clause
1. SQL++ example:
• Projecting nested collections: "Position and (nested) ozone readings of each sensor?"

  SELECT TUPLE s.lat, s.long,
         (SELECT r.ozone
          FROM   readings AS r
          WHERE  r.location = s.location)
  FROM   sensors AS s

• Projecting non-tuples: "Bag of all the (scalar) ozone readings?"

  SELECT ELEMENT ozone
  FROM readings

3. Feature matrix:

  Language   Projecting tuples containing nested collections  Projecting non-tuples
  Hive       Partial                                          No
  Jaql       Yes                                              Yes
  Pig        Partial                                          No
  CQL        No                                               No
  JSONiq     Yes                                              Yes
  MongoDB    Partial                                          Partial
  N1QL       Partial                                          Partial
  SQL        No                                               No
  AQL        Yes                                              Yes
  BigQuery   No                                               No
  MongoJDBC  No                                               No
  SQL++      Yes                                              Yes

4. Discussion of the results:
• Not well supported features
• 3 languages support them entirely (same cluster as for data values)

Config Parameters
We use config parameters to encompass and compare various semantics of a feature:
• Minimal number of independent dimensions
• 1 dimension = 1 config parameter
• SQL++ formalism parametrized by the config parameters
• Feature matrix classifies the values of each config parameter

Example: Paths
• Config parameters for tuple navigation:

  @tuple_nav {
    absent:        missing,
    type_mismatch: error,
  } ( x.y )

• The feature matrix classifies, for each language, the value of each config parameter:

  Language   Missing  Type mismatch
  Hive       Error    Error
  Jaql       Null     Error
  Pig        Error    Error
  CQL        Error    Error
  JSONiq     Missing  Missing
  MongoDB    Missing  Missing
  N1QL       Missing  Missing
  SQL        Error    Error
  AQL        Null     Error
  BigQuery   Error    Error
  MongoJDBC  Missing  Missing
  SQL++      @path    @path
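
One way to picture the two tuple-navigation parameters is a Python function whose behavior is chosen by arguments named after them. This is a sketch under assumptions, not the SQL++ formalism: the names tuple_nav, MISSING and NavigationError are hypothetical here, and Python's None stands in for null.

```python
class NavigationError(Exception):
    """Raised when a config parameter is set to 'error'."""


class _Missing:
    """Sentinel for the 'missing' value, distinct from null (None)."""

    def __repr__(self):
        return "MISSING"


MISSING = _Missing()


def _outcome(setting, message):
    """Translate a config setting into a returned value or a raised error."""
    if setting == "missing":
        return MISSING
    if setting == "null":
        return None
    if setting == "error":
        raise NavigationError(message)
    raise ValueError(f"unknown setting: {setting!r}")


def tuple_nav(x, attr, absent="missing", type_mismatch="error"):
    """Evaluate x.attr under the given config parameters.

    absent:        outcome when x is a tuple without the attribute
    type_mismatch: outcome when x is not a tuple at all
    """
    if not isinstance(x, dict):
        return _outcome(type_mismatch, f"cannot navigate into {x!r}")
    if attr not in x:
        return _outcome(absent, f"no attribute {attr!r}")
    return x[attr]


print(tuple_nav({"co": 0.4}, "co"))                    # attribute present
print(tuple_nav({"co": 0.4}, "no2"))                   # absent -> MISSING
print(tuple_nav({"co": 0.4}, "no2", absent="null"))    # absent -> None
```

Calling tuple_nav with absent="error" and type_mismatch="error" would correspond to a row such as SQL's in the matrix, while "missing" for both corresponds to rows such as MongoDB's or N1QL's.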

Example: Equality Function
• Config parameters for equality:

  @eq {
    complex:            error,
    type_mismatch:      false,
    null_eq_null:       null,
    null_eq_value:      null,
    missing_eq_missing: missing,
    missing_eq_value:   missing,
    missing_eq_null:    missing
  } ( x = y )

• The feature matrix classifies, for each language, the value of each config parameter:

  Language                  Complex          Type mismatch  Null = Null  Null = Value  Missing = Missing  Missing = Value  Missing = Null
  Hive (=, <=>)             Err, Err         Err, Err       Null, True   Null, False   N/A                N/A              N/A
  Jaql                      Boolean          Null           Null         Null          N/A                N/A              N/A
  Pig                       Boolean partial  Err            Null         Null          N/A                N/A              N/A
  CQL                       Err              Err            Err          False/Null    N/A                N/A              N/A
  JSONiq (=, deep-equal())  Err, Boolean     Err, False     True, True   False, False  Missing, True      Missing, False   Missing, False
  MongoDB                   Boolean          False          True         False         True               False            False
  N1QL                      Boolean          False          Null         Null          Missing            Missing          Missing
  SQL                       N/A              Err            Null         Null          N/A                N/A              N/A
  AQL                       Err              Err            Null         Null          N/A                N/A              N/A
  BigQuery                  Err              Err            Null         Null          N/A                N/A              N/A
  MongoJDBC                 Boolean          False          True         False         N/A                False            False
  SQL++                     @equal           @equal         @equal       @equal        @equal             @equal           @equal

• No real cluster
• Some languages have multiple (incompatible) equality functions
• Some edge cases cannot happen due to other limitations (SQL has no complex values)
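
To make the equality parameters concrete, here is a small Python sketch of an equality function driven by a config dictionary. It is an illustration under assumptions rather than the formal definition: the names eq, EqualityError, SLIDE_CONFIG and MONGODB_LIKE are hypothetical, None stands in for null, the string "error" means "raise", "boolean" for complex values means a deep structural comparison, and the order in which the cases are checked is a choice made for this sketch.

```python
class EqualityError(Exception):
    """Raised when the applicable config parameter is 'error'."""


MISSING = object()  # sentinel for 'missing', distinct from null (None)

# Parameter values as listed on the slide.
SLIDE_CONFIG = {
    "complex": "error",
    "type_mismatch": False,
    "null_eq_null": None,
    "null_eq_value": None,
    "missing_eq_missing": MISSING,
    "missing_eq_value": MISSING,
    "missing_eq_null": MISSING,
}

# Values read off the MongoDB row of the feature matrix.
MONGODB_LIKE = {
    "complex": "boolean",
    "type_mismatch": False,
    "null_eq_null": True,
    "null_eq_value": False,
    "missing_eq_missing": True,
    "missing_eq_value": False,
    "missing_eq_null": False,
}


def _resolve(outcome):
    if outcome == "error":
        raise EqualityError("comparison not allowed under this config")
    return outcome


def _kind(value):
    """Coarse scalar type used to detect type mismatches."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return type(value).__name__


def eq(x, y, config=SLIDE_CONFIG):
    """Evaluate x = y under the given config parameters."""
    if x is MISSING or y is MISSING:
        if x is MISSING and y is MISSING:
            return _resolve(config["missing_eq_missing"])
        if x is None or y is None:
            return _resolve(config["missing_eq_null"])
        return _resolve(config["missing_eq_value"])
    if x is None or y is None:
        both_null = x is None and y is None
        return _resolve(config["null_eq_null" if both_null else "null_eq_value"])
    if isinstance(x, (dict, list)) or isinstance(y, (dict, list)):
        if config["complex"] == "boolean":
            return x == y
        return _resolve(config["complex"])
    if _kind(x) != _kind(y):
        return _resolve(config["type_mismatch"])
    return x == y


print(eq(1, 1), eq(1, "1"), eq(None, None))
print(eq(None, None, MONGODB_LIKE), eq([1], [1], MONGODB_LIKE))
```

Swapping the config dictionary changes the answer for the same pair of values, which is what a middleware layer has to account for when it pushes a comparison down to a source.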

How to use this survey?
• As a database user:
  • Understand the semantics of a (often underspecified) query language / be aware of the limitation of a database
• As a designer/architect of a database
  • Produce formal specification of your query language
  • Align semantics with SQL's
• As a database researcher
  • The results might change, but the survey methodology stays
• As a designer/architect of database middleware
  • Understand what capability variations need to be encapsulated and simulated

The survey shows:
• The marketing clusters do not correspond to real capabilities
• Limited capabilities: matrices are sparse and fragmented (more pressure on source-specific rewriters and distributor)

N1QL: the Industrial Twin of SQL++
• N1QL is by far the closest industrial database implementation of SQL++
• Alignment in principles
  • JSON + declarative
• The distance will further close in Release 4

The Future is Semi-Structured and Declarative
• Scalability
• Flexibility
• Automation
  • Logical/physical separation
• The primary operational and the secondary analytics application out of semistructured, declarative platforms