Yannis Papakonstantinou: SQL++ query language for semi-structured data


description

Speaker: Yannis Papakonstantinou

The lack of a unifying language across NoSQL, NewSQL and Big Data technologies presents many problems to practitioners building next-generation applications. So what are the key requirements for a next-generation query language for multi-structured data? SQL++ introduces a unifying language that integrates these new technologies while remaining fully backwards compatible with the widely adopted SQL. The accompanying 30-page research paper covers:

• SQL++ syntax and semantics
• A data model and query language expressiveness benchmark across NoSQL, NewSQL and Hadoop
• A detailed classification of N1QL and other Big Data languages across 15 feature categories

Transcript of Yannis Papakonstantinou: SQL++ query language for semi-structured data

Page 1: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Query Language

you will have it all: JSON + SQL declarative analytics + top scalable performance

SQL++ by Kian Win Ong, Yannis Papakonstantinou, Romain Vernoux

Use of SQL++ in virtual database and live analytics systems by Yupeng Fu, Michalis Petropoulos, Gaurav Saxena, Alok Singh and others…

with the support of National Science Foundation, Google, Informatica

Page 2: Yannis papakonstantinou   sql++ query language for semi-structured data

Outline

• What is SQL++?
• How do we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
• The N1QL connection
  • The SQL++ industrial query language twin

Page 3: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ semi-structured data model: superset of JSON and SQL

Declarative query language: aggressively backwards-compatible with SQL

Page 4: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ semi-structured data model: superset of JSON and SQL

Declarative query language: aggressively backwards-compatible with SQL

Formal semantics: adjusting tuple calculus to the richness of JSON

Automatic query optimization & execution: FORWARD virtual database query plans based on SQL++ algebra

Automatic incremental view maintenance

Page 5: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Data Model (also JSON++ data model)

A superset of SQL and JSON

• JSON + bags + enriched values + …
• SQL + arrays + nesting + heterogeneity + …

Page 6: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Data Model (also JSON++ data model)

A superset of SQL and JSON

• JSON + bags + enriched values + …
• SQL + arrays + nesting + heterogeneity + …

{
  location: 'Alpine',
  readings: [
    {
      time: timestamp('2014-03-12T20:00:00'),
      ozone: 0.035,
      no2: 0.0050
    },
    {
      time: timestamp('2014-03-12T22:00:00'),
      ozone: 'm',
      co: 0.4
    } ]
}

Page 7: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Data Model (also JSON++ data model)

A superset of SQL and JSON

• JSON + bags + enriched values + …
• SQL + arrays + nesting + heterogeneity + …

{
  location: 'Alpine',
  readings: [
    {
      time: timestamp('2014-03-12T20:00:00'),
      ozone: 0.035,
      no2: 0.0050
    },
    {
      time: timestamp('2014-03-12T22:00:00'),
      ozone: 'm',
      co: 0.4
    } ]
}

• Collection nested inside a tuple
• Heterogeneous collection elements
• Permissive schema or schema-less
• Top-level value may not be a collection

Page 10: Yannis papakonstantinou   sql++ query language for semi-structured data

The SQL++ Extensions to SQL

1. Extended SQL syntax/semantics for heterogeneity, complex values
   • Nesting and unnesting for creating/accessing nested collections
   • Element variables (as opposed to tuple variables)
   • Multiple forms of navigation
     • corresponding to the multiple complex types

2. Queries that input/output any JSON type (not only collections of tuples)

3. Virtual database (unifying database) features
   • Establish universal access to SQL and NoSQL databases
   • Are important to the survey of SQL and NoSQL databases

Page 11: Yannis papakonstantinou   sql++ query language for semi-structured data

Outline

• What is SQL++?
• How do we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
• The N1QL connection
  • The SQL++ industrial query language twin

Page 12: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ based Virtual Database

[Diagram labels: Client; FORWARD Virtual Database; SQL++ Query Processor]

Page 13: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ based Virtual Database

[Diagram labels: Client; FORWARD Virtual Database; SQL++ Query Processor; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

Page 14: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ based Virtual Database

[Diagram labels: Client; FORWARD Virtual Database; SQL++ Query Processor; SQL++ Virtual Views of Sources (one SQL++ Virtual View per source); wrappers: SQL Wrapper, NewSQL Wrapper, NoSQL Wrapper, SQL-on-Hadoop Wrapper, Java Objects Wrapper; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

Page 15: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ based Virtual Database

[Diagram labels: Client; Federated SQL++ Queries; FORWARD Virtual Database; SQL++ Query Processor; SQL++ Virtual Views of Sources (one SQL++ Virtual View per source); wrappers: SQL Wrapper, NewSQL Wrapper, NoSQL Wrapper, SQL-on-Hadoop Wrapper, Java Objects Wrapper; Native queries; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

Page 16: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ based Virtual Database

[Diagram labels: Client; Federated SQL++ Queries; SQL++ Results; FORWARD Virtual Database; SQL++ Query Processor; SQL++ Virtual Views of Sources (one SQL++ Virtual View per source); wrappers: SQL Wrapper, NewSQL Wrapper, NoSQL Wrapper, SQL-on-Hadoop Wrapper, Java Objects Wrapper; Native queries; Native results; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

Page 17: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ based Virtual Database

[Diagram labels: Client; Federated SQL++ Queries; SQL++ Results; FORWARD Virtual Database; SQL++ Query Processor; SQL++ Virtual Views of Sources (one SQL++ Virtual View per source); wrappers: SQL Wrapper, NewSQL Wrapper, NoSQL Wrapper, SQL-on-Hadoop Wrapper, Java Objects Wrapper; Native queries; Native results; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

SQL++ & FORWARD predecessor:
• OEM semi-structured model
• Virtual database work since the 90s (>7000 citations)
=> Enosys XML virtual database
=> Sold via BEA AquaLogic (2001)

Page 18: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Virtual and/or Materialized Views

[Diagram labels: Client; SQL++ Queries; SQL++ Results; FORWARD Integration Database; SQL++ Query Processor; one SQL++ Virtual View per source; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

Page 19: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Virtual and/or Materialized Views

[Diagram labels: Client; SQL++ Queries; SQL++ Results; FORWARD Integration Database; SQL++ Query Processor; SQL++ Integrated Views (two SQL++ Integrated Views and one SQL++ Added Value View), specified by SQL++ View Definitions; one SQL++ Virtual View per source; sources: SQL Database, NewSQL Database, NoSQL Database, SQL-on-Hadoop Database, Java In-Memory Objects]

Page 20: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 1: Hide the ++ source features behind SQL views

[Diagram labels: BI Tool / Report Writer; FORWARD Virtual Database; Couchbase]

measurements: {{
  { sid: 2, temp: [70.1, 70.2] },
  { sid: 1, temp: [71.0] }
}}

Page 21: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 1: Hide the ++ source features behind SQL views

[Diagram labels: BI Tool / Report Writer; FORWARD Virtual Database; Couchbase]

measurements: {{
  { sid: 2, temp: [70.1, 70.2] },
  { sid: 1, temp: [71.0] }
}}

SQL View

measurements:
  sid  temp
  2    70.1
  2    70.2
  1    71.0

or its SQL++ equivalent

measurements: {{
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 70.2 },
  { sid: 1, temp: 71.0 }
}}

Page 22: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 1: Hide the ++ source features behind SQL views

[Diagram labels: BI Tool / Report Writer; SQL query; SQL result; FORWARD Virtual Database; Couchbase query; Couchbase results; Couchbase]

measurements: {{
  { sid: 2, temp: [70.1, 70.2] },
  { sid: 1, temp: [71.0] }
}}

SQL View

measurements:
  sid  temp
  2    70.1
  2    70.2
  1    71.0

or its SQL++ equivalent

measurements: {{
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 70.2 },
  { sid: 1, temp: 71.0 }
}}
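The flattening behind this SQL view can be sketched in a few lines of Python. This is only an illustration of the unnesting step, not FORWARD's implementation; the function name `flatten_measurements` is made up for the example, and the data is the nested `measurements` value from the slide.

```python
def flatten_measurements(measurements):
    """Unnest each temp array into one flat {sid, temp} row per reading."""
    rows = []
    for m in measurements:
        for t in m["temp"]:
            rows.append({"sid": m["sid"], "temp": t})
    return rows


# Nested source data, as stored in the document database on the slide.
nested = [
    {"sid": 2, "temp": [70.1, 70.2]},
    {"sid": 1, "temp": [71.0]},
]

flat = flatten_measurements(nested)
for row in flat:
    print(row["sid"], row["temp"])
```

Each array element becomes its own row, which is what lets a SQL-only BI tool consume the data without knowing about arrays.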

Page 23: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 2: Capabilities and Semantics-Aware Pushdown

"Sensors that recorded a temperature below 50"

[Diagram labels: MongoDB Virtual Database; Virtual View (Identical); MongoDB Wrapper; MongoDB]

measurements: {{
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null }
}}

Page 24: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 2: Capabilities and Semantics-Aware Pushdown

"Sensors that recorded a temperature below 50"

[Diagram labels: SQL++ query; SQL++ result; MongoDB Virtual Database; Virtual View (Identical); MongoDB Wrapper; MongoDB]

measurements: {{
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null }
}}

SELECT DISTINCT m.sid
FROM   measurements AS m
WHERE  m.temp < 50

SQL++ semantics for m.temp < 50

Page 25: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 2: Capabilities and Semantics-Aware Pushdown

"Sensors that recorded a temperature below 50"

[Diagram labels: SQL++ query; SQL++ result; MongoDB Virtual Database; Virtual View (Identical); MongoDB Wrapper; MongoDB query; MongoDB results; MongoDB]

measurements: {{
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null }
}}

SELECT DISTINCT m.sid
FROM   measurements AS m
WHERE  m.temp < 50

SQL++ semantics for m.temp < 50

...
{ $match: { temp: { $lt: 50 } } },
{ $match: { temp: { $not: null } } }
...

MongoDB semantics for temp $lt 50

Page 26: Yannis papakonstantinou   sql++ query language for semi-structured data

Push-Down Challenges: Limited capabilities & semantic variations

• Sources like SQL and MongoDB cannot execute the entirety of SQL++
• How to efficiently push down computation?
• How to simulate incompatible semantics/missing features?

Issues automatically handled by the query processor.

Plenty of query rewriting problems, including novel ones on semi-structured operators

Page 27: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ captures Semantics Variations

• Semantics of "less-than" comparison are different across sources: <sql, <mongodb, etc.
• Config parameters to capture these variations

@lt { MongoDB } ( x < y )

@lt {
    complex:       boolean,
    type_mismatch: false,
    null_lt_null:  false,
    null_lt_value: boolean,
    ...
} ( x < y )
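To make the idea of a configurable comparison concrete, here is a minimal Python sketch of a less-than function driven by a semantics profile. The keys `type_mismatch`, `null_lt_null` and `null_lt_value` come from the slide; `value_lt_null`, both profile values, and the function names are assumptions made for the example and do not describe any real system's documented behavior.

```python
def _is_number(v):
    # bool is a subclass of int in Python; keep it out of numeric comparisons.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def lt(x, y, config):
    """Evaluate x < y under a source-specific semantics profile."""
    if x is None and y is None:
        return config["null_lt_null"]
    if x is None:
        return config["null_lt_value"]
    if y is None:
        return config["value_lt_null"]  # hypothetical extra parameter
    if (_is_number(x) and _is_number(y)) or (isinstance(x, str) and isinstance(y, str)):
        return x < y
    return config["type_mismatch"]


def sensors_below(measurements, threshold, config):
    """Distinct sids whose temp is below the threshold under the given profile."""
    return sorted({m["sid"] for m in measurements
                   if lt(m["temp"], threshold, config) is True})


# Hypothetical profiles: None stands for SQL's "unknown" truth value.
SQL_LIKE = {"null_lt_null": None, "null_lt_value": None,
            "value_lt_null": None, "type_mismatch": None}
NULLS_LOWEST = {"null_lt_null": False, "null_lt_value": True,
                "value_lt_null": False, "type_mismatch": False}

measurements = [
    {"sid": 2, "temp": 70.1},
    {"sid": 2, "temp": 49.2},
    {"sid": 1, "temp": None},
]

print(sensors_below(measurements, 50, SQL_LIKE))
print(sensors_below(measurements, 50, NULLS_LOWEST))

# Compensation: filter nulls first so the permissive source matches SQL_LIKE.
non_null = [m for m in measurements if m["temp"] is not None]
compensated = sensors_below(non_null, 50, NULLS_LOWEST)
print(compensated)
```

Under the assumed `NULLS_LOWEST` profile the null reading of sensor 1 passes the filter, so a wrapper targeting such a source would have to add an explicit not-null check, which is the role of the second `$match` stage on the previous slide.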

Page 28: Yannis papakonstantinou   sql++ query language for semi-structured data

Semantics Variations

In NoSQL, NewSQL and SQL-on-Hadoop, semantics vary for:

• Paths
• Equality
• Comparisons

And all the operators that use them:

• Selections
• Grouping
• Ordering
• Set operations

Each of these features has a set of config parameters in SQL++

Page 29: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

[Diagram labels: SQL++ Query Processor]

Page 30: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

[Diagram labels: SQL++ Query Processor; PostgreSQL; MongoDB]

PostgreSQL

sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

MongoDB

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

Page 31: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

[Diagram labels: SQL++ Query Processor; SQL Virtual Database; SQL Wrapper; PostgreSQL; MongoDB Virtual Database; MongoDB Wrapper; MongoDB]

PostgreSQL

sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

SQL Virtual Database (via SQL Wrapper)

sensors: {{
  { id: 1, lat: 32.8, lng: -117.1 },
  { id: 2, lat: 32.7, lng: -117.2 } }}

MongoDB

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

MongoDB Virtual Database (via MongoDB Wrapper)

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

Page 32: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

"Sensors in a given area that recorded a low temperature?"

[Diagram labels: SQL++ (incoming query); SQL++ Query Processor; SQL Virtual Database; SQL Wrapper; PostgreSQL; MongoDB Virtual Database; MongoDB Wrapper; MongoDB]

PostgreSQL

sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

SQL Virtual Database (via SQL Wrapper)

sensors: {{
  { id: 1, lat: 32.8, lng: -117.1 },
  { id: 2, lat: 32.7, lng: -117.2 } }}

MongoDB

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

MongoDB Virtual Database (via MongoDB Wrapper)

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

Page 33: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

"Sensors in a given area that recorded a low temperature?"

[Diagram labels: SQL++ (incoming query); SQL++ Query Processor; SQL Virtual Database; SQL Wrapper; PostgreSQL; MongoDB Virtual Database; MongoDB Wrapper; MongoDB]

SQL++

SELECT s.lat, s.lng, m.temp
FROM   sensors AS s
JOIN   measurements AS m
       ON s.id = m.sid
WHERE  (s.lat > 32.6 AND s.lat < 32.9
       AND s.lng < -117.0 AND s.lng > -117.3)
       AND m.temp < 50

PostgreSQL

sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

SQL Virtual Database (via SQL Wrapper)

sensors: {{
  { id: 1, lat: 32.8, lng: -117.1 },
  { id: 2, lat: 32.7, lng: -117.2 } }}

MongoDB

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

MongoDB Virtual Database (via MongoDB Wrapper)

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

Page 34: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

"Sensors in a given area that recorded a low temperature?"

[Diagram labels: SQL++ (incoming query); SQL++ Query Processor; SQL++ (to each virtual database); SQL Virtual Database; SQL Wrapper; SQL (to PostgreSQL); PostgreSQL; MongoDB Virtual Database; MongoDB Wrapper; MongoDB (to MongoDB); MongoDB]

SQL++

SELECT s.lat, s.lng, m.temp
FROM   sensors AS s
JOIN   measurements AS m
       ON s.id = m.sid
WHERE  (s.lat > 32.6 AND s.lat < 32.9
       AND s.lng < -117.0 AND s.lng > -117.3)
       AND m.temp < 50

PostgreSQL

sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

SQL Virtual Database (via SQL Wrapper)

sensors: {{
  { id: 1, lat: 32.8, lng: -117.1 },
  { id: 2, lat: 32.7, lng: -117.2 } }}

MongoDB

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

MongoDB Virtual Database (via MongoDB Wrapper)

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

Page 35: Yannis papakonstantinou   sql++ query language for semi-structured data

Use Case 3: Integrated Query Processing

"Sensors in a given area that recorded a low temperature?"

[Diagram labels: SQL++ (incoming query); SQL++ Query Processor; SQL++ (to each virtual database); SQL Virtual Database; SQL Wrapper; SQL (to PostgreSQL); PostgreSQL; MongoDB Virtual Database; MongoDB Wrapper; MongoDB (to MongoDB); MongoDB]

SQL++

SELECT s.lat, s.lng, m.temp
FROM   sensors AS s
JOIN   measurements AS m
       ON s.id = m.sid
WHERE  (s.lat > 32.6 AND s.lat < 32.9
       AND s.lng < -117.0 AND s.lng > -117.3)
       AND m.temp < 50

MongoDB query pushed to the source

db.measurements.aggregate(
  { $match: { temp: { $lt: 50 } } },
  { $match: { temp: { $not: null } } } )

PostgreSQL

sensors:
  id  lat   lng
  1   32.8  -117.1
  2   32.7  -117.2

SQL Virtual Database (via SQL Wrapper)

sensors: {{
  { id: 1, lat: 32.8, lng: -117.1 },
  { id: 2, lat: 32.7, lng: -117.2 } }}

MongoDB

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]

MongoDB Virtual Database (via MongoDB Wrapper)

measurements: [
  { sid: 2, temp: 70.1 },
  { sid: 2, temp: 49.2 },
  { sid: 1, temp: null } ]
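The division of labor in this use case can be sketched in Python: each source evaluates the filter it is able to run, and the mediator joins the two partial results. This is a rough illustration under stated assumptions, not FORWARD's planner. The function names are invented, the two "source" functions merely stand in for queries pushed to PostgreSQL and MongoDB, and the longitude bounds are read as -117.3 < lng < -117.0, which is an assumption about the slide's intent.

```python
sensors = [  # rows as exposed by the SQL wrapper
    {"id": 1, "lat": 32.8, "lng": -117.1},
    {"id": 2, "lat": 32.7, "lng": -117.2},
]
measurements = [  # documents as exposed by the MongoDB wrapper
    {"sid": 2, "temp": 70.1},
    {"sid": 2, "temp": 49.2},
    {"sid": 1, "temp": None},
]


def sql_source(rows):
    """Stand-in for the part pushed to the SQL source: the area filter."""
    return [s for s in rows
            if 32.6 < s["lat"] < 32.9 and -117.3 < s["lng"] < -117.0]


def mongo_source(docs):
    """Stand-in for the part pushed to the document source: temp < 50, nulls excluded."""
    return [m for m in docs if m["temp"] is not None and m["temp"] < 50]


def mediator_join(left, right):
    """Hash join on s.id = m.sid, performed by the mediator."""
    by_id = {}
    for s in left:
        by_id.setdefault(s["id"], []).append(s)
    return [{"lat": s["lat"], "lng": s["lng"], "temp": m["temp"]}
            for m in right
            for s in by_id.get(m["sid"], [])]


result = mediator_join(sql_source(sensors), mongo_source(measurements))
print(result)
```

Filtering at the sources keeps the data shipped to the mediator small; only the join, which neither source can compute alone, runs in the middle tier.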

Page 36: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

• The Incremental View Maintenance functionality:
  • SQL++ (Materialized) View Definition

E.g., Couchbase has a JSON web log showing
    {user, list of displayed products}
from which we produce materialized views
    {product category, count, [{product, count}]}

[Diagram labels: DB Before; View Before]

Page 37: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

• The Incremental View Maintenance functionality:
  • SQL++ (Materialized) View Definition
  • Stream of inserts, deletes, updates on base data

E.g., Couchbase has a JSON web log showing
    {user, list of displayed products}
from which we produce materialized views
    {product category, count, [{product, count}]}

[Diagram labels: DB Before; Stream; DB After; View Before]

Page 38: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

• The Incremental View Maintenance functionality:
  • SQL++ (Materialized) View Definition
  • Stream of inserts, deletes, updates on base data

E.g., Couchbase has a JSON web log showing
    {user, list of displayed products}
from which we produce materialized views
    {product category, count, [{product, count}]}

[Diagram labels: DB Before; Stream; DB After; View Before; View After]

Page 39: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

• The Incremental View Maintenance functionality:
  • SQL++ (Materialized) View Definition
  • Stream of inserts, deletes, updates on base data
• The Incremental View Maintenance module
  • Automatically and efficiently updates the materialized view to reflect the stream of changes

E.g., Couchbase has a JSON web log showing
    {user, list of displayed products}
from which we produce materialized views
    {product category, count, [{product, count}]}

[Diagram labels: DB Before; Stream; DB After; View Before; IVM Stream; View After]

Page 40: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

• The Incremental View Maintenance functionality:
  • SQL++ (Materialized) View Definition
  • Stream of inserts, deletes, updates on base data
• The Incremental View Maintenance module
  • Automatically and efficiently updates the materialized view to reflect the stream of changes
• SQL++ can also enable automatic Incremental View Maintenance!
  • With attention to replication of data in views
  • Opportunities by keys

E.g., Couchbase has a JSON web log showing
    {user, list of displayed products}
from which we produce materialized views
    {product category, count, [{product, count}]}

[Diagram labels: DB Before; Stream; DB After; View Before; IVM Stream; View After]
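As a rough illustration of incremental maintenance for the web-log example, the Python sketch below applies each insert or delete to per-category counters instead of recomputing the view from scratch. It is a hand-written toy, not the FORWARD IVM module: the class name, the `CATEGORY` lookup table, and the sample log entries are all hypothetical, and an update is modeled as a delete followed by an insert.

```python
from collections import defaultdict

# Hypothetical product -> category lookup used by the view definition.
CATEGORY = {"tv": "electronics", "phone": "electronics", "sofa": "furniture"}


class CategoryCountView:
    """Materialized view {category, count, [{product, count}]} kept up to date per change."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def apply(self, op, entry):
        if op not in ("insert", "delete"):
            raise ValueError("unsupported operation: " + op)
        delta = 1 if op == "insert" else -1
        for product in entry["products"]:
            per_product = self.counts[CATEGORY[product]]
            per_product[product] += delta
            if per_product[product] == 0:
                del per_product[product]

    def snapshot(self):
        return [
            {"category": c,
             "count": sum(p.values()),
             "products": [{"product": k, "count": v} for k, v in sorted(p.items())]}
            for c, p in sorted(self.counts.items()) if p
        ]


def recompute(log):
    """Reference result: rebuild the view from the full log."""
    fresh = CategoryCountView()
    for entry in log:
        fresh.apply("insert", entry)
    return fresh.snapshot()


e1 = {"user": "ann", "products": ["tv", "sofa"]}
e2 = {"user": "bob", "products": ["tv", "phone"]}

view = CategoryCountView()
for op, entry in [("insert", e1), ("insert", e2), ("delete", e1)]:
    view.apply(op, entry)

print(view.snapshot())
```

The cost of each change is proportional to the size of the changed log entry rather than to the size of the log, which is the point of maintaining the view incrementally.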

Page 41: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

Custom dashboards, interactive pages & apps

• The data models of visualization components (e.g., Google Maps) can be nicely captured with JSON models
• The pages are SQL++ (JSON) views!
  • Mashups of the component views
• SQL++ feeds and incrementally updates the page views

Use case

• From data to visualization with just SQL++ & markup
  • Ajax/JavaScript visuals with no Ajax/JavaScript mess
  • How to easily connect to today's JS libraries
• Custom Ajax visualizations & interfaces for IT personnel

Page 42: Yannis papakonstantinou   sql++ query language for semi-structured data

FORWARD: SQL++ Incremental View Maintenance and Application Visualization layer

(part of) the Google Map model

<% unit google.maps.Maps %>
  {
    markers: [ {
      position: {
        latitude:  number,
        longitude: number
      }
      ...
    } ]
  }
<% end unit %>

Page 43: Yannis papakonstantinou   sql++ query language for semi-structured data

Outline

• What is SQL++?
• How do we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
• The N1QL connection
  • The SQL++ industrial query language twin

Page 44: Yannis papakonstantinou   sql++ query language for semi-structured data

How to read the SQL++ syntax and semantics

Formal specification paper:

http://arxiv.org/abs/1405.3631

• Data model (Sections 3, 3.1)
• Element variables (Section 4.1)
• SELECT-FROM-WHERE queries as element variable generators

Environment

Γ = ⟨ readings : [ 1.3, 0.7, 0.3, 0.8 ] ⟩

Environment ⊢ Query → Result, clause by clause:

FROM readings AS r      B    = {{ ⟨ r : 1.3 ⟩, ⟨ r : 0.7 ⟩, ⟨ r : 0.3 ⟩, ⟨ r : 0.8 ⟩ }}
WHERE r < 1.0           B'   = {{ ⟨ r : 0.7 ⟩, ⟨ r : 0.3 ⟩, ⟨ r : 0.8 ⟩ }}
ORDER BY r DESC         B''  = [ ⟨ r : 0.8 ⟩, ⟨ r : 0.7 ⟩, ⟨ r : 0.3 ⟩ ]
LIMIT 2                 B''' = [ ⟨ r : 0.8 ⟩, ⟨ r : 0.7 ⟩ ]
SELECT r AS co          [ { co: 0.8 }, { co: 0.7 } ]
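The clause-by-clause reading on this slide can be mirrored in Python, with each clause consuming and producing a collection of binding tuples. This is only a sketch of the evaluation order; plain lists stand in for both bags and arrays, and the variable names `b`, `b1`, `b2`, `b3` are chosen to echo B, B', B'' and B'''.

```python
# Environment: the named value the query can see.
env = {"readings": [1.3, 0.7, 0.3, 0.8]}

# FROM readings AS r: one binding tuple per element of the collection.
b = [{"r": v} for v in env["readings"]]

# WHERE r < 1.0: keep the bindings that satisfy the condition.
b1 = [t for t in b if t["r"] < 1.0]

# ORDER BY r DESC: the bag of bindings becomes an ordered list.
b2 = sorted(b1, key=lambda t: t["r"], reverse=True)

# LIMIT 2: keep the first two bindings.
b3 = b2[:2]

# SELECT r AS co: build one output tuple per remaining binding.
result = [{"co": t["r"]} for t in b3]

print(result)  # [{'co': 0.8}, {'co': 0.7}]
```

Note that `r` is bound to a bare number rather than to a tuple, which is what the slide means by element variables.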

Page 45: Yannis papakonstantinou   sql++ query language for semi-structured data

Environment ⊢ Query → Result

Environment

Γ0 = ⟨ sensors : {{ {sensor: 1},
                    {sensor: 2} }},
       logs :    {{ {sensor: 1, co: 0.4},
                    {sensor: 1, co: 0.2},
                    {sensor: 2, co: 0.3} }} ⟩

Query, clause by clause:

FROM sensors AS s

b1 = ⟨ s : {sensor: 1} ⟩
b2 = ⟨ s : {sensor: 2} ⟩

SELECT TUPLE s.sensor AS sensor,
             ( SELECT TUPLE l.co AS co
               FROM   logs AS l
               WHERE  l.sensor = s.sensor
             ) AS readings

Result

{{
  { sensor: 1,
    readings: {{ {co: 0.4}, {co: 0.2} }}
  },
  { sensor: 2,
    readings: {{ {co: 0.3} }}
  }
}}

Evaluation of the nested query for binding b1:

Environment

Γ1 = ⟨ s : {sensor: 1},
       sensors : {{ {sensor: 1},
                    {sensor: 2} }},
       logs :    {{ {sensor: 1, co: 0.4},
                    {sensor: 1, co: 0.2},
                    {sensor: 2, co: 0.3} }} ⟩

FROM logs AS l

b'1 = ⟨ l : {sensor: 1, co: 0.4} ⟩
b'2 = ⟨ l : {sensor: 1, co: 0.2} ⟩
b'3 = ⟨ l : {sensor: 2, co: 0.3} ⟩

WHERE l.sensor = s.sensor

b''1 = ⟨ l : {sensor: 1, co: 0.4} ⟩
b''2 = ⟨ l : {sensor: 1, co: 0.2} ⟩

SELECT TUPLE l.co AS co

Result

{{ {co: 0.4}, {co: 0.2} }}
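The way the nested query sees the outer variable `s` can be imitated in Python by extending the environment once per outer binding. This is an illustrative sketch only; the function names are invented, and lists stand in for bags, so element order carries no meaning.

```python
env0 = {
    "sensors": [{"sensor": 1}, {"sensor": 2}],
    "logs": [
        {"sensor": 1, "co": 0.4},
        {"sensor": 1, "co": 0.2},
        {"sensor": 2, "co": 0.3},
    ],
}


def inner_query(env):
    """SELECT TUPLE l.co AS co FROM logs AS l WHERE l.sensor = s.sensor."""
    bindings = [{"l": log} for log in env["logs"]]                # FROM logs AS l
    kept = [b for b in bindings
            if b["l"]["sensor"] == env["s"]["sensor"]]            # WHERE, reads s from env
    return [{"co": b["l"]["co"]} for b in kept]                   # SELECT TUPLE


def outer_query(env):
    """One output tuple per sensor, with the nested query result as a field."""
    result = []
    for s in env["sensors"]:                                      # FROM sensors AS s
        env1 = {**env, "s": s}                                    # environment extended with s
        result.append({"sensor": s["sensor"],
                       "readings": inner_query(env1)})
    return result


print(outer_query(env0))
```

The inner query is evaluated once per outer binding, each time in an environment that contains `s` in addition to the named collections.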

Page 46: Yannis papakonstantinou   sql++ query language for semi-structured data

Outline

• What is SQL++?
• How do we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
• The N1QL connection
  • The SQL++ industrial query language twin

Page 47: Yannis papakonstantinou   sql++ query language for semi-structured data

Surveyed Databases

[Figure: surveyed systems grouped as SQL-on-Hadoop, SQL & NewSQL, NoSQL, and Others; query languages labeled in the figure: Pig, Jaql, CQL, N1QL, AQL, MongoDB driver]

Page 48: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Removes Superficial Differences

• SQL++ covers SQL, N1QL and QL research prototypes (e.g., UCI's ASTERIX)
• Removing the current "Tower of Babel" effect
• Providing formal syntax and semantics

Page 49: Yannis papakonstantinou   sql++ query language for semi-structured data

SQL++ Removes Superficial Differences

SQL

SELECT AVG(temp) AS tavg
FROM readings
GROUP BY sid

MongoDB

db.readings.aggregate(
  { $group: { _id: "$sid",
    tavg: { $avg: "$temp" } } })

Jaql

readings -> group by sid = $.sid into { tavg: avg($.temp) };

Pig

a = LOAD 'readings' AS (sid:int, temp:float);
b = GROUP a BY sid;
c = FOREACH b GENERATE AVG(temp);
DUMP c;

JSONiq

for $r in collection("readings")
group by $r.sid
return { tavg: avg($r.temp) }

• SQL++ covers SQL, N1QL and QL research prototypes (e.g., UCI's ASTERIX)
• Removing the current "Tower of Babel" effect
• Providing formal syntax and semantics
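All five snippets express the same computation: group the readings by sensor and average the temperature within each group. A plain Python rendering of that shared meaning is sketched below; the function name and the sample readings are hypothetical and serve only to exercise the code.

```python
def avg_temp_by_sid(readings):
    """Group readings by sid and return one {tavg} tuple per group."""
    groups = {}
    for r in readings:
        groups.setdefault(r["sid"], []).append(r["temp"])
    return [{"tavg": sum(temps) / len(temps)} for temps in groups.values()]


# Hypothetical sample data.
readings = [
    {"sid": 1, "temp": 70.0},
    {"sid": 1, "temp": 72.0},
    {"sid": 2, "temp": 50.0},
]

print(avg_temp_by_sid(readings))
```

Like the SQL version on the slide, the output carries only `tavg` and not the grouping key, so the result is a collection of averages, one per sensor.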

Page 50: Yannis papakonstantinou   sql++ query language for semi-structured data

Surveyed features

15 feature matrices (1-11 dimensions each) classifying:

• Data values
• Schemas
• Access and construct nested data
• Missing information
• Equality semantics
• Ordering semantics
• Aggregation
• Joins
• Set operators
• Extensibility

Page 51: Yannis papakonstantinou   sql++ query language for semi-structured data

Outline

• What is SQL++?
• How do we use it in UCSD's FORWARD?
  • SQL++ uses in integration and (live) analytics applications
• Introduction to the SQL++ formal specification paper
• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop
  • Methodology
  • Example 1: data model (data values)
  • Example 2: query language (SELECT clause)
  • Example 3: semantics (path)
  • Example 4: semantics (equality function)
• The N1QL connection
  • The SQL++ industrial query language twin

Page 52: Yannis papakonstantinou   sql++ query language for semi-structured data

Methodology

For each feature:

1. A formal definition of the feature in SQL++
2. A SQL++ example
3. A feature matrix that classifies each dimension of the feature
4. A discussion of the results, partial support and unexpected behaviors

All the results are empirically validated

Page 53: Yannis papakonstantinou   sql++ query language for semi-structured data

Example: Data values

1. SQL++ example:

{
  location: 'Alpine',
  readings: [
    {
      time: timestamp('2014-03-12T20:00:00'),
      ozone: 0.035,
      no2: 0.0050
    },
    {
      time: timestamp('2014-03-12T22:00:00'),
      ozone: 'm',
      co: 0.4
    } ]
}

Page 54: Yannis papakonstantinou   sql++ query language for semi-structured data

Example: Data values

2. SQL++ BNF for values:

Page 55: Yannis papakonstantinou   sql++ query language for semi-structured data

Example: Data values

3. Feature matrix:

System     Composability (top-level values)  Heterogeneity  Arrays   Bags     Sets     Maps     Tuples  Primitives
Hive       Bag of tuples                     No             Yes      No       No       Partial  Yes     Yes
Jaql       Any Value                         Yes            Yes      No       No       No       Yes     Yes
Pig        Bag of tuples                     Partial        No       Partial  No       Partial  Yes     Yes
CQL        Bag of tuples                     No             Partial  No       Partial  Partial  No      Yes
JSONiq     Any Value                         Yes            Yes      No       No       No       Yes     Yes
MongoDB    Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
N1QL       Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
SQL        Bag of tuples                     No             No       No       No       No       No      Yes
AQL        Any Value                         Yes            Yes      Yes      No       No       Yes     Yes
BigQuery   Bag of tuples                     No             No       No       No       No       Yes     Yes
MongoJDBC  Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
SQL++      Any Value                         Yes            Yes      Yes      Partial  Yes      Yes     Yes

Page 56: Yannis papakonstantinou   sql++ query language for semi-structured data

Example: Data values

4. Discussion of the results:

System     Composability (top-level values)  Heterogeneity  Arrays   Bags     Sets     Maps     Tuples  Primitives
Hive       Bag of tuples                     No             Yes      No       No       Partial  Yes     Yes
Jaql       Any Value                         Yes            Yes      No       No       No       Yes     Yes
Pig        Bag of tuples                     Partial        No       Partial  No       Partial  Yes     Yes
CQL        Bag of tuples                     No             Partial  No       Partial  Partial  No      Yes
JSONiq     Any Value                         Yes            Yes      No       No       No       Yes     Yes
MongoDB    Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
N1QL       Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
SQL        Bag of tuples                     No             No       No       No       No       No      Yes
AQL        Any Value                         Yes            Yes      Yes      No       No       Yes     Yes
BigQuery   Bag of tuples                     No             No       No       No       No       Yes     Yes
MongoJDBC  Bag of tuples                     Yes            Yes      No       No       No       Yes     Yes
SQL++      Any Value                         Yes            Yes      Yes      Partial  Yes      Yes     Yes

• Column-by-column comparison
• Partial support (65k scalar elements)
• Identify clusters (who supports JSON?)

Page 59: Yannis papakonstantinou   sql++ query language for semi-structured data

1. SQL++ example:

• Projecting nested collections: "Position and (nested) ozone readings of each sensor?"

    SELECT TUPLE s.lat, s.long,
        (SELECT r.ozone
         FROM readings AS r
         WHERE r.location = s.location)
    FROM sensors AS s

• Projecting non-tuples: "Bag of all the (scalar) ozone readings?"

    SELECT ELEMENT ozone FROM readings

Example: SELECT clause
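To make the two queries concrete, here is a minimal Python sketch of the results they describe. The sample `sensors` and `readings` data are hypothetical, and plain Python lists and dicts stand in for SQL++ bags and tuples. The attribute name `ozone_readings` for the nested collection is also an assumption, since the slide does not name it. This illustrates the shape of the output only, not how a SQL++ engine evaluates the queries.

```python
# Hypothetical sample data (not from the slides).
sensors = [
    {"location": "A", "lat": 32.88, "long": -117.23},
    {"location": "B", "lat": 32.71, "long": -117.16},
]
readings = [
    {"location": "A", "ozone": 0.041},
    {"location": "A", "ozone": 0.038},
    {"location": "B", "ozone": 0.052},
]


def select_element_ozone(readings):
    """Mimics SELECT ELEMENT ozone FROM readings: a bag of scalars, not tuples."""
    return [r["ozone"] for r in readings]


def select_tuple_with_nested(sensors, readings):
    """Mimics the SELECT TUPLE query: one tuple per sensor, with a nested bag."""
    return [
        {
            "lat": s["lat"],
            "long": s["long"],
            # Nested subquery: SELECT r.ozone FROM readings AS r WHERE ...
            # "ozone_readings" is an assumed attribute name.
            "ozone_readings": [
                {"ozone": r["ozone"]}
                for r in readings
                if r["location"] == s["location"]
            ],
        }
        for s in sensors
    ]
```

The first function returns bare numbers (a non-tuple projection), while the second returns tuples whose third attribute is itself a collection. These are the two capabilities that the feature matrix on the next slide classifies.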

Page 60: Yannis papakonstantinou   sql++ query language for semi-structured data

3. Feature matrix:

Language    Projecting tuples containing nested collections  Projecting non-tuples
Hive        Partial                                          No
Jaql        Yes                                              Yes
Pig         Partial                                          No
CQL         No                                               No
JSONiq      Yes                                              Yes
MongoDB     Partial                                          Partial
N1QL        Partial                                          Partial
SQL         No                                               No
AQL         Yes                                              Yes
BigQuery    No                                               No
MongoJDBC   No                                               No
SQL++       Yes                                              Yes

Example: SELECT clause

Page 61: Yannis papakonstantinou   sql++ query language for semi-structured data


4. Discussion of the results:

• Not well supported features
• 3 languages support them entirely: Jaql, JSONiq and AQL (same cluster as for data values)

Example: SELECT clause


Page 63: Yannis papakonstantinou   sql++ query language for semi-structured data

We use config parameters to encompass and compare various semantics of a feature:

• Minimal number of independent dimensions

• 1 dimension = 1 config parameter

• SQL++ formalism parametrized by the config parameters

• Feature matrix classifies the values of each config parameter

Config Parameters

Page 64: Yannis papakonstantinou   sql++ query language for semi-structured data

• Config parameters for tuple navigation:

    @tuple_nav {
        absent:        missing,
        type_mismatch: error,
    } ( x.y )

Example: Paths
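One way to read the `@tuple_nav` annotation is as a pair of switches that decide what `x.y` returns in its two edge cases. The Python sketch below simulates that reading. The function name `tuple_nav`, the `MISSING` sentinel, and the use of Python dicts for tuples are all assumptions made for illustration; the outcome names (`error`, `null`, `missing`) are the values that appear in the feature matrix for paths.

```python
class _Missing:
    """Sentinel standing in for the 'missing' value (distinct from null/None)."""

    def __repr__(self):
        return "MISSING"


MISSING = _Missing()


def _outcome(name, message):
    """Turn a config value ('error', 'null' or 'missing') into a result."""
    if name == "error":
        raise ValueError(message)
    if name == "null":
        return None
    if name == "missing":
        return MISSING
    raise KeyError(f"unknown config value: {name!r}")


def tuple_nav(x, attr, config):
    """Evaluate x.attr under the given tuple-navigation config parameters."""
    if not isinstance(x, dict):
        # x is not a tuple at all: the type_mismatch parameter decides.
        return _outcome(config["type_mismatch"], f"{x!r} is not a tuple")
    if attr not in x:
        # x is a tuple without this attribute: the absent parameter decides.
        return _outcome(config["absent"], f"attribute {attr!r} is absent")
    return x[attr]


# The configuration shown on the slide.
SLIDE_CONFIG = {"absent": "missing", "type_mismatch": "error"}
```

With `SLIDE_CONFIG`, `tuple_nav({"z": 1}, "y", SLIDE_CONFIG)` yields `MISSING`, while navigating into a non-tuple such as `42` raises an error. Swapping in `{"absent": "null", "type_mismatch": "error"}` would instead return `None` for the absent attribute.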

Page 65: Yannis papakonstantinou   sql++ query language for semi-structured data

• The feature matrix classifies, for each language, the value of each config parameter:

Language    Missing    Type mismatch
Hive        Error      Error
Jaql        Null       Error#
Pig         Error      Error
CQL         Error      Error
JSONiq      Missing#   Missing#
MongoDB     Missing    Missing
N1QL        Missing    Missing
SQL         Error      Error
AQL         Null       Error
BigQuery    Error      Error
MongoJDBC   Missing#   Missing#
SQL++       @path      @path

Example: Paths

Page 66: Yannis papakonstantinou   sql++ query language for semi-structured data

• Config parameters for equality:

    @eq {
        complex:            error,
        type_mismatch:      false,
        null_eq_null:       null,
        null_eq_value:      null,
        missing_eq_missing: missing,
        missing_eq_value:   missing,
        missing_eq_null:    missing
    } ( x = y )

Example: Equality Function
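The `@eq` parameters can be read as a lookup table for the edge cases of `x = y`. The Python sketch below simulates that reading under stated assumptions: `None` stands for null, a `MISSING` sentinel stands for missing, lists and dicts stand for complex values, and the extra config value `boolean` (meaning "compare deeply and return a Boolean") is borrowed from the feature matrix rather than from the `@eq` example itself. The function and helper names are hypothetical.

```python
MISSING = object()  # stand-in for 'missing'; None stands for null


def _outcome(name):
    """Map a config value to a result; 'error' raises."""
    if name == "error":
        raise ValueError("equality is an error for these operands")
    return {"true": True, "false": False, "null": None, "missing": MISSING}[name]


def eq(x, y, config):
    """Evaluate x = y under the given equality config parameters."""
    x_missing, y_missing = x is MISSING, y is MISSING
    x_null, y_null = x is None, y is None
    if x_missing and y_missing:
        return _outcome(config["missing_eq_missing"])
    if (x_missing and y_null) or (x_null and y_missing):
        return _outcome(config["missing_eq_null"])
    if x_missing or y_missing:
        return _outcome(config["missing_eq_value"])
    if x_null and y_null:
        return _outcome(config["null_eq_null"])
    if x_null or y_null:
        return _outcome(config["null_eq_value"])
    if isinstance(x, (list, dict)) or isinstance(y, (list, dict)):
        # Complex operands: either deep comparison or whatever the config says.
        if config["complex"] == "boolean":
            return x == y
        return _outcome(config["complex"])
    if type(x) is not type(y):
        # Simplification: exact Python types, so 1 and 1.0 count as a mismatch.
        return _outcome(config["type_mismatch"])
    return x == y


# The configuration shown on the slide.
SLIDE_CONFIG = {
    "complex": "error",
    "type_mismatch": "false",
    "null_eq_null": "null",
    "null_eq_value": "null",
    "missing_eq_missing": "missing",
    "missing_eq_value": "missing",
    "missing_eq_null": "missing",
}
```

Under `SLIDE_CONFIG`, `eq(None, None, SLIDE_CONFIG)` returns null (`None`) as in SQL, whereas a configuration with `null_eq_null` set to `"true"` would return `True`. Changing one parameter changes exactly one edge case, which is the "1 dimension = 1 config parameter" idea.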

Page 67: Yannis papakonstantinou   sql++ query language for semi-structured data

• The feature matrix classifies, for each language, the value of each config parameter:

Language                  Complex          Type mismatch  Null = Null  Null = Value  Missing = Missing  Missing = Value  Missing = Null
Hive (=, <=>)             Err, Err         Err, Err       Null, True   Null, False   N/A                N/A              N/A
Jaql                      Boolean          Null           Null         Null          N/A                N/A              N/A
Pig                       Boolean partial  Err            Null         Null          N/A                N/A              N/A
CQL                       Err              Err            Err          False/Null    N/A                N/A              N/A
JSONiq (=, deep-equal())  Err, Boolean     Err, False     True, True   False, False  Missing, True      Missing, False   Missing, False
MongoDB                   Boolean          False          True         False         True               False            False
N1QL                      Boolean          False          Null         Null          Missing            Missing          Missing
SQL                       N/A              Err            Null         Null          N/A                N/A              N/A
AQL                       Err              Err            Null         Null          N/A                N/A              N/A
BigQuery                  Err              Err            Null         Null          N/A                N/A              N/A
MongoJDBC                 Boolean          False          True         False         N/A                False            False
SQL++                     @equal           @equal         @equal       @equal        @equal             @equal           @equal

• No real cluster
• Some languages have multiple (incompatible) equality functions
• Some edge cases cannot happen due to other limitations (SQL has no complex values)

Example: Equality Function


Page 69: Yannis papakonstantinou   sql++ query language for semi-structured data

• As a database user:
    • Understand the semantics of an (often underspecified) query language / be aware of the limitations of a database

• As a designer/architect of a database:
    • Produce a formal specification of your query language
    • Align semantics with SQL's

• As a database researcher:
    • The results might change, but the survey methodology stays

• As a designer/architect of database middleware:
    • Understand what capability variations need to be encapsulated and simulated

How to use this survey?

Page 70: Yannis papakonstantinou   sql++ query language for semi-structured data

• The marketing clusters do not correspond to real capabilities

• Limited capabilities: matrices are sparse and fragmented (more pressure on source-specific rewriters and distributor)

The survey shows:

Page 71: Yannis papakonstantinou   sql++ query language for semi-structured data

• What is SQL++?

• How we use it in UCSD’s FORWARD?
    • SQL++ uses in integration and (live) analytics applications

• Introduction to the SQL++ formal specification paper

• Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop

• The N1QL connection
    • The SQL++ industrial query language twin

Outline

Page 72: Yannis papakonstantinou   sql++ query language for semi-structured data

• N1QL is by far the closest industrial database implementation of SQL++

• Alignment in principles
    • JSON + declarative

• The distance will further close in Release 4

N1QL: the Industrial Twin of SQL++

Page 73: Yannis papakonstantinou   sql++ query language for semi-structured data

The Future is Semi-Structured and Declarative

• Scalability

• Flexibility

• Automation
    • Logical/physical separation

• The primary operational and the secondary analytics application out of semistructured, declarative platforms