M.Sc. Seminar - Keren Lenz Supervisor - Dr. Yossi Gil July 1 st 2007
description
Transcript of M.Sc. Seminar - Keren Lenz Supervisor - Dr. Yossi Gil July 1 st 2007
M.Sc. Seminar - Keren LenzM.Sc. Seminar - Keren Lenz
Supervisor - Dr. Yossi GilSupervisor - Dr. Yossi Gil
July 1July 1stst 2007 2007
Simple and Safe SQL Queries with C++ Templates
ARARATARARAT- -
2/21
Motivation
•A relational DB with an Employee table–Field 1: EMPNUM (integer)–Field 2: DEPT (smallint)–Field 3: FIRST_N (string)–Field 4: LAST_N (string)–Field 5: SALARY (double)
– Mission: refer to the DB from a high level language application.
3/21
SQL Queries – The Old waycan you spot the bugs?
char* get_employees(int dept, char* first) {
bool first_cond = true;
string s(“SELECT FIRST_N, LAST_N FROM EMPLOYEES “);
if (dept > 0){ // valid dept number
s.append( “WHERE DEPT = ‘ “);
s.append(itoa(dept)); s.append(“’”);
first_cond = false;
}
if (first == null) return s;
if (first_cond) s.append(“WHERE “);
else s.append(“AND”);
s.append(“FIRST_N= ‘ “); s.append(first); s.append(“’”);
return s;
}
1. Misspelled Name
2. Syntax error
3. Type Mismatch
4. Security Vulnerability , 5. Code Coverage ,6. Maintnance cost
4/21
SQL Queries - ARARAT’s Way
1 char* get_employees(short dept, char* first) {
2 DEF_V(e,EMPLOYEE[FIRST_N,LAST_N]);
3 if (first != null) e /= (FIRST_N == first);
4 if (dept > 0) e /= (DEPT == dept);
5 return e.asSQL();
6 }
Short and elegant code
Type safe
Schema awareness – No misspelling
Automatic generation of queries – no syntax errors
Natural C++ Syntax
Security Vulnerability, Code Coverage , Maintenance cost Problems Solved
5/21
ARARAT’s Overview •GOAL : safe production of SQL queries.
•ARARAT = ARA + RAT:– ARA is Augmented Relational Algebra.– RAT is Relational Algebra Templates.
•In a sense, RAT is ARA’s compiler.
6/21
ARA’S Operators
RA Operator ARA Operator SQL equivalent
selection R/c select * from R where c
projection R[f1,f2] select f1,f2 from R
union R1+R2 R1 union R2
difference R1-R2 R1 - R2
natural join R1*R2 R1 join R2
left join R1<<R2 R1 left join R2
right join R1>>R2 R1 right join R2
rename b(a) a as b
RcRff 21 ,
21 RR
21 \ RR
21 RR
21 RR
21 RR Rba /
7/21
ARA's Principles
•Queries are composed via Query Objects • C++ first class objects.
•A query object has:• Type – Encodes scheme of the resulting relation
• Same set of fields => same type• Content – the procedure for constructing the query
8/21
Primitive Query Objects
–A Primitive QO is defined for each DB table–Option 1: Manually, using macros
–Option 2: Automatically by the DB2ARA tool
–Represent a query that returns all fieldsSELECT * FROM <RELATION_NAME>
DEF_F(EMPNUM)DEF_F(DEPT)DEF_F(FIRST_N)DEF_F(LAST_N)DEF_F(SALARY)
DEF_R(EMPLOYEE, (EMPNUM/integer, FIRST_N/String, LAST_N/String, DEPT/SmallInt, SALARY/double));
9/21
Composite Query Objects
•Constructed using ARA's operators
The operands:–QOs (either primitive or composite)–fields–variables –literals
DEF_V(e,EMPLOYEE[FIRST_N,LAST_N] / (DEPT > 3));
10/21
A QO Can ...•... be stored in a variable•... receive the asSQL() message•... be used in an ARA expression•... be passed to the TUPLE_T macro
• returns the type of tuples that the query return
DEF_V(e,EMPLOYEE);e = EMPLOYEE / (DEPT == 3);string s = e.asSQL();DEF_V(e1,e[FIRST_N, LAST_N]);e1 /= (EMPNUM < 100); TUPLE_T(e)** res = new TUPLE_T(e)*[100];
11/21
Using ARA’s Syntax
#include "rat" // Global RAT declarations and macros #include "employee.h" // Primitive query objects and other
DEF_F(FULL_N); DEF_F(ID);
int main(int argc, char* argv[]) { const string s = ( (EMPLOYEE / (DEPT > 3 && SALARY < 3.14)) [FIRST_N, LAST_N,
FULL_N(cat(LAST_N, ", ", FIRST_N)), ID(EMPNUM)] ).asSQL(); // ... execute the SQL query in s using e.g., ADO. return 0; }
12/21
So Far…
ARA provides an elegant way for composing queries…
How does it do that?Do we need to change the C++ compiler ?
13/21
Template Programming•C++ templates are Turing-complete
– Can compute anything at compile-time
template <int N> struct Factorial {
enum { value = N * Factorial<N - 1>::value }; };
template <> struct Factorial<0> {
enum { value = 1 };};
// Factorial<4>::value == 24 // Factorial<0>::value == 1void foo() {
int x = Factorial<4>::value; // == 24 int y = Factorial<0>::value; // == 1
}
14/21
A Compile-time List
struct Nil { static const int length = 0; };
template<class F, class R=Nil> struct List { typedef F First; typedef R Rest;
static const int length = 1 + Rest::length;};
typedef List<int, List<short, List<double, List<float, Nil> > > > typesList;
15/21
Implementation–Take I
•Both scheme and content are compile-time constants:
•Problem:• Every query has its own type• Query objects are immutable
template<typename FIELDS, typename COND …>
class QueryObject { public: typedef FIELDS fields; typedef COND cond; …};
QueryObject< LIST(EMPNUM, DEPT, FIRST_N, LAST_N), GREATER(DEPT, 3)> q;
16/21
Implementation–Take II
•Dual query representation
• Only the result scheme is encoded at compile time• Allows: e /= (DEPT < 3);
•Problem:• Projected-out fields are not part of the type
• Cant be used in selection conditions
template<typename FIELDS >class QueryObject { public: typedef FIELDS fields; R_TREE * t;};
Stores the evaluation procedure.This is a run-time value
17/21
Implementation–Take III
•Projection Resilience
•Allows SELECT criteria with fields that were projected out
template<typename FIELDS , typename DICTIONARY>class QueryObject{ public: typedef FIELDS fields; R_TREE * t; typedef DICTIONARY dict;
... };
Saves only Active Fields that dictate the
result relation type
Saves all the FIELDs we had so far
DEF_V(e,EMPLOYEE[FIRST_N,LAST_N]);
e /= (DEPT == dept);
18/21
Type Checking
– Make sure all of E’s FIELDS exist in Q’s dictionary– Fetch E’s FIELD types from Q.
– Enables using the same field name with distinct types in different tables
– Resolve e’s type. – It must be bool
–Incorporate e’s content, e.t, into q.t– Changes q’s content but not it’s type– e has no compile time representation in the q
template<typename Q, typename E>Q operator/(Q& q, const E& e);
19/21
Type Equivalence
•RAT builds a Type for every query– This type relies on the FIELDs in the return relation
•Problem: Schema FIELD order
•Solution: __COUNTER__– a unique ID for every FIELD in order to sort and achieve
relation equivalence.
20/21
•Extend expressive power– Updates to the database– Allow GroupBy
• Challenge: nested relations
•Embedding of other little languages in C++– XML
• Challenge: recursive types
•ARARAT in Java– Require compiler modifications
•Relational Algebra calculus for collections of objects.
Further Research
21/21
Questions?