Python for database access and Event driven programming in Python
Python in the database
Uploaded by pybcn
Transcript of Python in the database
Who Am I
Brian Sutherland
Partner at Vanguardistas LLC
Worked with PostgreSQL and Python for years
Used them to build Metro Publisher (SaaS)
What is PostgreSQL?
A non-NoSQL database… Actually an SQL database
Extremely extensible
Performant
First released 1995
Very good general purpose database
But we’re here to talk about Triggers
Code which executes inside the database process in response to events
e.g. INSERT/UPDATE/DELETE a new row
Triggers
Used for
● auditing/logging
● sending email
● validation
● denormalization/cache
● cache invalidation
● replication
Triggers
PostgreSQL allows writing triggers in a number of languages:
● C
● Java
● Javascript
● Python
● ...
Triggers
Use with caution
● Principle of least astonishment
  ○ INSERT can send email
● Transactions
  ○ Serialization Errors
  ○ Idempotency
  ○ Transaction Rollback
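The transaction caveats can be tried out in a small experiment. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, an assumption made only so that it runs without a server; the trigger is written in SQL rather than PL/Python, and the table names event and audit_log are hypothetical. It shows a trigger firing on INSERT, and the trigger's work disappearing again when the surrounding transaction rolls back.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (event_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE audit_log (event_id INTEGER, action TEXT);
    -- Runs inside the database whenever a row is inserted into event.
    CREATE TRIGGER event_audit AFTER INSERT ON event
    BEGIN
        INSERT INTO audit_log VALUES (NEW.event_id, 'insert');
    END;
""")


def count(table):
    """Return the number of rows in one of the two hypothetical tables."""
    return conn.execute(f"SELECT count(*) FROM {table}").fetchone()[0]


# The INSERT opens a transaction; the trigger writes its audit row in it.
conn.execute("INSERT INTO event (title) VALUES ('standup')")
fired_rows = count("audit_log")

# Rolling back undoes the INSERT and the trigger's audit row together.
conn.rollback()
after_rollback = count("audit_log")
```

Rows written by a trigger live and die with the transaction. An email sent from a trigger does not: it stays sent even if the transaction is later rolled back or retried, which is why idempotency and rollback appear on the caution list.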
PL/Python
● Python 2 and 3
● Basic Postgres types are converted to Python
● An “untrusted” language
● One interpreter per database session
Calendaring Example
Web application for calendaring
● Recurring events
● Read queries must be fast
High number of database reads compared to writes
Calendaring Example
Every weekday at 3PM until 1 January 2020
>>> from datetime import datetime
>>> from dateutil.rrule import *
>>> list(rrule(DAILY,
...            byweekday=[MO, TU, WE, TH, FR],
...            dtstart=datetime(2014, 11, 10, 15),
...            until=datetime(2020, 1, 1)))
[datetime.datetime(2014, 11, 10, 15, 0),
 datetime.datetime(2014, 11, 11, 15, 0),
 datetime.datetime(2014, 11, 12, 15, 0),
 …]
Calendaring Example
Naïve Implementation
● Store only the rule in the database
● On display, expand the rule using dateutil
● Render calendar in HTML
Calendaring Example
FAIL
There are 100 000 events in the database, find all events which occur between 3 and 4 PM today
Calculating………………………...
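Why the naïve approach stalls can be sketched in a few lines. The code below is only an illustration: weekday_occurrences is a hypothetical standard-library stand-in for dateutil's rrule (it handles the "every weekday" rule and nothing else), and naive_find is an assumed name for the lookup. The point is that every query has to expand every stored rule in full before it can filter by time.

```python
from datetime import datetime, timedelta


def weekday_occurrences(dtstart, until):
    """Yield each Monday-to-Friday occurrence at dtstart's time of day,
    from dtstart through until (hypothetical stand-in for rrule)."""
    current = dtstart
    while current <= until:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            yield current
        current += timedelta(days=1)


def naive_find(events, window_start, window_end):
    """Expand every rule completely on each query, then filter.
    The cost grows with the total number of occurrences of all events."""
    hits = []
    for event_id, (dtstart, until) in events.items():
        for occ in weekday_occurrences(dtstart, until):
            if window_start <= occ < window_end:
                hits.append((event_id, occ))
    return hits


# Two illustrative events: one at 3PM, one at 9AM, both every weekday.
events = {
    1: (datetime(2014, 11, 10, 15), datetime(2020, 1, 1)),
    2: (datetime(2014, 11, 10, 9), datetime(2020, 1, 1)),
}
hits = naive_find(events,
                  datetime(2014, 11, 12, 15),
                  datetime(2014, 11, 12, 16))
```

With two events this is instant; with 100 000 events, each expanding into thousands of occurrences, the same loop runs for every page view.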
Calendaring Example
Find another way, use triggers to
● Pre-calculate occurrences
● Store them in another “cache” table
● Use PostgreSQL indexes to make queries fast
Calendaring Example
Trigger on the “event” generates occurrences
Store the occurrences in an “occ” table
Thanks to indexing, this query is FAST:
SELECT * FROM occ
WHERE dtstart > X AND dtend < Y
Calendaring Example
PostgreSQL has a range type which makes things even faster:
SELECT * FROM occ
WHERE occuring && tsrange(X, Y)
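The two queries are not quite equivalent, which is worth keeping in mind when switching between them. The dtstart/dtend comparison only matches occurrences lying strictly inside the window, while && is PostgreSQL's range overlap operator and matches any occurrence sharing time with the window. A minimal Python sketch of both predicates, assuming half-open [start, end) ranges (the default bounds of tsrange(X, Y)); the function names are hypothetical:

```python
from datetime import datetime


def contained(occ_start, occ_end, x, y):
    """Mirror of: dtstart > X AND dtend < Y."""
    return occ_start > x and occ_end < y


def overlaps(occ_start, occ_end, x, y):
    """Mirror of: occuring && tsrange(X, Y), for non-empty
    half-open ranges [occ_start, occ_end) and [x, y)."""
    return occ_start < y and x < occ_end


# Query window: 3PM to 4PM on an illustrative day.
x = datetime(2014, 11, 12, 15)
y = datetime(2014, 11, 12, 16)

# An occurrence from 2:30PM to 3:30PM straddles the start of the window.
straddling = (datetime(2014, 11, 12, 14, 30), datetime(2014, 11, 12, 15, 30))
# An occurrence from 2PM to 3PM ends exactly where the window begins.
adjacent = (datetime(2014, 11, 12, 14), datetime(2014, 11, 12, 15))
```

The straddling occurrence is found by the overlap test but not by the containment test; the adjacent one is found by neither, because the upper bound of a half-open range is excluded.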
Calendaring Example
Creating the trigger
CREATE LANGUAGE "plpython2u";
CREATE FUNCTION event_occs () RETURNS trigger AS $$
from my.plpy.generate_event_occs import generate_event_occs
generate_event_occs(TD["new"])
return "OK"
$$ LANGUAGE plpython2u;
CREATE TRIGGER event_gen_occs BEFORE UPDATE OR INSERT ON
event FOR EACH ROW EXECUTE PROCEDURE event_occs();
Calendaring Example
Much simplified function:
import plpy

def generate_event_occs(new):
    # Drop any occurrences previously cached for this event
    d = plpy.prepare("DELETE FROM occ WHERE event_id=$1", ["int"])
    plpy.execute(d, [new["event_id"]])
    # Re-insert one row per occurrence; rrule(new) stands for expanding the event's rule
    i = plpy.prepare("INSERT INTO occ VALUES ($1,$2)", ["int", "tsrange"])
    for period in rrule(new):
        plpy.execute(i, [new["event_id"], period])
JSON Validation example
Much simplified function:
CREATE FUNCTION check_foo() RETURNS trigger AS $$
from json import loads
foo = loads(TD["new"]["foo"])
if "type" not in foo or foo["type"] not in ["a", "b"]:
    raise Exception("Invalid Type")
return "OK"
$$ LANGUAGE plpython2u;
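The same check can be written as a plain Python function and tried outside the database. This is a sketch rather than the talk's code: the standalone signature, the ALLOWED_TYPES name, the isinstance guard for non-object JSON, and the use of ValueError are all assumptions added for illustration.

```python
import json

# Accepted values for the "type" key, as in the trigger above.
ALLOWED_TYPES = ("a", "b")


def check_foo(raw):
    """Parse a JSON string and reject it unless it is an object
    whose "type" is one of ALLOWED_TYPES. Returns the parsed value."""
    foo = json.loads(raw)
    # Guard added for this sketch: JSON arrays, numbers etc. are rejected too.
    if not isinstance(foo, dict) or foo.get("type") not in ALLOWED_TYPES:
        raise ValueError("Invalid Type")
    return foo


def is_valid(raw):
    """Convenience wrapper: True if check_foo accepts the document."""
    try:
        check_foo(raw)
    except ValueError:
        return False
    return True
```

Inside a trigger, an exception raised from the function body aborts the statement, so the invalid row is never written.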
Best Practices
Immediately import and call a Python function
CREATE FUNCTION event_occs () RETURNS trigger AS $$
from my.plpy.generate_event_occs import generate_event_occs
generate_event_occs(TD["new"])
return "OK"
$$ LANGUAGE plpython2u;
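One likely benefit of keeping the trigger body this thin is that the real logic lives in an ordinary module and can be exercised without a database. The sketch below illustrates that idea under stated assumptions: FakePlpy is a hypothetical stub that imitates only plpy.prepare and plpy.execute, and passing plpy and an expand callable as parameters is a choice made here for testability (the slides import plpy at module level and call rrule directly).

```python
class FakePlpy:
    """Hypothetical stand-in for the plpy module: records executed
    statements instead of running them."""

    def __init__(self):
        self.executed = []

    def prepare(self, query, argtypes):
        # The real plpy.prepare returns a plan object; the query text
        # is enough for this stub.
        return query

    def execute(self, plan, args):
        self.executed.append((plan, args))


def generate_event_occs(new, plpy, expand):
    """Rebuild the cached occurrences for one event row.
    `expand` turns the row into an iterable of periods (assumption)."""
    d = plpy.prepare("DELETE FROM occ WHERE event_id=$1", ["int"])
    plpy.execute(d, [new["event_id"]])
    i = plpy.prepare("INSERT INTO occ VALUES ($1,$2)", ["int", "tsrange"])
    for period in expand(new):
        plpy.execute(i, [new["event_id"], period])


# Exercise the function with two made-up periods.
fake = FakePlpy()
generate_event_occs({"event_id": 7}, fake, lambda new: ["p1", "p2"])
```

After the call, fake.executed holds one DELETE followed by one INSERT per period, which is easy to assert on in a unit test.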
Best Practices
Import time can kill performance, as modules are re-imported for every database connection
The ugly
● Except for some very basic types, Python 2 gets fed byte strings in the “database encoding”
● A little better in Python 3 which gets unicode
● Debugging is interesting... (try running PDB inside the PostgreSQL process)
The REALLY ugly (Fixed?)
ERROR: Exception: oops
CONTEXT: Traceback (most recent call last):
  PL/Python function "generate_event_occs", line 3, in <module>
    return generate_event_occs(event, rrule, SD)
  PL/Python function "generate_event_occs", line 256, in generate_event_occs
  PL/Python function "generate_event_occs", line 73, in generate_occurrences
  PL/Python function "generate_event_occs", line 97, in generate
  PL/Python function "generate_event_occs", line 424, in _handle_byday
  PL/Python function "generate_event_occs", line 206, in resolve
PL/Python function "generate_event_occs"
PL/pgSQL function content.generate_event_occs() line 7 at assignment
SQL statement "UPDATE content.event SET dtstart_occs=NULL WHERE uuid=ev_uuid"
PL/pgSQL function content.event_rrule_set_occ_bounds() line 12 at SQL statement
Conclusions
VERY useful for complex code in the database if you already program in Python
Python has a lot of libraries which can be used
It has warts, but is a lifesaver
Questions?
Documentation:
http://www.postgresql.org/docs/devel/static/plpython.html