Relational Database Access with Python ‘sans’ ORM
Mark Rees, CTO
Century Software (M) Sdn. Bhd.
Your Current Relational Database Access Style?
# Django ORM
>>> from ip2country.models import Ip2Country
>>> Ip2Country.objects.all()
[<Ip2Country: Ip2Country object>, <Ip2Country: Ip2Country object>, '...(remaining elements truncated)...']
>>> sgp = Ip2Country.objects.filter(assigned__year=2012)\
...     .filter(countrycode2='SG')
>>> sgp[0].ipfrom
1729580032.0
Your Current Relational Database Access Style?
# SQLAlchemy ORM
>>> from sqlalchemy import create_engine, extract
>>> from sqlalchemy.orm import sessionmaker
>>> from models import Ip2Country
>>> engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2country')
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> all_data = session.query(Ip2Country).all()
>>> sgp = session.query(Ip2Country).\
...     filter(extract('year', Ip2Country.assigned) == 2012).\
...     filter(Ip2Country.countrycode2 == 'SG')
>>> print sgp[0].ipfrom
1729580032.0
SQL Relational Database Access
SELECT * FROM ip2country;

"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
1729522688;1729523711;"apnic";"2011-08-05";"CN";"CHN";"China"
1729523712;1729524735;"apnic";"2011-08-05";"CN";"CHN";"China"
. . .

SELECT * FROM ip2country
WHERE date_part('year', assigned) = 2012
AND countrycode2 = 'SG';

"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
1729580032;1729581055;"apnic";"2012-01-16";"SG";"SGP";"Singapore"
1729941504;1729942527;"apnic";"2012-01-10";"SG";"SGP";"Singapore"
. . .

SELECT ipfrom FROM ip2country
WHERE date_part('year', assigned) = 2012
AND countrycode2 = 'SG';

"ipfrom"
1729580032
1729941504
. . .
Python + SQL == Python DB-API 2.0
• The Python standard for a consistent interface to relational databases is the Python DB-API (PEP 249)
• The majority of Python database interfaces adhere to this standard
Python DB-API UML Diagram
Python DB-API Connection Object
Access the database via the connection object
• Use the connect constructor to create a connection with the database
    conn = psycopg2.connect(parameters…)
• Create a cursor via the connection
    cur = conn.cursor()
• Transaction management (implicit begin)
    conn.commit()
    conn.rollback()
• Close the connection (will roll back the current transaction)
    conn.close()
• Check module capabilities via globals
    psycopg2.apilevel
    psycopg2.threadsafety
    psycopg2.paramstyle
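The connection lifecycle above can be tried without a database server. The following is a minimal sketch, assuming Python 3 and the standard-library sqlite3 module standing in for psycopg2; the in-memory database and the table name demo are illustrative assumptions, not part of the original slides.

```python
import sqlite3

# Module-level globals required by DB-API 2.0 (PEP 249).
print(sqlite3.apilevel)      # DB-API version implemented by the module
print(sqlite3.paramstyle)    # placeholder style the module expects
print(sqlite3.threadsafety)  # thread-safety level; the value depends on the build

# Assumption: an in-memory SQLite database instead of a PostgreSQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (code TEXT)")
conn.commit()

# A transaction begins implicitly with the first data-modifying statement.
cur.execute("INSERT INTO demo (code) VALUES (?)", ("SG",))
conn.rollback()  # discard the uncommitted insert
cur.execute("SELECT COUNT(*) FROM demo")
rows_after_rollback = cur.fetchone()[0]

cur.execute("INSERT INTO demo (code) VALUES (?)", ("SG",))
conn.commit()  # make the insert permanent
cur.execute("SELECT COUNT(*) FROM demo")
rows_after_commit = cur.fetchone()[0]

cur.close()
conn.close()
```

With psycopg2 the same calls apply; only the connect() arguments and the placeholder style differ.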
Python DB-API Cursor Object
A cursor object is used to represent a database cursor, which is used to manage the context of fetch operations.
• Cursors created from the same connection are not isolated
    cur = conn.cursor()
    cur2 = conn.cursor()
• Cursor methods
    cur.execute(operation, parameters)
    cur.executemany(op, seq_of_parameters)
    cur.fetchone()
    cur.fetchmany([size=cursor.arraysize])
    cur.fetchall()
    cur.close()
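A short sketch of the cursor methods listed above, again assuming Python 3 and sqlite3 with an in-memory database in place of PostgreSQL. The two-column table is a cut-down, hypothetical version of ip2country; the sample values come from the query output shown earlier. It also shows that two cursors on one connection are not isolated from each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite in place of PostgreSQL
cur = conn.cursor()
cur2 = conn.cursor()

cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")

# executemany() runs one statement once per parameter tuple.
sample = [(1729522688, "CN"), (1729580032, "SG"), (1729941504, "SG")]
cur.executemany("INSERT INTO ip2country VALUES (?, ?)", sample)

# cur2 shares the connection, so it sees cur's uncommitted rows.
cur2.execute("SELECT COUNT(*) FROM ip2country")
seen_by_cur2 = cur2.fetchone()[0]

cur.execute("SELECT ipfrom, countrycode2 FROM ip2country ORDER BY ipfrom")
first = cur.fetchone()   # one row, or None when the result set is exhausted
pair = cur.fetchmany(2)  # up to two more rows
rest = cur.fetchall()    # whatever is left (nothing in this case)

conn.rollback()
cur.close()
cur2.close()
conn.close()
```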
Python DB-API Cursor Object
• Optional cursor methods
    cur.scroll(value[, mode='relative'])
    cur.next()
    cur.callproc(procname[, parameters])
    cur.__iter__()
• Results of an operation
    cur.description
    cur.rowcount
    cur.lastrowid
• DB adaptor specific “proprietary” cursor methods
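The result attributes and cursor iteration can be inspected as follows. This is a sketch under the same assumptions as before (Python 3, sqlite3, an in-memory database, a hypothetical two-column table); other adaptors may report rowcount and lastrowid differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")

cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))
inserted = cur.rowcount  # rows affected by the last INSERT
row_id = cur.lastrowid   # rowid of the row just inserted

cur.execute("SELECT ipfrom, countrycode2 FROM ip2country")
# description holds one 7-item sequence per column; item 0 is the column name.
columns = [col[0] for col in cur.description]

# Iterating over the cursor relies on the optional iteration protocol.
rows = [row for row in cur]

conn.close()
```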
Python DB-API Parameter Styles
Allows you to keep SQL separate from parameters
Improves performance & security
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
From http://initd.org/psycopg/docs/usage.html#query-parameters
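To see why the warning matters, the sketch below runs the same lookup twice: once with string interpolation and once with a bound parameter. It assumes Python 3 and sqlite3 with an in-memory database; the hostile input string is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?)",
    [(1729522688, "CN"), (1729580032, "SG")],
)

# Hypothetical hostile input that tries to widen the WHERE clause.
user_input = "SG' OR '1'='1"

# UNSAFE: interpolation splices the input into the SQL text itself.
unsafe_sql = "SELECT ipfrom FROM ip2country WHERE countrycode2 = '%s'" % user_input
cur.execute(unsafe_sql)
unsafe_rows = cur.fetchall()  # the injected OR clause matches every row

# SAFE: the value travels separately from the SQL text.
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", (user_input,))
safe_rows = cur.fetchall()    # no country code equals the hostile string

conn.close()
```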
Python DB-API Parameter Styles
Global paramstyle gives supported style for the adaptor
qmark      Question mark style            WHERE countrycode2 = ?
numeric    Numeric positional style       WHERE countrycode2 = :1
named      Named style                    WHERE countrycode2 = :code
format     ANSI C printf format style     WHERE countrycode2 = %s
pyformat   Python format style            WHERE countrycode2 = %(name)s
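As an illustration of two of these styles, sqlite3 reports qmark as its paramstyle and also accepts the named style. The sketch below assumes Python 3, an in-memory database, and a hypothetical two-column table; psycopg2 would use the format or pyformat placeholders instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?)",
    [(1729522688, "CN"), (1729580032, "SG"), (1729941504, "SG")],
)

# qmark: positional values supplied as a sequence.
cur.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = ? ORDER BY ipfrom",
    ("SG",),
)
qmark_rows = cur.fetchall()

# named: values supplied as a mapping keyed by placeholder name.
cur.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = :code ORDER BY ipfrom",
    {"code": "SG"},
)
named_rows = cur.fetchall()

conn.close()
```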
Python + SQL: INSERT
import csv, datetime, psycopg2

conn = psycopg2.connect("dbname=ip2country user=ip2country_rw password=secret")
cur = conn.cursor()
with open("IpToCountry.csv", "rb") as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print row
            if row[0][0] != "#":
                row[3] = datetime.datetime.utcfromtimestamp(float(row[3]))
                cur.execute("""INSERT INTO ip2country(
                    ipfrom, ipto, registry, assigned,
                    countrycode2, countrycode3, countryname)
                    VALUES (%s, %s, %s, %s, %s, %s, %s)""", row)
    except:
        conn.rollback()
    else:
        conn.commit()
    finally:
        cur.close()
        conn.close()
Python + SQL: SELECT
# Find ipv4 address ranges assigned to Singapore
import psycopg2, socket, struct

def num_to_dotted_quad(n):
    """convert long int to dotted quad string
    http://code.activestate.com/recipes/66517/"""
    return socket.inet_ntoa(struct.pack('!L',n))

conn = psycopg2.connect("dbname=ip2country user=ip2country_rw password=secret")
cur = conn.cursor()
cur.execute("""SELECT * FROM ip2country
               WHERE countrycode2 = 'SG'
               ORDER BY ipfrom""")
for row in cur:
    print "%s - %s" % (num_to_dotted_quad(int(row[0])),
                       num_to_dotted_quad(int(row[1])))
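The integer-to-address conversion used in this example can be checked on its own. The sketch below assumes Python 3; dotted_quad_to_num is a hypothetical inverse helper added for illustration, and the standard-library ipaddress module is shown as an alternative that the slides do not use. The input value is an ipfrom from the Singapore sample data.

```python
import ipaddress
import socket
import struct


def num_to_dotted_quad(n):
    """Convert an unsigned 32-bit integer to a dotted-quad string."""
    return socket.inet_ntoa(struct.pack("!L", n))


def dotted_quad_to_num(ip):
    """Inverse conversion: dotted-quad string to an unsigned 32-bit integer."""
    return struct.unpack("!L", socket.inet_aton(ip))[0]


start = num_to_dotted_quad(1728830464)
via_ipaddress = str(ipaddress.IPv4Address(1728830464))
round_trip = dotted_quad_to_num(start)
print(start, via_ipaddress, round_trip)
```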
SQLite
• sqlite3
  • CPython 2.5 & 3
  • DB-API 2.0
  • Part of CPython distribution since 2.5
PostgreSQL
• psycopg
  • CPython 2 & 3
  • DB-API 2.0, level 2 thread safe
  • Appears to be most popular
  • http://initd.org/psycopg/
• py-postgresql
  • CPython 3
  • DB-API 2.0
  • Written in Python with optional C optimizations
  • pg_python - console
  • http://python.projects.postgresql.org/
PostgreSQL
• PyGreSQL
  • CPython 2.3+
  • Classic & DB-API 2.0 interfaces
  • http://www.pygresql.org/
  • Last release 2009
• pyPgSQL
  • CPython 2
  • Classic & DB-API 2.0 interfaces
  • http://www.pygresql.org/
  • Last release 2006
PostgreSQL
• pypq
  • CPython 2.7 & pypy 1.7+
  • Uses ctypes
  • DB-API 2.0 interface
  • psycopg2-like extension API
  • https://bitbucket.org/descent/pypq
• psycopg2ct
  • CPython 2.6+ & pypy 1.6+
  • Uses ctypes
  • DB-API 2.0 interface
  • psycopg2 compat layer
  • http://github.com/mvantellingen/psycopg2-ctypes
MySQL
• MySQL-python
  • CPython 2.3+
  • DB-API 2.0 interface
  • http://sourceforge.net/projects/mysql-python/
• PyMySQL
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • http://www.pymysql.org/
• MySQL-Connector
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • https://launchpad.net/myconnpy
Other “Enterprise” Databases
• cx_Oracle
  • CPython 2 & 3
  • DB-API 2.0 interface
  • http://cx-oracle.sourceforge.net/
• informixdb
  • CPython 2
  • DB-API 2.0 interface
  • http://informixdb.sourceforge.net/
  • Last release 2007
• ibm-db
  • CPython 2
  • DB-API 2.0 for DB2 & Informix
  • http://code.google.com/p/ibm-db/
ODBC
• mxODBC
  • CPython 2.3+
  • DB-API 2.0 interfaces
  • http://www.egenix.com/products/python/mxODBC/doc
  • Commercial product
• PyODBC
  • CPython 2 & 3
  • DB-API 2.0 interfaces with extensions
  • http://code.google.com/p/pyodbc/
• ODBC interfaces not limited to Windows thanks to iODBC and unixODBC
Jython + SQL
• zxJDBC
  • DB-API 2.0
  • Written in Java using the JDBC API, so it can utilize JDBC drivers
  • Support for connection pools and JNDI lookup
  • Included with the standard Jython installation http://www.jython.org/
• jyjdbc
  • DB-API 2.0 compliant
  • Written in Python/Jython, so it can utilize JDBC drivers
  • Decimal data type support
  • http://code.google.com/p/jyjdbc/
IronPython + SQL
• adodbapi
  • IronPython 2+
  • Also works with CPython 2.3+ with pywin32
  • http://adodbapi.sourceforge.net/
Gerald, the half a schema
import gerald
s1 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2country')
s2 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2countryv4')
print s1.schema['ip2country'].compare(s2.schema['ip2country'])
DIFF: Definition of assigned is different
DIFF: Column countryname not in ip2country
DIFF: Definition of registry is different
DIFF: Column countrycode3 not in ip2country
DIFF: Definition of countrycode2 is different
• Database schema toolkit
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://halfcooked.com/code/gerald/
SQLPython
$ sqlpython --postgresql ip2country ip2country_rw
Password:
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG';
...
1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore
551 rows selected.
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG'\j
[...
{"ipfrom": 1728830464.0, "ipto": 1728830719.0, "registry": "apnic", "assigned": "2011-11-02", "countrycode2": "SG", "countrycode3": "SGP", "countryname": "Singapore"}]
• A command-line interface to relational databases
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://packages.python.org/sqlpython/
SQLPython, batteries included
0:ip2country_rw@ip2country> select * from ip2country where countrycode2 ='SG';
...
1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore
551 rows selected.
0:ip2country_rw@ip2country> py
Python 2.6.6 (r266:84292, May 20 2011, 16:42:25)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

    py <command>: Executes a Python command.
    py: Enters interactive Python mode.
    End with `Ctrl-D` (Unix) / `Ctrl-Z` (Windows), `quit()`, `exit()`.
    Past SELECT results are exposed as list `r`;
        most recent resultset is `r[-1]`.
    SQL bind, substitution variables are exposed as `binds`, `substs`.
    Run python code from external files with ``run("filename.py")``
>>> r[-1][-1]
(1728830464.0, 1728830719.0, 'apnic', datetime.date(2011, 11, 2), 'SG', 'SGP', 'Singapore')
>>> import socket, struct
>>> def num_to_dotted_quad(n):
...     return socket.inet_ntoa(struct.pack('!L',n))
...
>>> num_to_dotted_quad(int(r[-1][-1].ipfrom))
'103.11.220.0'
SpringPython – Database Templates
# Find ipv4 address ranges assigned to Singapore
# using SpringPython DatabaseTemplate & DictionaryRowMapper
from springpython.database.core import *
from springpython.database.factory import *

conn_factory = PgdbConnectionFactory(
    user="ip2country_rw", password="secret",
    host="localhost", database="ip2country")
dt = DatabaseTemplate(conn_factory)

results = dt.query(
    "SELECT * FROM ip2country WHERE countrycode2=%s",
    ("SG",), DictionaryRowMapper())

for row in results:
    print "%s - %s" % (num_to_dotted_quad(int(row['ipfrom'])),
                       num_to_dotted_quad(int(row['ipto'])))
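Rows keyed by column name, as in the example above, can also be produced with nothing but the DB-API by reading cursor.description. The sketch below is not SpringPython's implementation; rows_as_dicts is a hypothetical helper, and it assumes Python 3 with sqlite3 and an in-memory, three-column stand-in for the ip2country table.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Yield each remaining row of an executed cursor as a dict keyed by column name."""
    names = [col[0] for col in cursor.description]
    for row in cursor:
        yield dict(zip(names, row))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)")
cur.execute("INSERT INTO ip2country VALUES (?, ?, ?)", (1729580032, 1729581055, "SG"))

cur.execute("SELECT * FROM ip2country WHERE countrycode2 = ?", ("SG",))
results = list(rows_as_dicts(cur))
conn.close()
```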
Attributions
DB-API 2.0 PEP: http://www.python.org/dev/peps/pep-0249/
Travis Spencer’s DB-API UML Diagram: http://travisspencer.com/
Andrew Kuchling's introduction to the DB-API: http://www.amk.ca/python/writing/DB-API.html
Andy Todd’s OSDC paper: http://halfcooked.com/presentations/osdc2006/python_databases.html
Source of csv data used in examples from WebNet77, licensed under GPLv3: http://software77.net/geo-ip/
Contact Details
Mark Rees
mark at centurysoftware dot com dot my
+Mark Rees
@hexdump42
hex-dump.blogspot.com