Custom ETL Software
Tracy Steven Brown
Custom Extract, Transform & Load (ETL) Software
WinForms/C# ETL Biomedical Informatics Software
The following project highlights an example of custom
biomedical informatics extraction, transformation &
loading (ETL) software designed and developed by Tracy
Steven Brown in his Appointed Professional position
within the prestigious University of Arizona Center for
Biomedical Informatics and Biostatistics under the
leadership of Don Saner, Assistant Chief Knowledge
Officer.
This ETL software, codenamed Project Casanova, extracts
lengthy and complex data from EPIC Clarity, an extensive
hospital electronic medical record system. The extracted
data is transformed by C# with T-SQL and loaded into the
University of Arizona Translational Sciences REDCap
Research Database through the REDCap HTTP API using
JavaScript Object Notation (JSON). REDCap (Research
Electronic Data Capture) is a mature, secure web application for building and managing online surveys and
databases.
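As a rough illustration of the load step, the sketch below builds the form-encoded body of a REDCap "Import Records" request in Python rather than the project's C#. The token, the record, and its field names are hypothetical placeholders; only the request parameters (`token`, `content`, `format`, `type`, `data`) come from the REDCap API itself.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_redcap_import(token, records):
    """Build the form-encoded POST body for a REDCap 'Import Records' call."""
    fields = {
        "token": token,          # project API token (placeholder here)
        "content": "record",     # import/export records
        "format": "json",        # the payload in 'data' is JSON
        "type": "flat",          # one row per record
        "data": json.dumps(records),
    }
    return urlencode(fields).encode("utf-8")


# Hypothetical record; real field names depend on the REDCap project.
records = [{"record_id": "1001", "admit_date": "2015-03-14", "serum_sodium": "139"}]
body = build_redcap_import("PLACEHOLDER_TOKEN", records)

# Decode the body again to show what the server would receive.
decoded = parse_qs(body.decode("utf-8"))
```

The body would then be sent as an HTTP POST to the REDCap API endpoint of the hosting institution; the sketch stops short of the network call so it can run anywhere.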
Project Casanova uses a set of Microsoft SQL Server User-Defined Functions (UDFs) to extract data points based on patient
medical record numbers and hospital admission dates. The application user interface combines custom-drawn objects, rendered
with the Windows GDI+ graphics library, with traditional third-party Telerik controls. Lengthy database lookups are kept off
the main Windows UI thread by offloading ETL operations to an asynchronous background thread. The background thread calls a
progress method on the main UI thread to update the progress meter while the asynchronous operation runs.
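The pattern described above can be sketched in Python with a worker thread and a queue. This is an analogy, not the project's code: the function names are hypothetical, the tasks are empty stand-ins for database lookups, and the source does not say which WinForms mechanism the application actually uses to marshal progress calls onto the UI thread.

```python
import queue
import threading


def etl_worker(tasks, progress_queue):
    """Run each task off the main thread and report progress after each one."""
    total = len(tasks)
    for done, task in enumerate(tasks, start=1):
        task()  # stand-in for a lengthy database lookup
        progress_queue.put((done, total))
    progress_queue.put(None)  # sentinel: all work finished


def run_with_progress(tasks, on_progress):
    """Start the worker, then apply progress updates on the calling thread."""
    progress_queue = queue.Queue()
    worker = threading.Thread(target=etl_worker, args=(tasks, progress_queue))
    worker.start()
    while True:
        update = progress_queue.get()
        if update is None:
            break
        on_progress(*update)  # runs on the caller's thread, like a UI update
    worker.join()


percentages = []
run_with_progress(
    tasks=[lambda: None] * 4,
    on_progress=lambda done, total: percentages.append(100 * done // total),
)
```

A real user interface would poll the queue from its event loop instead of blocking on it; the blocking loop keeps the sketch short.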
The background ETL worker thread supports cancellation and
gracefully terminates asynchronous operations. Cancelled
operations and error conditions repaint the main application user
interface to notify the user of the condition. Successful
operations repaint lines and shading in green, error conditions
repaint in red, amber highlights events, and purple marks
dry runs (practice runs).
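A minimal Python sketch of cooperative cancellation with the color scheme above follows. The status names and the `run_etl` function are hypothetical. The source gives no color for a cancelled run, so the sketch leaves that status unmapped.

```python
import threading

# Colors as described in the text; "cancelled" has no stated color.
STATUS_COLORS = {
    "success": "green",
    "error": "red",
    "event": "amber",
    "dry_run": "purple",
}


def run_etl(steps, cancel_requested, dry_run=False):
    """Run steps in order and return the name of the final status."""
    try:
        for step in steps:
            # Check between steps so a cancel request stops work cleanly.
            if cancel_requested.is_set():
                return "cancelled"
            step()
    except Exception:
        return "error"
    return "dry_run" if dry_run else "success"


cancel = threading.Event()
completed = []
steps = [
    lambda: completed.append("extract"),
    cancel.set,  # simulates the user pressing Cancel mid-run
    lambda: completed.append("load"),
]
status = run_etl(steps, cancel)
```

Here the run stops before the "load" step because the cancel flag is checked between steps, which is what lets the worker terminate gracefully instead of being killed mid-operation.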
The activity log, located at the bottom of the screen, updates
asynchronously while the ETL worker thread steps through a
complex set of data manipulation tasks. Text is color-coded to help
the user identify conditions associated with application
progress.
Database Development & ETL Style
There are several data capture instruments within the REDCap
Electronic Data Capture System, and each instrument or form has
a corresponding Microsoft SQL User-Defined Function (UDF).
The WinForms/C# source code iterates through each of the
data instrument SQL functions, visually updating a progress
meter and reporting progress within the application log. The
application checks that each instrument has an incomplete status
before extracting its data from EPIC Clarity. After data has been
transformed and loaded, the application sets the instrument
status to “unverified” to ensure that a sanity check is
performed by clinical practitioners.
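The instrument loop might look roughly like the Python sketch below. It assumes that instruments whose status is no longer incomplete are skipped, which is one reading of the text. The instrument names and the `extract` callback are hypothetical. The `<instrument>_complete` field and its codes (0 incomplete, 1 unverified, 2 complete) follow REDCap's standard form-status convention.

```python
INCOMPLETE, UNVERIFIED, COMPLETE = "0", "1", "2"  # REDCap form-status codes


def process_instruments(record, instruments, extract):
    """Fill each incomplete instrument and mark it unverified."""
    log = []
    total = len(instruments)
    for position, name in enumerate(instruments, start=1):
        status_field = f"{name}_complete"
        if record.get(status_field, INCOMPLETE) != INCOMPLETE:
            # Already unverified or complete: leave reviewed data untouched.
            log.append(f"[{position}/{total}] {name}: skipped")
            continue
        record.update(extract(name))       # stand-in for the UDF call
        record[status_field] = UNVERIFIED  # flag for clinical review
        log.append(f"[{position}/{total}] {name}: loaded, marked unverified")
    return log


record = {"record_id": "1001", "labs_complete": "0", "vitals_complete": "2"}
log = process_instruments(
    record,
    instruments=["labs", "vitals"],
    extract=lambda name: {f"{name}_source": "clarity"},
)
```

Marking loaded instruments as unverified rather than complete keeps automatically loaded data visibly separate from data a clinician has checked.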
The UDF syntactic style is based on inner and outer common
table expressions (CTEs), an alternative technique to the use of
inner-views, although several inner-views are still used within
the common table expressions to cleanly extract desired data
points or to perform necessary calculations on extracted data.
The example here is a very small subset of the larger SQL
codebase showcasing an OUTER APPLY with an embedded CROSS
APPLY wrapped within an inner-view to extract several
important clinical data points. The code shown here also
highlights Tracy’s technique for extracting laboratory values
that most closely match a given event time, in this case a
blood draw for the University of Arizona Biorepository. For
instance, the ‘DISTANCE’ variable in this example represents
the absolute time difference between a secondary event and the
indicated primary event, whether the secondary event occurs
before or after it. Ordering by DISTANCE and taking the first
row with OFFSET FETCH extracts the closest value.
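The closest-match idea can be tried out with Python's built-in SQLite module. This is a stand-in, not the project's T-SQL: the table, columns, and sample values are invented, `julianday` replaces whatever date arithmetic the original uses to compute DISTANCE, and SQLite's `LIMIT 1 OFFSET 0` plays the role of the `OFFSET ... FETCH` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lab_result (mrn TEXT, result_time TEXT, value REAL)")
conn.executemany(
    "INSERT INTO lab_result VALUES (?, ?, ?)",
    [
        ("A1", "2015-03-14 06:00:00", 4.1),  # 4 hours before the draw
        ("A1", "2015-03-14 09:40:00", 4.6),  # 20 minutes before the draw
        ("A1", "2015-03-14 13:00:00", 5.0),  # 3 hours after the draw
    ],
)


def closest_lab_value(conn, mrn, event_time):
    """Return the lab value whose result time is nearest to event_time."""
    row = conn.execute(
        """
        SELECT value,
               ABS(julianday(result_time) - julianday(?)) AS distance
        FROM lab_result
        WHERE mrn = ?
        ORDER BY distance
        LIMIT 1 OFFSET 0
        """,
        (event_time, mrn),
    ).fetchone()
    return None if row is None else row[0]


value = closest_lab_value(conn, "A1", "2015-03-14 10:00:00")
missing = closest_lab_value(conn, "ZZ", "2015-03-14 10:00:00")
```

Because the distance is an absolute value, results before and after the blood draw compete on equal terms, and the sort brings the nearest one to the top.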
Software Development – In Design
The development process for Project Casanova
was very organic in nature, beginning with a couple
of notebook pages with sketches of the potential
product.
While the final product doesn’t match the sketch
exactly, the sketch does illustrate how
Tracy began the project and his initial thoughts and
plans.
The sketches show the planned C#/WinForms user
interface as well as a C#/WinForms User Control.
Notes are scattered around the illustration as
reminders and ideas for the project.