Skills Portfolio
Roderick N. Lee
Skills Portfolio
• Business Intelligence
• Data Warehousing
• Decision Analysis
• Database Development
• Perl Scripting
[email protected]
www.linkedin.com/in/RoderickNLee
(714) 893 8727 H
(714) 785 8479 M
Table of Contents
Technical Strengths
ETL
  commercial software (SSIS)
  in-house proprietary
Data Warehousing and Analysis
Business Intelligence
  reporting
  dashboards
Software Development
  Oracle PL/SQL
  Perl
Technical Strengths
10+ Years:
Certification:
ETL
10+ Years of Multi-Platform ETL experience:
• Business Objects Data Integrator
• SQL Server Integration Services (SSIS)
• In-house PL/SQL “hand coding”
ETL – Master Control Flow
Master control flow populating a small staging database (example using SSIS)
It consists of an ETL container that calls a set of related external packages through the Execute Package functionality in SSIS, followed by a series of administrative tasks: backup, database shrink, and index rebuild.
Each administrative task has its own failure e-mail notification, and a final Success task covers the entire package.
ETL – Package Container
Zoom into the package container
The process for populating the staging database demonstrates the dependencies between the different dimensional hierarchies.
For example, the timesheet load requires both the employee and project loads as prerequisites before it can execute.
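The prerequisite rule above amounts to a topological ordering of the loads. A minimal Python sketch, assuming hypothetical load names (only the timesheet dependency on employee and project comes from the slide), might look like this:

```python
from graphlib import TopologicalSorter

# Hypothetical load names. Only the rule "timesheet needs employee and
# project first" is taken from the slide text.
dependencies = {
    "employee": set(),
    "project": set(),
    "timesheet": {"employee", "project"},
}


def load_order(deps):
    """Return a load order in which every prerequisite runs first."""
    return list(TopologicalSorter(deps).static_order())


order = load_order(dependencies)
print(order)
```

In SSIS the same ordering is expressed with precedence constraints between tasks rather than in code.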
ETL – Employee Data Flow
Sample Data Flow: Employees
The employee data is sourced from a CSV flat file, and the flow checks for truncation in a new derived column, the full name (consisting of first name appended to last name).
In the event of truncation, the process writes these cases to a warning file before reintegrating them into the main flow.
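As a rough illustration of the truncation check, here is a Python sketch rather than the SSIS data flow itself. The column names, the 20-character width, the name order and separator, and the use of a list in place of the warning file are all assumptions made for the example.

```python
import csv
import io

MAX_FULL_NAME_LEN = 20  # hypothetical width of the derived column


def derive_full_names(rows, warnings):
    """Add a full_name column; record rows that would be truncated,
    then let them continue through the main flow."""
    for row in rows:
        # Assumed layout: first name, a space, then last name.
        full_name = f"{row['first_name']} {row['last_name']}"
        if len(full_name) > MAX_FULL_NAME_LEN:
            warnings.append(
                f"employee {row['employee_id']}: full name truncated: {full_name}"
            )
            full_name = full_name[:MAX_FULL_NAME_LEN]
        yield {**row, "full_name": full_name}


# Hypothetical sample input standing in for the CSV flat file.
sample = io.StringIO(
    "employee_id,first_name,last_name\n"
    "1,Ann,Lee\n"
    "2,Bartholomew,Featherstonehaugh\n"
)
warnings = []  # stands in for the warning file
employees = list(derive_full_names(csv.DictReader(sample), warnings))
```

Both rows reach the output; only the second one produces a warning entry.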
ETL – Employee Data Flow
Employee Data Flow (continued)
Following the truncation check, the data flow looks up the existing employee table to determine if a given input will be an insert or an update (or an error).
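The lookup step can be pictured as a three-way router. The sketch below is a simplified Python stand-in; the key column name and the rule that a missing key counts as an error are assumptions, since the slide does not say what qualifies as an error.

```python
def classify(incoming, existing_ids):
    """Route each input row to 'insert', 'update', or 'error'."""
    routed = {"insert": [], "update": [], "error": []}
    for row in incoming:
        key = row.get("employee_id")
        if key is None or key == "":
            routed["error"].append(row)   # assumed rule: no usable lookup key
        elif key in existing_ids:
            routed["update"].append(row)  # key already in the employee table
        else:
            routed["insert"].append(row)  # new key
    return routed


# Hypothetical data: IDs 1 and 2 already exist in the employee table.
existing = {"1", "2"}
batch = [{"employee_id": "1"}, {"employee_id": "9"}, {"employee_id": ""}]
routed = classify(batch, existing)
```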
ETL – Timesheets Control Flow
Another control flow in the same package
This one uses a loop container to read multiple timesheet inputs from the same directory before sending either a success or a failure e-mail notification.
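The loop-then-notify pattern can be sketched in Python as follows. The file pattern, the `load_file` and `notify` callables, and the message wording are hypothetical; in the package itself the notification is an e-mail task.

```python
import tempfile
from pathlib import Path


def process_directory(directory, load_file, notify):
    """Load every timesheet file in the directory, then send one notification."""
    try:
        files = sorted(Path(directory).glob("*.csv"))
        for path in files:
            load_file(path)
    except Exception as exc:
        notify(f"FAILURE: {exc}")
        return False
    notify(f"SUCCESS: loaded {len(files)} file(s)")
    return True


def bad_loader(path):
    """Hypothetical loader that always fails, to exercise the failure path."""
    raise ValueError("bad row")


with tempfile.TemporaryDirectory() as tmp:
    for name in ("week1.csv", "week2.csv"):
        Path(tmp, name).write_text("employee_id,hours\n1,8\n")

    loaded, messages = [], []
    ok = process_directory(tmp, lambda p: loaded.append(p.name), messages.append)

    failed_messages = []
    failed = process_directory(tmp, bad_loader, failed_messages.append)
```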
ETL – Slowly Changing Dimensions
CDC for Slowly Changing Dimension
Another example, from a different package, for Change Data Capture
This example uses a Slowly Changing Dimension transform to sort out inserts and updates.
Note that the source table for this data flow is the CDC table.
ETL – Manual “Hand Coding”
In addition, multiple engagements coding manual ETL or data migration solutions in various industries:
• Financial data warehouses for several international and domestic banks
• Entertainment reporting data mart for a top studio
• Clickstream data warehouse (code sample follows)
• Prototyped ETL algorithms for a streaming advertising data warehouse (final product was in Java)
• Loss prevention liability reporting data warehouse for a leading health care provider
• Trigger-based data migration to populate a retail operational data store for a Fortune 500 corporate reseller
ETL – Oracle PL/SQL
Clickstream Data Warehouse
Source is a Web traffic transactional system. Databases are not linked, so ETL is a combination of Oracle Import/Export (into a staging database) and PL/SQL Packages and Stored Procedures. This procedure calculates several daily clickstream statistics.
procedure p_clicks_daily_sum (p_day_ID in number)
is
  [...]
  cursor c_pages_per_hour is
    select min(count(click_ID))      min_cnt_clicks
         , max(count(click_ID))      max_cnt_clicks
         , min(count(distinct URL))  min_cnt_unique_page
         , max(count(distinct URL))  max_cnt_unique_page
         , sum(count(distinct URL)) / cn_hours_per_day  mean_uq_page_viewed_cnt
      from com_daily_ssn_details cdsd
     where com_ID = v_com_ID
       and click_day_ID = p_day_ID
     group by click_hour;
Cursor illustrates several calculations for the daily clickstream statistics
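The nested aggregates in the cursor (an aggregate over per-hour counts) can be easier to follow when spelled out step by step. The Python sketch below mirrors that logic on an in-memory list; the tuple layout and the value 24 for `cn_hours_per_day` are assumptions, and it expects at least one click.

```python
from collections import defaultdict

HOURS_PER_DAY = 24  # assumed value of cn_hours_per_day


def pages_per_hour_stats(clicks):
    """clicks: iterable of (click_hour, click_id, url) for one community and day."""
    clicks_by_hour = defaultdict(int)
    urls_by_hour = defaultdict(set)
    for hour, _click_id, url in clicks:
        clicks_by_hour[hour] += 1      # count(click_ID) per hour
        urls_by_hour[hour].add(url)    # distinct URL per hour
    click_counts = list(clicks_by_hour.values())
    unique_counts = [len(urls) for urls in urls_by_hour.values()]
    return {
        "min_cnt_clicks": min(click_counts),
        "max_cnt_clicks": max(click_counts),
        "min_cnt_unique_page": min(unique_counts),
        "max_cnt_unique_page": max(unique_counts),
        "mean_uq_page_viewed_cnt": sum(unique_counts) / HOURS_PER_DAY,
    }


# Hypothetical sample: three clicks at 09:00 (two distinct pages), one at 10:00.
sample_clicks = [(9, 1, "/a"), (9, 2, "/a"), (9, 3, "/b"), (10, 4, "/c")]
stats = pages_per_hour_stats(sample_clicks)
```

Note that the mean divides by the fixed number of hours in a day rather than by the number of hours that had traffic, as the cursor does.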
ETL – Oracle PL/SQL
Daily Clicks Summary
Process employs a cursor loop to load the daily clickstream facts by community ID (the top-level dimension) and time
begin
  v_cal_date := mat_stage.get_calendar_date(p_day_ID);
  for r_com in c_communities loop
    [...]
    begin
      insert into clicks_daily_sum
      values
        ( p_day_ID, r_com.com_ID, f_get_partition_key(v_cal_date),
          v_min_cnt_clicks, v_max_cnt_clicks,
          v_min_cnt_unique_page, v_max_cnt_unique_page,
          v_mean_uq_page_viewed_cnt, [more columns...], sysdate);
    exception
      when others then
        DWH.DWH_process.write_errors ( parameters );
    end;
  end loop;
end p_clicks_daily_sum;
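The key design point in the loop is the inner block: a failed insert for one community is logged and the loop moves on to the next. A rough Python equivalent using the standard `sqlite3` module is sketched below; the reduced column list, the primary key, and the list used as an error log are assumptions made for the example.

```python
import sqlite3


def load_daily_sums(conn, day_id, community_stats, errors):
    """Insert one summary row per community; log failures and keep going."""
    loaded = 0
    for com_id, stats in community_stats:
        try:
            conn.execute(
                "INSERT INTO clicks_daily_sum "
                "(day_id, com_id, min_cnt_clicks, max_cnt_clicks) "
                "VALUES (?, ?, ?, ?)",
                (day_id, com_id, stats["min"], stats["max"]),
            )
            loaded += 1
        except sqlite3.Error as exc:
            # Plays the role of the write_errors call: record and continue.
            errors.append((com_id, str(exc)))
    return loaded


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clicks_daily_sum ("
    " day_id INTEGER, com_id INTEGER,"
    " min_cnt_clicks INTEGER, max_cnt_clicks INTEGER,"
    " PRIMARY KEY (day_id, com_id))"
)

# Hypothetical input: the third entry repeats community 1 and violates the key.
communities = [
    (1, {"min": 2, "max": 9}),
    (2, {"min": 0, "max": 4}),
    (1, {"min": 5, "max": 5}),
]
errors = []
loaded = load_daily_sums(conn, 20090501, communities, errors)
row_count = conn.execute("SELECT COUNT(*) FROM clicks_daily_sum").fetchone()[0]
```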
10+ Years of Data Warehousing experience:
• Business Objects Universe Designer
• SQL Server Analysis Services (SSAS) (examples follow)
Data Warehousing and OLAP Analysis
Warehouse Implementation and Design in Oracle, SQL Server, Informix and MySQL
Analysis Services
The staging database, populated by the preceding SSIS example, serves as the data source for the OLAP cube. The model has five dimensional hierarchies and four fact tables.
Analysis Services
OLAP Cube Design
This screenshot illustrates the relationships between the five dimensional hierarchies and four fact tables
Analysis Services
Calculations and KPIs
The calculated member and KPI design tabs
Both images highlight the detail behind the KPI for Overhead as a Percentage of Total Cost
Analysis Services – KPI
The end result, using Excel as the reporting client, demonstrating Overhead as a Percentage of Total Cost by Jobs
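The KPI name suggests a simple ratio. The slides do not show the actual MDX behind the calculated member, so the Python sketch below is only an assumed reading of it, with hypothetical job figures.

```python
def overhead_pct_of_total_cost(overhead, total_cost):
    """Overhead as a percentage of total cost; None when total cost is zero."""
    if total_cost == 0:
        return None
    return 100.0 * overhead / total_cost


# Hypothetical (overhead, total cost) figures per job.
jobs = {"Job A": (250.0, 1000.0), "Job B": (0.0, 0.0)}
kpi_by_job = {
    name: overhead_pct_of_total_cost(overhead, total)
    for name, (overhead, total) in jobs.items()
}
```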
10+ Years of Business Intelligence experience:
• Business Objects WebIntelligence (WebI)
• Business Objects Dashboard Manager
• SQL Server Reporting Services (SSRS)
• PerformancePoint Dashboard Designer
Business Intelligence: Query and Reporting
(examples follow)
Report Design
Overhead Category Report (example using SSRS)
The two extra datasets calculate the previous quarter and set the default to the most recent quarter.
Note that, because this is overhead, negative values for percentage change are good (black) and positive values are bad (red).
Simple Tabular Report
Overhead Category Report (continued)
Basic report showing current and previous quarter’s overhead and the percentage change, accepting the current quarter as an input parameter.
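The report logic reduces to three small calculations: find the previous quarter, compute the percentage change, and pick the display color. A Python sketch under stated assumptions (quarters as year and number pairs, zero change shown in black, sample values invented) could be:

```python
def previous_quarter(year, quarter):
    """Return the (year, quarter) pair immediately before the given one."""
    return (year - 1, 4) if quarter == 1 else (year, quarter - 1)


def pct_change(current, previous):
    """Percentage change from previous to current; None if previous is zero."""
    if previous == 0:
        return None
    return 100.0 * (current - previous) / previous


def change_color(change):
    """Overhead rising is bad (red); falling is good (black).
    Treating zero or undefined change as black is an assumption."""
    return "red" if change is not None and change > 0 else "black"


prev = previous_quarter(2010, 1)
falling = pct_change(90.0, 100.0)
rising = pct_change(120.0, 100.0)
```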
Also shows the use of a SharePoint site collection as the distribution medium.
Dashboard Design
PerformancePoint Server Dashboard Designer
Scorecard with two KPIs, including the same Overhead Percentage as earlier
Dashboard design, showing a single filter that links to the scorecard on the left. The Financials scorecard is in the right zone and does not use the filter.
Scorecard Dashboard
The final result
Again, the distribution medium is SharePoint
Analytic Dashboard
Employee Labor Analysis
Second example showing both an analytic chart and grid on the same dashboard page.
Note the dual Y-axes that the chart uses.
SharePoint Web Part
Job Profitability
Another dual axis example, this one using Excel Services to display the chart in SharePoint and an Analysis Services Web Part filter
Software Development
10+ Years of Software and Database Development experience:
• Radio website content management system using Oracle 8i Application Server
• Perl-based Reporting and ETL system
Code Samples:
Oracle Application Server
Web Content Management
Mock-up of the primary page, which lists all items for each content area of the website
procedure magazine_list ( [parameter list] )
is
begin
  -- Piece the dynamic query together.
  v_query := v_select || v_where || v_ord_grp;
  open c_pages for v_query;
  fetch c_pages into v_resultset(v_resultset.count+1);
  while not c_pages%NOTFOUND loop
    fetch c_pages into v_resultset(v_resultset.count+1);
  end loop;
  v_row_count := DBMS_SQL.last_row_count;
  close c_pages;

  -- Build the html frame. (does not include the navigation frame)
  htp.p ('
    <html>
    <head>
    <title>Untitled Document</title>
    <meta http-equiv="Content-Type" content="text/html">
    <script language="JavaScript">
    [...]
  ');
  for i in v_first_result..least(v_last_result, v_resultset.count) loop
    htp.p('<tr>
      <td>' || v_title || '</td>
      <td class="list_item">' || v_resultset(i).ID || '</td>
      <td class="list_item">' || to_char(v_resultset(i).start_date, 'fmmm/dd/yyyy') || '</td>
      <td class="list_item">' || to_char(v_resultset(i).end_date, 'fmmm/dd/yyyy') || '</td>
      <td class="list_item">' || v_edit_pvw_del || '</td>
    </tr>');
  end loop;
end magazine_list;
Oracle Application Server
Web Content Management (continued)
procedure nav_load_and_save_content_mag ( p_graphics_path in varchar2 := cn_graphics_path )
is
begin
  htp.p('function load_content() {
  isSelected(''content'');
  final_form = document.final_page;
  contentform = top.contentFrame.document.content;
  for (i=0; i < author_list.length; i++) {
    contentform.author.options[contentform.author.options.length] = author_list[i];
  }
  preload_select(contentform.author, final_form.P_PGE_CMEM_ID.value);
  contentform.leadin.value = final_form.P_PGE_LEAD_IN.value;
  contentform.body.value = final_form.P_PGE_BODY.value;
}
Oracle Application Server
Web Content Management – Edit Content
The actual content management utility employs JavaScript to load content text and related data into static HTML subpages.
function save_content(direction) {
  final_form = document.final_page;
  contentform = top.contentFrame.document.content;
  if (get_selected(contentform.author)!="")
    document.content_change.src=''' || p_graphics_path || 'stus_yes.gif'';
  else
    document.content_change.src=''' || p_graphics_path || 'stus_no.gif'';
  final_form.P_PGE_CMEM_ID.value = get_selected(contentform.author);
  final_form.P_PGE_LEAD_IN.value = contentform.leadin.value;
  rawText = contentform.body.value;
  encodedText = "";
  if (contentform.leadin.value.length >1990) {
    alert("The lead-in text area cannot handle more than 2000 characters.\nPlease reduce its size.");
    return false;
  }
  final_form.P_PGE_BODY.value = encodedText;
  save_redirect(direction);
}
');
end;
Oracle Application Server
Web Content Management – Edit Content (continued)
sub a1 {
  print "<table border='1' bordercolor='#FFFFFF' cellspacing='0' cellpadding='2'>";
  [header items]
  print "</table>";
  […]
  my $sth3 = $dbh->prepare('
      SELECT date_id, SUM( registrations ), SUM( ath ), SUM( listeners ),
             SUM( sessions ), SUM( ads_served ), SUM( ads_missed )
        FROM a1
       WHERE MONTH(date_ID)= ? AND YEAR(date_ID)= ?
       GROUP by date_ID
       ORDER by date_ID DESC')
    or die "Couldn't prepare statement: " . $dbh->errstr;
Perl
Perl CGI Report Delivery for Web Streaming Advertising
Note the $dbh handle, which indicates use of the DBI module for database-agnostic SQL functionality.
  # Accumulate the column totals across all daily rows.
  while (@data3 = $sth3->fetchrow_array()) {
    $reg_tot        = $reg_tot + $data3[1];
    $ath_tot        = $ath_tot + $data3[2];
    $listeners_tot  = $listeners_tot + $data3[3];
    $sessions_tot   = $sessions_tot + $data3[4];
    $ads_served_tot = $ads_served_tot + $data3[5];
    $rowcount       = $rowcount + 1;
  }
  # Print the totals row.
  print "<TR>";
  print "<TD align='center' ><B>Total</B></TD>";
  print "<TD align='right' ><B>$reg_tot</B></TD>";
  print "<TD align='right' ><B>$ath_tot</B></TD>";
  print "<TD align='right' ><B>$listeners_tot</B></TD>";
  print "<TD align='right' ><B>$sessions_tot</B></TD>";
  print "<TD align='right' ><B>$ads_served_tot</B></TD>";
  print "</TR>";
  if ($sth3->rows == 0) {
    $nodata = "No Data Found in Daily Report.";
  }
  $sth3->finish;
  $dbh->disconnect;
  print "</table>";
}
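For readers less familiar with Perl DBI, the same pattern (a prepared statement with bound placeholders, followed by a running total over the fetched rows) is sketched below in Python with the standard `sqlite3` module. The reduced column set, the sample rows, and the use of `strftime` in place of the `MONTH()` and `YEAR()` functions are assumptions made so the example runs on SQLite.

```python
import sqlite3


def daily_report(conn, month, year):
    """Return per-day sums for one month plus column totals."""
    sql = """
        SELECT date_id, SUM(registrations), SUM(sessions)
          FROM a1
         WHERE strftime('%m', date_id) = ? AND strftime('%Y', date_id) = ?
         GROUP BY date_id
         ORDER BY date_id DESC
    """
    # Values are bound through placeholders, never concatenated into the SQL.
    rows = conn.execute(sql, (f"{month:02d}", str(year))).fetchall()
    totals = [sum(col) for col in zip(*(r[1:] for r in rows))] if rows else [0, 0]
    return rows, totals


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a1 (date_id TEXT, registrations INTEGER, sessions INTEGER)")
conn.executemany(
    "INSERT INTO a1 VALUES (?, ?, ?)",
    [
        ("2009-05-01", 3, 10),
        ("2009-05-01", 2, 5),
        ("2009-05-02", 1, 4),
        ("2009-06-01", 9, 9),
    ],
)
rows, totals = daily_report(conn, 5, 2009)
empty_rows, empty_totals = daily_report(conn, 1, 2009)
```

An empty result corresponds to the "No Data Found" branch in the Perl code.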
Perl
Perl CGI Report Delivery (continued)