2011 Educational Forum "Schedule At-a-Glance"

Sunday, November 6th

Time Event

8:00am - 5:00pm Pre-Forum training seminars

4:00pm - 6:00pm Registration opens

5:00pm - 6:00pm Welcome Mixer

Monday, November 7th

Time Event

7:00am - 5:00pm Registration

7:00am - 8:00am Continental Breakfast available

8:00am - 8:30am Opening session

8:30am - 10:00am Presentations

10:00am - 10:15am Break (Refreshments provided for attendees)

10:15am - 12:15pm Presentations

12:15pm - 1:30pm Lunch (provided for attendees) and Keynote Address by Rick Langston

1:30pm - 3:00pm Presentations

3:00pm - 3:15pm Break (Refreshments provided for attendees)

3:15pm - 5:00pm Presentations

5:00pm - 6:00pm SAS Sponsored Mixer in the SAS Demonstration Room

Tuesday, November 8th

Time Event

7:00am - 8:00am Continental Breakfast available

8:00am - 10:00am Presentations

10:00am - 10:15am Break (Refreshments provided for attendees)

10:15am - 12:15pm Presentations

12:15pm - 1:30pm Lunch (provided for attendees)

1:30pm - 3:00pm Presentations

3:00pm - 3:15pm Break (Refreshments provided for attendees)

3:15pm - 3:45pm Closing Session

Wednesday, November 9th

Time Event

8:00am - 5:00pm Post-Forum training seminars

SAS is a registered trademark of SAS Institute Inc.

SCSUG Educational Forum 2011 November 6-9, 2011

Ft. Worth, TX

Schedule of Events

11/6/2011 Sunday - Pre-Forum Training

8:00 AM - 5:00 PM Advanced Reporting and Analysis Techniques for the SAS® Power User: It's Not Just About The PROCs! - Art Carpenter (Caprock & Llano)

8:00 AM - 12:00 PM PROC SQL Programming: The Basics and Beyond - Kirk Lafler (Grape Creek)

1:00 PM - 5:00 PM Exploring SAS® Hash Programming Techniques - Kirk Lafler (Grape Creek)

1:00 PM - 5:00 PM JMP® Basics for Exploring the World of Data Discovery - Charles Shipp (Pheasant Ridge)

11/9/2011 Wednesday - Post-Forum Training

8:00 AM - 12:00 PM Output Delivery System: The Basics and Beyond - Kirk Lafler (Pheasant Ridge)

8:00 AM - 12:00 PM Innovative Tips and Techniques: Doing More in the DATA Step - Art Carpenter (Caprock & Llano)

1:00 PM - 5:00 PM Advanced SAS® Programming Techniques - Kirk Lafler (Pheasant Ridge)

1:00 PM - 5:00 PM Getting Started with SAS Macro Language Basics - Art Carpenter (Caprock & Llano)

Session Schedule

Section - Room - Section Chair

Foundations & Fundamentals - Driftwood - Kevin Davidson
Beyond the Basics - Grape Creek - Kenny Bissett
Potpourri - Caprock & Llano - John Taylor
Information Reporting & Statistics - Pheasant Ridge - Keith Cranford
Application Development - Spicewood - Minh Duong
Hands-on-Workshops - Sister Creek - Debbie Buck

11/7/2011 7:00 AM Continental Breakfast (Outside Demo Room - Brushy Creek/Dry Comal Creek)

8:00 AM Opening Session (Fall Creek & Flat Creek)

8:30 AM

The Way of the Semicolon or Three Things I Wish I Had Known Before I Ever Keyed One - Patricia Hettinger

Win With SAS®, JMP®, and Special Interest User Groups - Charles Shipp

What's Hot, What's Not - Skills for SAS® Professionals - Kirk Paul Lafler

Getting the Most Out of Your Conference - Andy Kuligowski

9:00 AM

Choosing the Road Less Traveled: Performing Similar Tasks with either SAS DATA Step Processing or with Base SAS® Procedures - Kathryn McLawhorn

Becoming a Better Programmer with SAS® Enterprise Guide® 4.3 - Michael Walters

You Could Be a SAS® Nerd If You . . . - Kirk Paul Lafler

Creating Complex Reports - Cynthia Zender

PROC TABULATE: Getting Started - Art Carpenter (100 minutes)

9:30 AM

Using SAS help to Validate Attributes - Sadia Abdullah

10:00 AM Morning Break (Inside Demo Room - Brushy Creek/Dry Comal Creek)

10:15 AM

How to Read Data into SAS® with the DATA Step - Toby Dunn and Kirk Lafler

The FILENAME Statement: Interacting with the World Outside of SAS® - Chris Schacherer

The Top 10 Head Scratchers: SAS® Log Messages That Prompt a Call to SAS Technical Support - Kim Wilson

An Introductory Tutorial on Mixed Models - Funda Gunes

10:45 AM

11:15 AM

An Introduction to SAS® Hash Programming Techniques - Kirk Paul Lafler

SAS for Performance: An Introductory Tutorial - Michael Welch

11:45 AM

12:15 PM Lunch and Keynote Address "What's New In SAS 9.3" - Rick Langston (Provided to all registrants -- in Fall Creek & Flat Creek)

11/7/2011 1:30 PM

Ethnicity and Race: When Your Output Isn't What You Expected - Philamer Atienza

At Random - Sampling with Proc Surveyselect - Patricia Hettinger

Statistical Comparison of Relative Proportions of Bacterial Genetic Sequences - Jose F. Garcia-Mazcorro, et al.

Intro to Longitudinal Data: A Grad Student "How-To" Paper - Elisa Priest

2:00 PM

Looking Beneath the Surface of Sorting - Andrew Kuligowski

Top 10 Best SAS Programming Practices - Sharu Shankar

CDISC ADaM Application: One-Record-Per-Subject Data That Doesn't Belong in ADSL - Sandra Minjoe

Prediction of Diabetes - Pardha Repalli

The Greatest Hits: ODS Essentials Every User Should Know - Cynthia Zender

A Hands-on Tour Inside DATA Step and PROC SQL Programming - Kirk Lafler (100 minutes)

2:30 PM

Kass Adjustments in Decision Trees on Binary vs. Continuous - Manoj Immadi

3:00 PM Afternoon Break (Inside Demo Room - Brushy Creek/Dry Comal Creek)

3:15 PM

The MEANS/SUMMARY Procedure: Doing More - Art Carpenter

Staying Relevant in the Ever-Changing Pharmaceutical Industry - Yang & Hoffman

Use of Decision, Cut-off and SAS Code Node in SAS Enterprise Guide While Scoring to Adjust Prior Probabilities and Prediction Cut-off for Separate Sampling - Yogen Shah

Incorporating DataFlux dfPower Studio 8.2 into a Graduate-Level Information Quality Tools Class - Yinle Zhou

3:45 PM

A Well Designed Process and QC Tool for Integrated Summary of Safety Reports - Chen

A SAS Macro Tool for Selecting Differentially Expressed Genes in Microarray Data - Huanying Qin

4:15 PM

Basic SAS® PROCedures for Producing Quick Results - Kirk Lafler

Reading and Processing Mystery Data Sets - Jimmy DeFoor

Evaluation of Promotional Campaigns in the Utilities Industry Using a Transfer Function Time-Series Model - Fujiang Wen

SAS® Information Studio - Map Your Way Through the Data - Alejandro Farias

4:45 PM

11/8/2011 7:00 AM Continental Breakfast (Outside Demo Room - Brushy Creek/Dry Comal Creek)

8:00 AM

Unveiling the Power of Cascading Style Sheets (CSS) in ODS - Kevin Smith

You Can't Get There From Here If You Don't Know Where Here Is. Improving the SAS Enterprise Guide Data Characterization Task - Patricia Hettinger

Don’t Gamble with Your Output: How to Use Microsoft Formats with ODS - Cynthia Zender

Advice to Health Services Researchers: Be Cautious Using the 'Where' Statement in SAS Programs for Nationally Representative Complex Survey Data - Hemalkuma Mehta

SAS-Implementations Supporting Satellite, Aircraft, and Drone-based Remote Sensing Endeavors and Their Influences in the Classroom - Cecil R. Hallum

Using INFILE and INPUT Statements to Introduce External Data into the SAS® System - Andrew T. Kuligowski (100 minutes)

8:30 AM

Proc Surveyfreq: Why Do a Three Way Table in SAS When We Want Two Way Table Information? - Hemalkuma Mehta

9:00 AM

Understanding the Anatomy of a SAS® Deployment: What's in My Server Soup? - Donna Bennett

They're Closing Down the Office in Kalamazoo and You've Been Tapped to Run Their In-house "Production" Reporting System, OH MY! - David Cherry Welch

Top Ten SAS® Sites for Programmers: A Review - Kirk Paul Lafler

An Introductory Look at the Situational Context Inherent in the RBI Using Two Modeling Approaches - Ryan Sides

Tracking and Reporting Account Referral Activity using Hash tables and SAS BI - James L Beaver

9:30 AM

When It's Not Random Chance: Creating Propensity Scores Using SAS EG - Josie Brunner

Exploring trends in topics via Text Mining SUGI/Global Forum proceedings abstracts - Zubair Shaik

10:00 AM Morning Break (Inside Demo Room - Brushy Creek/Dry Comal Creek)

10:15 AM

Proc Tabulate: Getting Started - Art Carpenter

SAS/STAT® 9.3 - Funda Gunes

Connect with SAS® Professionals Around the World with LinkedIn and sasCommunity.org - Charles Edwin Shipp and Kirk Paul Lafler

A Simulation Study for Power Analysis in a Longitudinal Study Using SAS - Lingineni & Su

The easy way to include XML formatted data into your SAS reports - Mary Grace Crissey

Practically Perfect Presentations - Cynthia Zender (100 minutes)

10:45 AM

Using Proc Logistic, SAS Macros and ODS Output to Evaluate ... - Alexandros Vidras

11:15 AM

Mastering Non-Standard Data Sources with SAS: Using Patterns, Parsing, and Text to Handle Difficult Files - Georgeanna N. Glezen

Consulting: Critical Success Factors - Kirk Paul Lafler

Take a Fresh Look at SAS® Enterprise Guide®: From point-and-click ad hocs to robust enterprise solutions - Chris Schacherer

11:45 AM

11/8/2011 12:15 PM Lunch (Provided to all registrants -- in Fall Creek & Flat Creek)

1:30 PM

Type Less, Do More: Have SAS do the typing for you - Jeanina Worden

Can't Decide Whether to Use a DATA Step or PROC SQL? - Poling

Assigning a User-defined Macro to a Function Key - Kirk Lafler

“Let’s Get SAS® To Do It!” Getting Data from SUDAAN® to SAS® to EXCEL® - Anna Vincent

Using a Free Microsoft tool to help manage your EBI SAS server - Dan W Strickland

2:00 PM

Protecting Macros and Macro Variables: It Is All About Control - Art Carpenter

Finding Oracle® Table Metadata: When PROC CONTENTS Is Not Enough - Poling

Benefits of sasCommunity.org for JMP® Coders - Charles Edwin Shipp and Kirk Paul Lafler

2:30 PM

Traffic Lighting: The Next Generation - VanBuskirk and Harper

Best Practices - Clean House to Avoid Hangovers - Kirk Lafler

3:00 PM Afternoon Break (Outside Demo Room - Brushy Creek/Dry Comal Creek)

3:15 PM Closing Session (Fall Creek & Flat Creek)

Meeting Room Map

SAS Presentations

11/7/2011

9 to 9:30 am Super Demo: Sandwich Your SAS dataset to Excel Pivot Tables - Sharu Shankar (Demo Room)

9 to 10 am Becoming a Better Programmer with SAS® Enterprise Guide® 4.3 - Michael Walters (Grape Creek)

9 to 10 am Choosing the Road Less Traveled: Performing Similar Tasks with either SAS DATA Step Processing or with Base SAS® Procedures - Kathryn McLawhorn (Driftwood)

9 to Noon Half-day Seminar: Creating Complex Reports - Cynthia Zender (Spicewood)

10 to 10:30 am Super Demo: New Features in PROC FORMAT for SAS 9.3 - Rick Langston (Demo Room)

10:15 to 11:15 am The Top 10 Head Scratchers: SAS® Log Messages That Prompt a Call to SAS Technical Support - Kim Wilson (Caprock & Llano)

10:15 am to Noon Statistical Tutorial: An Introductory Tutorial on Mixed Models - Funda Gunes (Pheasant Ridge)

11 to 11:30 am Super Demo: SAS 9.3 Deployment: A Sure Bet! - Donna Bennett (Demo Room)

11:15 am to Noon SAS for Performance: An Introductory Tutorial - Michael Patrick Welch (Caprock & Llano)

Noon to 1:30 pm Keynote Address - Rick Langston (Fall Creek & Flat Creek)

2 to 2:30 pm Super Demo: ODS Statistical Graphics in SAS 9.3 - Funda Gunes (Demo Room)

2 to 3 pm The Greatest Hits: ODS Essentials Every User Should Know - Cynthia Zender (Spicewood)

2 to 5 pm Half-day Seminar: Top 10 Best SAS Programming Practices - Sharu Shankar (Grape Creek)

3 to 3:30 pm Super Demo: So, What is JMP®?: An Introduction to Interactive Analytics - Jami Hampton (Demo Room)

4 to 4:30 pm Super Demo: New Features in PROC FORMAT for SAS 9.3 - Rick Langston (Demo Room)

11/8/2011

8 to 8:30 am Super Demo: ITRM - SAS IT Resource Management New Features - Michael Patrick Welch (Demo Room)

8 to 9 am Don’t Gamble with Your Output: How to Use Microsoft Formats with ODS - Cynthia Zender (Caprock & Llano)

8 to 9 am Unveiling the Power of Cascading Style Sheets (CSS) in ODS - Kevin Smith (Driftwood)

9 to 9:30 am Super Demo: Fitting Bayesian Random-Effects Models Using PROC MCMC - Funda Gunes (Demo Room)

9 to 10 am Understanding the Anatomy of a SAS® Deployment: What's in My Server Soup? - Donna Bennett (Driftwood)

10 to 10:30 am Super Demo: Sandwich Your SAS dataset to Excel Pivot Tables - Sharu Shankar (Demo Room)

10:15 to 11 am On Deck: SAS/STAT® 9.3 - Funda Gunes (Grape Creek)

10:15 to Noon Hands-on Workshop: Practically Perfect Presentations - Cynthia Zender (Sister Creek)

11 to 11:30 am Super Demo: Top Ten Steps to Prepare for a Migration - Donna Bennett (Demo Room)

Hands-on Workshop Schedule

11/7/2011

PROC TABULATE: Getting Started

Art Carpenter, California Occidental Consultants

9:00 am - 10:40 am

Although PROC TABULATE has been a part of Base SAS® since early version 6, this powerful analytical and reporting procedure is very underutilized. TABULATE is different; its statement structure is unlike that of any other procedure. Because the programmer who wishes to learn the procedure must essentially learn a new programming language, one with a radically different statement structure than elsewhere within SAS, many do not make the effort. The basic statements will be introduced, and more importantly the introduction will provide a strategy for learning the statement structure. The statement structure relies on building blocks that can be identified and learned individually and in concert with others. Learn how these building blocks form the structure of the statement, how they fit together, and how they are used to design and create the final report.

A Hands-on Tour Inside DATA Step and PROC SQL Programming

Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California

2:00 pm - 3:40 pm

Should the DATA step or SQL procedure be used to perform certain programming tasks? This hands-on workshop contrasts DATA step and PROC SQL programming techniques, including conditional logic constructs (IF-THEN-ELSE, SELECT-WHEN, and PROC SQL CASE expressions) and the techniques for constructing effective merges and joins. Attendees explore examples that contrast DATA step and PROC SQL approaches to conditional logic scenarios, one-to-one match-merges and match-joins, and an assortment of inner and outer joins.

11/8/2011

Using INFILE and INPUT Statements to Introduce External Data into the SAS® System

Andrew T. Kuligowski

8:00 am - 9:40 am

The SAS® System has numerous capabilities to store, analyze, report, and present data. However, those features are useless unless that data is stored in, or can be accessed by, the SAS System. This presentation is designed to review the INFILE and INPUT statements. It has been set up as a series of examples, each building on the other, rather than a mere recitation of the options as documented in the manual. These examples will include various data sources, including DATALINES, sequential files, and CSV files.

Practically Perfect Presentations

Cynthia L. Zender, SAS Institute, Inc., Cary, NC

10:15 am - 11:55 am

PROC REPORT is a powerful reporting procedure, whose output can be "practically perfect" when you add ODS STYLE= overrides to your PROC REPORT code. This hands-on workshop will feature several PROC REPORT programs that produce default output for ODS HTML, RTF and PDF destinations. Workshop attendees will learn how to modify the defaults to change elements of PROC REPORT output, such as HEADER cells, DATA cells, SUMMARY cells and LINE output using ODS STYLE= overrides. In addition, attendees will learn how to apply conditional formatting at the column or cell level and at the row level using PROC FORMAT techniques and CALL DEFINE techniques. Other topics include: table attributes that control interior table lines and table borders, use of logos in output and producing "Page x of y" page numbering.

Application Development

Creating Complex Reports

Cynthia Zender - SAS Institute

This 3-4 hour seminar is based on the SAS Global Forum paper of the same name (http://www2.sas.com/proceedings/forum2008/173-2008.pdf). This seminar is for intermediate SAS programmers. In the seminar, we will investigate how eight (8) complex reports were produced with SAS. All the code that produced the reports will be covered in detail. All report output is produced using ODS (rather than LISTING) output. The reports to be covered include three versions of a standard demographic report, a color-banded report produced with PROC TABULATE, a sparkline report that uses special fonts (Bissantz SparkFonts), several graph examples, and several unique report ordering examples. Procedures/Topics to be covered include: REPORT, TABULATE, FORMAT, MEANS, FREQ, Macro processing and DATA _NULL_ programming (as used to produce the reports). Refer to the SAS Global Forum paper to see the actual reports which will be discussed in detail.

Intro to Longitudinal Data: A Grad Student "How-To" Paper

E.L. Priest - University of North Texas School of Public Health and Baylor Health Care System

A.W. Collinsworth - Tulane University and Baylor Health Care System

Grad students learn the basics of SAS programming in class or on their own. Although students may deal with longitudinal data in class, the lessons focus on statistical procedures and the datasets are usually ready for analysis. However, longitudinal data may be organized in many complex structures, especially if it was collected in a relational database. Once students begin working with “real” longitudinal data, they quickly realize that manipulating the data so it can be analyzed is its own time-consuming challenge. In the real world of messy data, we often spend more time preparing the data than performing the analysis. Students need tools that can help them survive the challenges of working with longitudinal data. These challenges include identifying records, counting repeat observations, performing calculations across records, and restructuring repeating data from multiple observations to a single observation. This paper will use real world examples from grad students to demonstrate useful functions, FIRST. and LAST. variables, and how to transform datasets using arrays, DATA step programming, and PROC TRANSPOSE. This paper is the fifth in the "Grad Student How-To" series and gives graduate students useful tools for working with longitudinal data.

The Greatest Hits: ODS Essentials Every User Should Know

Cynthia Zender - SAS Institute

Ever discover that there’s an option or destination or feature that has you singing its praises because the feature boosted your reports to the next level? This presentation covers some of the essential features and options of ODS that every user needs to know to be productive. This presentation will show you concrete code examples of the ODS “Greatest Hits”. Come to this presentation and learn some of the essential reasons why ODS and Base SAS rock!

Incorporating DataFlux dfPower Studio 8.2 into a Graduate-Level Information Quality Tools Class

Yinle Zhou* and John R. Talbur - University of Arkansas at Little Rock

The University of Arkansas at Little Rock (UALR) currently offers the only graduate degree program in information quality in the United States. SAS DataFlux is a founding sponsor of the program and also continues to support the program through an academic license for its SAS DataFlux dfPower Studio product. The presentation describes how DataFlux dfPower Studio 8.2 has been incorporated into the Information Quality tools course in a way that not only helps students to understand and practice data quality techniques, but also gives students an introduction to data governance and master data management. The presentation also includes a description of how the laboratory exercises given in the course follow the DataFlux “Five Steps to More Valuable Enterprise Data” methodology. It describes how each step is introduced first in class by a lecture from the course instructor, followed by an in-class software demonstration by the laboratory instructor. Students are given assignments to further develop their knowledge and to practice the techniques with the software. During the laboratory sessions, students become familiar with the basic operations and are able to build workflows to solve their assignments. An example is given where some students were even able to code a q-Gram Tetrahedral Ratio for approximate string matching and were able to add it to their workflow as a Java plug-in. The presentation also discusses the “Data Challenge” team project that supplements the regular laboratory exercises, and how DataFlux dfPower Studio is used by the teams to solve the data challenge in an iterative fashion.

SAS® Information Studio - Map Your Way Through the Data

Alejandro Farias - Texas Parks and Wildlife

This reference document can serve as a summary instructional tool for SAS® Information Studio and is written to assist those responsible for providing data consumers with access to data, such as an information architect. Topics covered include:

• Selecting Tables
• Table Relationships
• Selecting Data Items
• Organizing Data Items
• Creating a Custom Category or Calculated Data Item
• Single and Combination Filters
• Prompts
• Test Queries
• Resource Replacement/Moving/Saving Information Maps

In the simplest terms, SAS® Information Maps enable data consumers to access data. Information maps can be utilized by several SAS® products, including but not limited to Enterprise Guide, Add-In for Microsoft Office, Web Report Studio and Information Delivery Portal. Data consumers are not required to know or even understand SQL or the structure of the underlying data source. An information architect can utilize predefined business logic or calculations, filters, and prompts to aid the data consumer in querying data. By simplifying the process of data accessibility, data consumers can focus on analyzing data output rather than spending time learning how to access, modify or select data for analysis.

SAS-Implementations Supporting Satellite, Aircraft, and Drone-based Remote Sensing Endeavors and Their Influences in the Classroom

Cecil R. Hallum - Sam Houston State University

SAS has been a critically significant “partner” over a career spanning 40 years for this researcher. This presentation summarizes key SAS applications in satellite, aircraft and drone-based remote sensing endeavors (beginning in the early 70’s when SAS was first implemented at NASA/Johnson Space Center). The coverage includes recent multivariate strategies implemented in SAS geared toward finding missing bodies in digital imagery collected from drone flights as well as current research oriented toward improving the accuracies and speeds of such capabilities. Discussion of the impact of this research in the classroom for educational purposes at the university and high school levels is emphasized as well.

Tracking and Reporting Account Referral Activity using Hash tables and SAS BI

James L. Beaver* and Tobin Scroggins - Farm Bureau Bank

One of the tasks of the analysis division at the bank is to keep track of new account referrals made by bank representatives based upon location, sales territory and manager. This is made more difficult because over time representatives may change their sales territory, no longer be active, report to different managers or more than one manager, or their reporting entities may change over time. The reporting requirements include being able to report on all activity based upon current sales territory and manager as well as to report on activity based upon sales territory and manager at the time the referral was made. Reports by sales territory needed to cover all referrals in that territory over a period of time by all representatives as well as only those made by currently active representatives. To handle these requirements, a slowly changing dimension table was created as part of a data warehouse. This table is maintained using the SAS hash object to reduce processing time. Reports are produced using SAS BI tools including OLAP cubes and Web Report Studio. This paper demonstrates the use of the SAS hash object to maintain the table and provides examples of reporting techniques.

Exploring trends in topics via Text Mining SUGI/Global Forum proceedings abstracts

Zubair Shaik* and Dr. Goutam Chakraborty - Oklahoma State University

Many organizations across the world have already realized the benefits of text mining to derive valuable insights from unstructured data. While text mining has been mainly used for information retrieval and text categorization, in recent years text mining is also being used for discovering trends in textual data. Given a set of documents with a time stamp, text mining can be used to identify trends of different topics that exist in the text and how they change over time. We apply Text Mining using SAS® Text Miner 4.3 to discover trends in the usage of SAS tools in various industries via analyzing all 8,429 abstracts published in SUGI/SAS Global Forum from 1976 to 2011. Results of our analysis clearly show a varying trend in the representation of various industries in the conference proceedings from decade to decade. We also observed a significant difference in the association of key concepts related to statistics or modeling during the four decades. We show how the %TMFILTER macro combined with PERL regular expressions can be used to extract required sections (such as the abstract) of text from a large corpus of similar documents. Our approach can be followed to analyze papers published in any conference provided the conference papers are accessible in common formats such as .doc, .pdf, .txt, etc.

The easy way to include XML formatted data into your SAS reports

Mary Grace Crissey - Pearson

Originally designed to meet the challenges of large-scale electronic publishing, XML is playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere. Many heterogeneous information systems have chosen XML as their preferred method of data exchange, especially in pharmaceutical and medical environments, with usage growing in the financial and educational domains. With the stand-alone utility tool, SAS XML Mapper 9.21, we can unlock the mystery of the XML “foreign” data to “see” the hidden metadata structure. I will show you how easy it is to install (and where to find the FREE GUI) in this short tutorial on how to make sense of the tags and XPaths embedded as XML values. With XML Mapper, we can display your XML values visually in a hierarchical tree structure and produce the schemas and syntax necessary to feed into the SAS libname engine. Examples from my educational testing assessment reporting project for the State of Wyoming will be presented. This talk presents a drag and drop way of exploring ALL your data, whether it arrives as flat files, MS Excel spreadsheets, mainframe files, SAS data, or eXtensible Markup Language.

Take a Fresh Look at SAS® Enterprise Guide®: From point-and-click ad hocs to robust enterprise solutions

Chris Schacherer - Clinical Data Management Systems, LLC

Early versions of SAS Enterprise Guide (EG) met with only lukewarm acceptance among many SAS programmers. As EG has matured, however, it has proven to be a powerful tool not only for end-users less familiar with SAS programming constructs, but also for experienced SAS programmers performing complex ad hoc analyses and building enterprise class solutions. Still, many experienced SAS programmers fail to add EG to their SAS toolkit. They face the barriers of an unfamiliar interface, new nomenclature, and uncertainty that the benefits of using EG outweigh the time spent mastering it. Especially for this group (but also for analysts new to SAS), the present work attempts to orient new EG users to the interface and nomenclature while teaching them how to achieve common data management and analytic tasks they perform with ease in SAS. In addition, EG concepts and techniques that focus on using EG as a development environment for producing end-user analytic solutions are described.

Using a Free Microsoft tool to help manage your EBI SAS server

Dan W. Strickland - Texas Parks and Wildlife

It has been said that you get what you pay for. In this instance the software is free but the benefit is great. This paper will describe how Texas Parks and Wildlife uses the free utility, Microsoft Process Explorer, to look into what processes are actually running on their BI SAS server. Is your server slow? Look to see what processes are currently using your processors. Track workspace server jobs and see their current status. You can even pause or kill their processing. You will also be able to link workspace jobs to the files on your work directory to keep the work directory free from unusable files. This paper will take you from download to configuration to usage of this free Microsoft software.

Beyond the Basics

Win With SAS®, JMP®, and Special Interest User Groups

Charles Edwin Shipp - JMP 2 Consulting, Inc.

Kirk Paul Lafler - Software Intelligence Corporation

Have you considered an in-house group for SAS®, JMP®, or special interests? We discuss how to start and maintain an in-house group. Benefits include leadership opportunity, peer-to-peer interaction, tutorials, collaboration of users and also departments, a focal point for proper requests, and getting to know other users and providers. This presentation discusses the differences in corporate cultures and provides examples of successful school, company, and government user groups. We then summarize "key" critical success factors.

Becoming a Better Programmer with SAS Enterprise Guide 4.3

Michael Walters - SAS Institute

Both existing and new users of SAS are turning to SAS Enterprise Guide to write and run their code. Long-time users are accustomed to typing all their code into the Program Editor window and simply hitting the Submit key. New users do not have this same set of expectations and are more willing to point and click on occasion. But the truth is becoming clear: the winning programmer will be the one who has the expertise to create the best of both worlds, either coding or clicking, depending upon which is more efficient for a given task. SAS Enterprise Guide 4.3 contains new functionality that can help anyone become a better programmer. These pages address the all-important question: when is it appropriate to code, and when to click? The aim here is to expose new users, as well as those familiar with SAS, to tips and best practices that will allow them to return to the office as better programmers.

The FILENAME Statement: Interacting with the World Outside of SAS®

Chris Schacherer - Clinical Data Management Systems, LLC

The FILENAME statement has a very simple purpose: to specify the fileref (or file reference) that serves as the link to an external file or device. The statement itself does not process any data, specify the format or shape of a dataset, or directly produce output of any type, yet this simple statement is an invaluable SAS® construct that allows SAS programs to interact with the world outside of SAS. Through specification of the appropriate device type, the FILENAME statement allows SAS to symbolically refer to external disk files for the purpose of reading and writing data, interact with FTP servers to read and write remote files, send e-mail messages, and gather data from external programs, including the local operating system and remote web services. The current work explores these uses of the FILENAME statement and provides examples of how you can use the different device types to perform a variety of data management tasks.

An Introduction to SAS® Hash Programming Techniques

Kirk Paul Lafler - Software Intelligence Corporation

Beginning in Version 9, SAS software supports a DATA step programming technique known as hash that provides for faster table lookup, search, merge/join, and sort operations. This presentation introduces what a hash object is, how it works, and the syntax required. Essential programming techniques will be illustrated to sort data and to search memory-resident data using a simple key to find a single value, as well as more complex programming techniques that use a composite key to search for multiple values.

At Random - Sampling with Proc Surveyselect

Patricia Hettinger - Independent Consultant

Are you still sampling in this very common way? Read your source data, assign a random number, sort the data and then take every nth one? Proc surveyselect is a newer method for sampling data. This paper covers the basic features of this proc as well as a comparison with the aforementioned random number assignment.

Top 10 Best Practices in Base SAS® Coding

Sharu Shankar – SAS Institute

With so many techniques to accomplish the same task in SAS, how do you choose between them? We'll look at some benchmarks run on common coding techniques to help you determine which technique to use when.

You Can’t Get There From Here If You Don’t Know Where Here Is. Improving the SAS Enterprise Guide Data Characterization Task

Patricia Hettinger - Independent Consultant

SAS Enterprise Guide has many useful built-in tasks. The data characterization task gives some useful information but has several drawbacks. One is that it will run frequencies on all character data regardless of length or number of distinct values. This can result in some variables being dropped from the output due to too many distinct values. It also results in frequencies not being run for numeric values at all. Another is that minimum, maximum, number of missing values and number of non-missing values will be calculated only for numeric variables when this information would be useful for any variable. A third issue is the likelihood of your system hanging when attempting to analyze large data sets with many variables. This paper details how you can overcome these obstacles and how to incorporate your profile results into a useful mapping document.

They're Closing Down the Office in Kalamazoo and You’ve Been Tapped to Run Their In-house “Production” Reporting System, OH MY!

David Cherry Welch - Citi

The author's experience in running a home-grown reporting system for a major corporation is described as well as these features of the system:

• Setting the SAS Environment Using autoexec.sas and Command Line Switches
• Controlling Program Start Times with the SAS Data Step
• Creating and Maintaining Report Distribution Lists Using SAS Formats
• Modularizing Reports Using the SAS Macro Language and %include

The benefits of this ad hoc system will be compared to other alternatives. The author will discuss the lessons learned from this assignment and suggestions for making projects easily supportable for others that may follow.

On Deck: SAS/STAT 9.3

Funda Gunes – SAS Institute

SAS/STAT 9.3, coming soon to a site near you, delivers numerous enhancements to the statistical software. The PHREG procedure supports frailty models for incorporating random effects in Cox regression and the MCMC procedure provides a RANDOM statement to facilitate fitting Bayesian models with random effects. The NLIN procedure has been updated, and the MI procedure offers additional flexibility by providing a fully conditional specification method. The new SURVEYPHREG and HPMIXED procedures are also outfitted with additional capabilities. This talk reviews the highlights of the 9.2 and 9.22 releases of SAS/STAT software and then describes important 9.3 enhancements with practical illustrations.

Mastering Non-Standard Data Sources with SAS: Using Patterns, Parsing, and Text to Handle Difficult Files

Georgeanna N. Glezen - Independent Consultant

The availability of many new data and file sources (web, transaction logs, free-form reports) challenges our ability to extract and process information that can be used by SAS. Learn how to identify patterns and utilize SAS programming to parse and capture non-standard data. We will walk through some examples and checklists on how to identify appropriate processing and learn how to successfully handle complex file layouts.

Can’t Decide Whether to Use a DATA Step or PROC SQL? You Can Have It Both Ways with the SQL Function!

Jeremy W. Poling - B&W Y-12 L.L.C.

Have you ever thought that it would be nice if you could execute a PROC SQL SELECT statement from within a DATA step? Well, now you can! This paper describes how to create an SQL function using the FCMP procedure and the RUN_MACRO function. The SQL function accepts a SELECT statement as its only argument. By using the SQL function, you now have the ability to integrate the DATA step and the SQL procedure.

Finding Oracle® Table Metadata: When PROC CONTENTS Is Not Enough

Jeremy W. Poling - B&W Y-12 L.L.C.

Complex Oracle® databases often contain hundreds of linked tables. For SAS/ACCESS® Interface to Oracle software users who are unfamiliar with a database, finding the data they need and extracting it efficiently can be a daunting task. For this reason, a tool that extracts and summarizes database metadata can be invaluable. This paper describes how Oracle table metadata can be extracted from the Oracle data dictionary using the SAS/ACCESS LIBNAME statement in conjunction with traditional SAS® DATA step programming techniques. For a given Oracle table, we discuss how to identify the number of observations and variables in the table, comments, variable names and attributes, constraints, indexes, and linked tables. A macro is presented which can be used to extract all the aforementioned metadata for an Oracle table and produce an HTML report. Once stored as an autocall macro, the macro can be used to quickly identify helpful information about an Oracle table that cannot be seen from the output of PROC CONTENTS. This paper assumes that the reader has a basic understanding of DATA step programming, the macro language, and the SAS/ACCESS LIBNAME statement. A basic understanding of relational database management system concepts is also helpful, but not required.

Traffic Lighting: The Next Generation

J. VanBuskirk and J. Harper - Baylor Health Care System

Traffic lighting is a tool intended to let a reader quickly evaluate data and, at a glance, sort out good versus bad performance. As the name implies, traditional traffic lighting generally separates data into three categories highlighted with red, yellow, and green based on performance. However, if all the cells in a table are boldly colored in primary colors, the reader is unable to achieve their goal of easily sorting between good and bad results. In addition, colorblind users will not even be able to distinguish between the colors. In this paper we will present the visual design changes made to our output and the SAS techniques behind them. We sought to reduce the amount of decoration in our tables and focus on what was important: the data! We employed a more subtle use of cell shading and borders to effectively draw the reader's eye to the results that most need their attention. Typically, traffic lighting is done using PROC FORMAT; however, our data cells contained text strings, which required a more complex use of style options, COMPUTE blocks, flag variables, and macros to implement. All of this is quite doable in SAS with ODS, and the result was a much more effective table delivered to our partners.

Foundations & Fundamentals

The Way of the Semicolon or Three Things I Wish I Had Known Before I Ever Keyed One

Patricia Hettinger - Independent Consultant

Learning SAS or teaching it to someone else can be very difficult. The author has found that understanding three main aspects of SAS is very helpful. These are, of course, the semicolon, the physical nature of a SAS data set and the two major data types. The intended audience is those who are new to SAS or new to teaching it.

Choosing the Road Less Traveled: Performing Similar Tasks with either SAS DATA Step Processing or with Base SAS® Procedures

Kathryn McLawhorn – SAS Institute

When you are seeking a solution that requires SAS code, do you choose the path that is marked with DATA step programming or do you prefer the path that applies Base SAS procedures? Programmers often tend to use just one path, based on their programming skills and personal preferences. In many situations, you can use either DATA step logic or procedure code to arrive at the same solution.

To help broaden your horizons, this paper provides examples of common situations in which using either method leads to the same result. While this paper compares coding methods, it does not promote one particular method nor examine differences in efficiency. Whether you are a loyal DATA step coder or a devoted proponent of procedure syntax, this paper offers some alternative techniques to help diversify your programming knowledge and skills.

How to Read Data into SAS® with the DATA Step

Toby Dunn
Kirk Paul Lafler - Software Intelligence Corporation

The SAS® System has all the tools users need to read data from a variety of external sources. This has been, perhaps, one of the most important and powerful features since its introduction in the mid-1970s. The cornerstone of this power begins with the INFILE and INPUT statements, the use of a single- and double-trailing @ sign, and the ability to read data using a predictable form or pattern. This paper will provide insights into the INFILE statement and the various styles of INPUT statements, and provide numerous examples of how data can be read into SAS with the DATA step. We will show how to use the features of the INFILE statement along with the inherent functionality of the DATA step to read not only well-formed external files but also the extreme cases, such as reading all files in a directory and reading data that is scattered over multiple lines.

Ethnicity and Race: When Your Output Isn’t What You Expected

Philamer Atienza, MS - Alcon Laboratories, Inc.

In SAS, when a classification variable is used to group observations with the same values and a formatted value is used for grouping data, unexpected results may come out of the procedure. If there is more than one unformatted value used for several distinct categorizations but with the same format label, SAS uses the lowest unformatted value to create the output. Understanding the behavior of SAS when storing the unformatted values will help avoid potential mistakes in using formats and nested classification variables. This paper examines two scenarios when a variable for both ethnicity and race is used in Proc Tabulate to create an output data set: (1) with and (2) without the use of a format.

Looking Beneath the Surface of Sorting

Andrew T. Kuligowski

Many things that appear to be simple turn out to be a mask for various complexities. For example, as we all learned early in school, a simple drop of pond water reveals a complete and complex ecosystem when viewed under a microscope. A single snowflake contains a delicate crystalline pattern. Similarly, the decision to use data in a sorted order can conceal an unexpectedly involved series of processing and decisions. This presentation will examine multiple facets of the process of sorting data, starting with the most basic use of PROC SORT and progressing into options that can be used to extend its flexibility. It will progress to look at some potential uses of sorted data, and contrast them with alternatives that do not require sorted data. For example, we will compare the use of the BY statement vs. the CLASS statement in certain PROCs, as well as investigate alternatives to the MERGE statement to combine multiple datasets together.

The MEANS/SUMMARY Procedure: Doing More

Art Carpenter - CALOXY

The MEANS/SUMMARY procedure is a workhorse for most data analysts. It is used to create tables of summary statistics as well as complex summary data sets. The user has a great many options which can be used to customize what the procedure is to produce. Unfortunately most analysts rely on only a few of the simpler basic ways of setting up the PROC step, never realizing that a number of less commonly used options and statements exist that can greatly simplify the procedure code, the analysis steps, and the resulting output. This tutorial introduces a number of important and useful options and statements that can provide the analyst with much needed tools. Some of these tools are new, others have application beyond MEANS/SUMMARY, all have a practical utility. With this practical knowledge, you can greatly enhance the usability of the procedure and then you too will be doing more with MEANS/SUMMARY.

Basic SAS® PROCedures for Producing Quick Results

Kirk Paul Lafler - Software Intelligence Corporation

As IT professionals, saving time is critical. Delivering timely and quality looking reports and information to management, end users, and customers is essential. The SAS System provides numerous "canned" PROCedures for generating quick results to take care of these needs ... and more. Attendees acquire basic insights into the power and flexibility offered by SAS PROCedures using PRINT, FORMS, and SQL to produce detail output; FREQ, MEANS, and UNIVARIATE to summarize and create tabular and statistical output; and DATASETS to manage data libraries. Additional topics include techniques for informing the SAS System which data set to use as input to a procedure, how to subset data using a WHERE statement (or WHERE= data set option), and how to perform BY-group processing.

Unveiling the Power of Cascading Style Sheets (CSS) in ODS

Kevin Smith – SAS Institute

Prior to SAS 9.2, Proc Template was the only choice for writing ODS styles. SAS 9.2 added the capability of writing styles using Cascading Style Sheets (CSS) syntax; however, the accepted CSS syntax was primarily a remapping of Proc Template elements and attributes to CSS classes and properties. While this remapped syntax was a step in the right direction, no new capabilities were added to the styles themselves. CSS support in SAS 9.3 takes a bigger leap towards the power of CSS seen on the Internet today. While most people think of CSS as an HTML-only technology, it is merely a style syntax that can also be applied to other ODS outputs such as PDF, RTF, and ExcelXP. In fact, CSS in SAS 9.3 will work with any of the ODS outputs that use styles. This paper will uncover the hidden power of CSS in SAS 9.3 and show you things that were never before possible in previous versions of SAS as well as directions it will be taking in the future.

Understanding the Anatomy of a SAS Deployment -- What's in My Server Soup?

Donna Bennett – SAS Institute

Do you ever get confused about the pieces of a SAS metadata-based deployment and where they go and what they do? This session will highlight the major components of a metadata-based SAS deployment of solutions and BI and give you an overview of how SAS is standardizing our development and deployment processes to continue to improve integration between SAS software offerings and compatibility between releases.

PROC TABULATE: Getting Started

Art Carpenter - CALOXY

Although PROC TABULATE has been a part of Base SAS® for a very long time, this powerful analytical and reporting procedure is very under-utilized. TABULATE is different; its step statement structure is unlike that of any other procedure. Because the programmer who wishes to learn the procedure must essentially learn a new programming language, one with radically different statement structure than elsewhere within SAS, many do not make the effort. The basic statements will be introduced, and more importantly the introduction will provide a strategy for learning the statement structure. The statement structure relies on building blocks that can be identified and learned individually and in concert with others. Learn how these building blocks form the structure of the statement, how they fit together, and how they are used to design and create the final report.

Type Less, Do More: Have SAS do the typing for you

Jeanina Worden - PPD

Typing less when you're a SAS® programmer seems counterintuitive; however, when repetitive tasks leave you with the realization that only five words differentiate the last twenty lines of code, the concept becomes clearer. There are numerous ways to accomplish these tasks, such as “hardcoding” and copy-and-paste; however, they carry with them increased risk in terms of additional time required for updates and the lack of assurance that all constraints are accounted for. Therefore the most common “go to” solution is the macro; however, that too can quickly result in a hand-cramping amount of code. This paper shows how CALL EXECUTE can instead be used to dynamically code repetitive tasks, populating the required constraints from the actual metadata, ensuring all available constraints are accounted for, reducing the need for updates if the database changes, and doing it all with less coding. For simplicity, PROC PRINT is used in the examples; however, the code can be changed to perform any function where the repeating code differs by a single dataset or variable name.

Protecting Macros and Macro Variables: It Is All About Control

Eric Sun and Art Carpenter - CALOXY

In a regulated environment it is crucially important that we are able to control which version of a macro, and which version of a macro variable, is being used at any given time. As crucial as this is, in the SAS® macro language this is not something that is easily accomplished. We can write an application that calls a macro that we have validated, but the user of the application can write and force the use of their own user-written, un-validated version of that same macro. Values of macro variables that we have populated can be “accidentally” replaced by user-written assignments. How can you guarantee that the end results are accurate if you cannot guarantee that the proper programs have been executed?
Although our tools are limited, there are a few options available that can be used to help us control our macro execution environment. These, along with management programs, can give the application developer better control, and greater protection, during the execution of the application. For a successful macro language application, it is all about CONTROL!!

Potpourri

What’s Hot, What’s Not - Skills for SAS® Professionals

Kirk Paul Lafler - Software Intelligence Corporation
Charles Edwin Shipp - JMP 2 Consulting, Inc.

As a new generation of SAS® user emerges, current and prior generations of users have an extensive array of procedures, programming tools, approaches and techniques to choose from. This presentation identifies and explores the areas that are hot and not-so-hot in the world of the professional SAS user. Topics include Enterprise Guide, PROC SQL, PROC REPORT, PROC FORMAT, Macro Language, ODS, DATA step programming techniques such as arrays and hashing, sasCommunity.org®, LexJansen.com, JMP®, SAS/GRAPH®, SAS/STAT®, and SAS/AF®.

You Could Be a SAS® Nerd If You . . .

Kirk Paul Lafler - Software Intelligence Corporation

Are you a SAS® nerd? The Wiktionary (a wiki-based Open Content dictionary) definition of "nerd" is a person who has good technical or scientific skills, but is generally introspective or introverted. Another definition is a person who is intelligent but socially and physically awkward. Obviously there are many other definitions for "nerd", many of which are associated with derogatory terms or stereotypes. This presentation intentionally focuses not on the negative descriptions, but on the positive aspects and traits many SAS users possess. So let's see how nerdy you actually are using the mostly unscientific, but fun, "Nerd" detector.

Using SAS Help to Validate Attributes

Sadia Abdullah

The SASHELP library contains a group of catalogs and files containing metadata information used to control various aspects of a SAS session. Using this valuable information in SAS programming can lead to robust and dynamic code. This presentation lists the different views in the SASHELP library and describes what kind of information each one of these views holds.
As an example of how the information from SASHELP can be used in day-to-day programming, this presentation will show how SASHELP metadata can be used to validate attributes of an SDTM dataset against its specifications.

The Top 10 Head Scratchers: SAS® Log Messages That Prompt a Call to SAS Technical Support

Kim Wilson – SAS Institute

As a DATA step programmer, you know the sigh of relief that comes when your job finishes and your SAS log is clear of any errors or warnings. When these messages do occur, most of the time they are intuitive enough so that you can move directly to the offending code, make the necessary changes, and complete the job successfully. However, what do you do about those perplexing messages that sometimes appear, making you scratch your head in puzzlement? This paper examines the top 10 notes, errors, or warnings that prompt DATA step programmers to call SAS Technical Support. The discussion addresses the common causes of the messages and provides solutions so that you can reduce your troubleshooting time and effort.

SAS for Performance: An Introductory Tutorial

Michael Welch – SAS Institute

IT professionals need to understand and monitor how well their systems are performing. Whether you are monitoring a single system or server in real time or managing the demands of thousands of virtual servers over much longer time periods, the need to understand the requirements, limitations, and opportunities is critical. This session reviews both the performance of SAS on a single machine and how SAS solutions can manage the performance of all of your IT systems.

CDISC ADaM Application: One-Record-Per-Subject Data That Doesn't Belong in ADSL

Sandra Minjoe - Octagon Research Solutions, Inc.

It can be tempting to push a lot of analysis data into ADSL because of its simple and convenient one-record-per-subject structure. However, ADSL was designed to hold only information used in other analysis datasets, such as population flags, treatment variables, and basic demographics. So where should all the other one-record-per-subject information, such as date of disease progression or total amount of study drug received, go? This paper and presentation will show examples, weigh the pros/cons of different dataset structure options, and help attendees answer this question for their own data.

Staying Relevant in the Ever-Changing Pharmaceutical Industry

Aiming Yang* and Robert Hoffman - Merck & Inc

In recent years, the pharmaceutical industry has undergone dramatic changes. For SAS programming professionals, the challenges include adapting to these changes, remaining effective, and thus staying relevant in the industry. In this paper the authors share some experiences gained in some large pharmaceutical companies. The major thoughts shared in this paper include the following: solid, up-to-date, diverse SAS programming skills and a good understanding of statistics and clinical trials are required skills. In this changing environment, these skills still matter since they are how we are defined by the industry. Additionally, being a sensible, good team member is essential for fulfilling our roles and functions. Finally, the ability to work effectively with cross-functional department personnel and external vendors is a must for experienced programming analysts. These abilities will help us stay relevant amidst the ever-changing processes and trends, and thus define who will stay and thrive within this industry.

A Well Designed Process and QC Tool for Integrated Summary of Safety Reports

H. Chen - Merck Sharp & Dohme Corp., Rahway, NJ

The ISS (Integrated Summary of Safety) is a critical component in submissions for drug approvals in the pharmaceutical industry. This report consists of multiple reports from clinical studies that focus on drug safety and are generally programmed in SAS. ISS uses the relevant data from one or more clinical studies to generate the tables and figures from integrated data. Various methods are used to verify the results found on the ISS reports. One method focuses on whether the ISS analysis results from the integrated data are consistent with the results from each of the individual studies. This paper introduces a well-designed process and validation tool to ensure the output consistency and integrity of the ISS reports with the individual underlying studies.

Reading and Processing Mystery Data Sets

Jimmy DeFoor - Diversant

What are the best coding methods for comparing unknown files with the same layouts? Answer: coding methods which automatically adjust for the number of fields and for different field formats, while also performing the same comparisons regardless of those field formats. This paper discusses one method of comparing unknown files with the same layouts. It uses SAS macro variables, arrays, and the vcolumn view to efficiently process three Credit Bureau files without knowing the names of their variables. The technique uses the CALL SYMPUT routine to create SAS macro variables from the name and type fields in the vcolumn view. Then, it uses a SAS macro to retrieve those macro variables and load them into length statements and SAS arrays. Furthermore, the SAS macro creates new variables that use the old variables as the root for the new variable names, such as Attr46 being used to create Attr46t_tot and Attr46_ck, and then assigns those variables to other SAS arrays in the same relative position as the original variable. This allows the new field to be updated in a do loop when the original field is being investigated by that loop. The SAS techniques used in this paper include macro variable double resolution, macro variable concatenation, SAS variable concatenation, SAS array processing, user formats, CALL SYMPUT and the dictionary content retrieved from the vcolumn view. The program created by the macros reads a consolidated bureau file built from the three bureau files, evaluates all character variables from the three bureaus for their similarity in content, evaluates all numeric variables for their similarity in content, and then sums the findings for each field.

Don’t Gamble with Your Output: How to Use Microsoft Formats with ODS

Cynthia Zender – SAS Institute

Are you frustrated when Excel does not use your SAS formats for number cells? Do you lose leading zeroes on zip codes or ID numbers? Does your character variable turn into a number in Excel? Don’t gamble with your output! Learn how to use the HTMLSTYLE and TAGATTR style attributes to send Microsoft formats from SAS to Excel. This paper provides an overview of how you can use the HTMLSTYLE attribute with HTML-based destinations and the TAGATTR attribute with the TAGSETS.EXCELXP destination to send Microsoft formats from SAS to Excel using ODS STYLE= overrides. Learn how to figure out what Microsoft format to use and how to apply the format appropriately with ODS. A job aid will be included in the paper that lists some of the most common Microsoft formats used for numeric data. The examples in this paper will demonstrate PROC PRINT, PROC REPORT and PROC TABULATE coding techniques.

Top Ten SAS® Sites for Programmers: A Review

Kirk Paul Lafler - Software Intelligence Corporation
Charles Edwin Shipp - JMP 2 Consulting, Inc.

We review the top ten SAS® sites for coders, beginning with sas.com and jmp.com. We then expand to sasCommunity.org, support.sas.com, and six other popular sites that assist you in training and programming. If you use Google to search for SAS Web sites, you will get over a million hits. In this paper, we present the results from an unscientific, but interesting, survey and analysis we conducted about the SAS sites visited by those who answered our survey. From nearly 400 invited to respond, more than 60 SAS users shared their insights, along with comments, for 65 SAS-related websites. Finally, we narrow the list down to ten.

Connect with SAS® Professionals Around the World with LinkedIn and sasCommunity.org

Charles Edwin Shipp - JMP 2 Consulting
Kirk Paul Lafler - Software Intelligence Corporation

Accelerate your career and professional development with LinkedIn and sasCommunity.org. Establish and manage a professional network of trusted contacts, colleagues and experts. These exciting social networking and collaborative online communities enable users to connect with millions of SAS users worldwide, anytime and anywhere. This presentation explores how to create a LinkedIn profile and social networking content, develop a professional network of friends and colleagues, join special-interest groups, access a Wiki-based web site where anyone can add or change content on any page on the web site, share biographical information between both communities using a built-in widget, exchange ideas in Bloggers Corner, view scheduled and unscheduled events, use a built-in search facility to search for desired wiki-content, collaborate on projects and file sharing, read and respond to specific forum topics, and more.

Consulting: Critical Success Factors

Kirk Paul Lafler - Software Intelligence Corporation
Charles Edwin Shipp - JMP 2 Consulting, Inc.

The Internet age has changed the way many companies, and individuals, do business - as well as the type of consultant that is needed. The consultants of today and tomorrow will require different skills than the consultants of yesterday. Today's consultant may just as likely have graduated with an MBA degree as with a technical degree. As a hired adviser to a company, a consultant often tackles a wide variety of business and technical problems and provides solutions for clients. In many cases a consultant chooses this path as an attractive career alternative after toiling in industry, government and/or academia for a number of years. This presentation describes the consulting industry from the perspective of the different types of organizations (e.g., elite, Big Five accounting firms, boutique, IT, and independent) that comprise it. Specific attention will be given to the critical success factors needed by today's and tomorrow's consultant.

Assigning a User-defined Macro to a Function Key

Mary Rosenbloom - Edwards Lifesciences, LLC
Kirk Paul Lafler - Software Intelligence Corporation

Are you entering one or more of the same SAS Display Manager System (DMS) commands repeatedly during a session? The DMS offers a convenient way of capturing and saving frequently entered commands in a user-defined macro, and then assigning the macro to a function key of your choosing. This paper illustrates the purpose and steps one would use to assign a user-defined macro to a function key.

Benefits of sasCommunity.org for JMP® Coders

Charles Edwin Shipp - JMP 2 Consulting
Kirk Paul Lafler - Software Intelligence Corporation

The benefits of sasCommunity.org to SAS® users are available to JMP® users also, but participation has lagged. This is partly due to excellent JMP websites, including their discussion groups. Reasons for increased JMP participation on sasCommunity.org are illustrated and discussed, including the interchange of data, statistics, and graphics between SAS and JMP software. The benefit of a community to have your work known and also to help newer users will become increasingly important as JMP users support sasCommunity.org as its popularity grows.

Best Practices - Clean House to Avoid Hangovers

Mary Rosenbloom - Edwards Lifesciences, LLC
Kirk Paul Lafler - Software Intelligence Corporation

In a production environment, where dozens of programs are run in sequence, often monthly or quarterly, and where logs can span thousands of lines, it’s easy to overlook the small stuff. Maybe a DATA step fails to execute, but a data set of the same name already exists in the temporary library from a previous program. Maybe a global macro variable assignment is missed or fails to execute, but a global macro variable of the same name already exists from a previous program. This can also happen with macros. The list goes on. This paper offers some suggestions for housekeeping steps that can be taken at the end of each SAS program to minimize the chance of a hangover.

Information Reporting & Statistics

An Introductory Tutorial on Mixed Models

Funda Gunes - SAS Institute

Mixed models analysis is one of the cornerstones of modern statistics. It extends the general linear model for independent and equivariant data by allowing a more flexible covariance for the error term. Using mixed models, you can fit models to a variety of data that follow the normal distribution, including repeated measurements and those from a randomized block design. This tutorial introduces the basics of mixed models methodology and shows how to analyze linear mixed models with SAS’s flagship procedure for mixed modeling, the MIXED procedure. Numerous examples are used to illustrate typical applications of the MIXED procedure. This tutorial also includes an overview of other mixed modeling procedures in SAS, giving a brief introduction to analyzing generalized linear models with the GLIMMIX procedure and discussing the scenarios in which you would use the HPMIXED and NLMIXED procedures. Prerequisites are a working knowledge of the general linear model and some basic matrix algebra.

Statistical comparison of relative proportions of bacterial genetic sequences

Jose F. Garcia-Mazcorro, Jan S. Suchodolski, Joerg M. Steiner, and Bradley J. Barney - Texas A&M University

The intestinal tract is inhabited by hundreds of different types of bacteria, which have the potential of enhancing health or disease in the host. Several current technologies are capable of identifying these bacteria by determining the order of nucleotides (sequencing) in their DNA sequence with an unprecedented coverage. These technologies can provide two types of data sets: 1) the raw genetic sequences (not discussed here), and 2) the relative proportions of sequences, which are calculated by dividing the number of sequences obtained from a given bacterial group by the total number of sequences obtained. This dependent variable (relative proportions of sequences) is continuous but constrained between 0 and 100%, and has a nested architecture (bacterial species within a genus within a Family within an Order within a Class within a Phylum). I discuss different alternatives (both parametric and non-parametric) to analyze this data set, with emphasis on the use of SAS 9.2. PROC MIXED can be used but skewed residuals are commonly encountered (data is usually not normally distributed).
PROC GLIMMIX with a beta distribution can also be used; however, the beta distribution assumes that the total proportion of 100 is divided between two groups. The Dirichlet distribution is a generalization of the beta distribution that allows a proportion to be divided between two or more groups, but SAS does not currently provide this option. Further analyses are needed, and are ongoing, to empirically determine the most appropriate statistical method to compare relative proportions of bacterial genetic sequences.

Prediction of Diabetes

Repalli Pardha Saradhi - Oklahoma State University

The main purpose of this paper is to forecast how likely people in different age groups (young, middle-aged, and older) are to be affected by diabetes, based on their daily activities and food habits. It also aims to predict whether an individual is affected by diabetes and, if so, to identify the factors affecting each of the three segments. The statistical techniques used in this paper are segmentation and cluster analysis. The main goal of this presentation is to prepare a customized list of food items to eat that would be useful in avoiding diabetes.

Kass adjustments in decision trees on Binary vs. continuous

Immadi, Manoj Kumar - Oklahoma State University

The paper will explain how the split search algorithm works and how the Kass adjustment is made in order to maximize the independence between the two branches after the split. It is observed that Kass adjustments always improve the independence between the two branches, but there is no clear evidence of how Kass adjustments behave for a binary versus a continuous target variable. After explaining how Kass adjustments are made, my goal is to compare the advantages of Kass adjustments for a binary target variable and a continuous target variable.
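As a rough illustration only, the Kass adjustment is commonly described as a Bonferroni-style correction: the p-value of the best split on an input is multiplied by the number of candidate splits that were searched, which lowers its logworth. The sketch below is written in Python rather than SAS, is not taken from the paper, and shows only the adjustment mechanism, not the binary versus continuous target comparison. The function name and the example numbers are hypothetical.

```python
import math

def kass_adjusted_logworth(p_value, n_candidate_splits):
    """Return the logworth after a Bonferroni-style (Kass) adjustment.

    p_value            -- unadjusted p-value of the best split (hypothetical input)
    n_candidate_splits -- number of split points that were searched
    """
    # Multiply the p-value by the number of comparisons, capped at 1.
    adjusted_p = min(1.0, p_value * n_candidate_splits)
    # Logworth is -log10 of the (adjusted) p-value.
    return -math.log10(adjusted_p)

# A binary input offers one candidate split; an interval input with
# 50 distinct values offers 49 candidate split points.
binary_input = kass_adjusted_logworth(1e-4, 1)
interval_input = kass_adjusted_logworth(1e-4, 49)
print(binary_input, interval_input)
```

With the same raw p-value, the input that was searched over more split points ends up with the smaller adjusted logworth, which is how the adjustment penalizes inputs that had more chances to look significant.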

Use of Decision, Cut-off and SAS code node in SAS Enterprise Miner while scoring to adjust prior probabilities and prediction cutoff for separate sampling

Yogen Shah - Oklahoma State University

When building predictive models for a binary target variable, it is common practice to use a sample whose primary outcome proportion differs from the actual proportion in the population. This kind of separate sampling, or balanced sampling, works effectively when the ratio of primary outcome to secondary outcome is very small. Building a predictive model from such a balanced sample offers advantages such as reduced bias toward a particular outcome and improved performance. Model fit statistics and assessment plots depend heavily on the outcome proportion in the training and validation data sets. As a result, the model cannot predict well when scoring the score data set, because the primary outcome proportion in the scoring data set is similar to that of the population but different from that of the balanced sample. This presentation illustrates the effective use of the Decision, Cut-off and SAS Code nodes in SAS EM to resolve this problem. The Decision node specifies the actual proportion in the population, also known as the prior probabilities, while drawing the sample data set for model building. By default, SAS uses a cut-off value of 0.5 when predicting a binary outcome from predicted probabilities, which means that the chance of the primary outcome is treated as the same as that of the secondary outcome. This does not hold when the proportion of the primary outcome in the population is very small. SAS provides the Cut-off node to adjust this cut-off value based on the model’s ability to predict true positives, false positives and true negatives. We also need to add specific code under the “score” section of the SAS Code node to account for the cut-off value change in the scoring data set.
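To make the idea of adjusting for prior probabilities concrete, here is a minimal sketch of the standard correction for a model trained on a balanced sample: each class probability is reweighted by the ratio of its population proportion to its sample proportion, then renormalized. It is written in Python for illustration and is not the SAS Enterprise Miner node logic or the author's score code. The function names, the 50% sample rate and the 2% population rate are assumptions.

```python
def adjust_for_priors(p_sample, sample_rate, population_rate):
    """Rescale a predicted probability from a balanced sample to the population.

    p_sample        -- predicted probability of the primary outcome from the model
    sample_rate     -- proportion of the primary outcome in the training sample
    population_rate -- proportion of the primary outcome in the population (prior)
    """
    # Weight each class by how under- or over-represented it was in the sample.
    weight_primary = population_rate / sample_rate
    weight_secondary = (1.0 - population_rate) / (1.0 - sample_rate)
    numerator = p_sample * weight_primary
    return numerator / (numerator + (1.0 - p_sample) * weight_secondary)

def classify(p_adjusted, cutoff):
    """Return 1 (primary outcome) when the adjusted probability reaches the cutoff."""
    return 1 if p_adjusted >= cutoff else 0

# Hypothetical case: model trained on a 50/50 sample, true event rate 2%.
p = adjust_for_priors(0.5, sample_rate=0.5, population_rate=0.02)
print(p, classify(p, 0.5), classify(p, 0.01))
```

In this hypothetical case a sample-based probability of 0.5 corresponds to roughly 0.02 in the population, so a 0.5 cut-off would never flag it while a lower cut-off would, which is the motivation for tuning the cut-off after adjusting the priors.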
A SAS Macro Tool for Selecting Differentially Expressed Genes in Microarray Data

Huanying Qin*, Laia Alsina, Hui Xu, Elisa Priest - Baylor Health Care System

DNA microarrays measure the expression of thousands of genes simultaneously. Commercial software such as JMP®/Genomics and GeneSpring® uses t-tests, ANOVA, or mixed models for statistical analysis to identify differentially expressed genes. These methods are valid for larger sample sizes. We work with an immunology laboratory that often needs to analyze data from experiments with fewer than 10 samples. The researchers developed an Excel-based process to select differentially expressed genes from microarray experiments with a small sample size. This process required complex, manual manipulation of data and could take weeks to complete. We created a SAS macro to automate the whole process. The program reads microarray data from, and provides a summary report in, Excel. The researchers can easily modify the parameters and repeat the analysis. The program made it possible to reduce data processing time from weeks to minutes with no mistakes related to manual manipulation. In addition, it provides more output information for further analysis. This paper describes the tool and uses real data to demonstrate that it is valid and efficient.

Evaluation of Promotional Campaigns in the Utilities Industry Using a Transfer Function Time-Series Model

Fujiang Wen - City of Dallas Water Utilities

Promotional campaigns are often used by the utilities industry to increase the total sales level of their products or services. Because the campaigns normally incur expenses, evaluating their effectiveness is a key issue for utilities that want to use their resources effectively. A transfer function time-series model is applied to analyze the direct bill insert advertising campaign promoting a new toilet replacement program for Dallas Water Utilities customers.
The analysis was based on the numbers of customers who participated in the program from August 2007 to September 2010. A point intervention function was used to indicate the three advertising campaigns, and the model was used to quantify the promotional effect. As a result, an exponential transfer function was identified to describe the effect. The study shows that, after a promotional campaign, the number of participating customers increases significantly and then quickly shrinks, following an exponentially decreasing trend. The findings can be used to forecast future demand under proposed promotional campaigns.
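The shape of such an effect can be sketched with a first-order transfer function applied to a point (pulse) intervention: the response jumps by an amount omega in the campaign month and then decays by a factor delta each following month. The Python sketch below is only an illustration of that recursion, not the fitted model from the paper. The campaign months, omega and delta are made-up values.

```python
def pulse_response(n_periods, pulse_times, omega, delta):
    """Response of a first-order transfer function omega / (1 - delta * B) to pulses.

    n_periods   -- number of time periods to simulate
    pulse_times -- set of period indexes in which a campaign occurs (hypothetical)
    omega       -- immediate effect of a campaign
    delta       -- decay rate per period, between 0 and 1
    """
    effect = []
    previous = 0.0
    for t in range(n_periods):
        pulse = 1.0 if t in pulse_times else 0.0
        # Carry over a fraction of last period's effect and add any new pulse.
        current = delta * previous + omega * pulse
        effect.append(current)
        previous = current
    return effect

# Three hypothetical campaigns, each adding 100 participants that halve monthly.
effect = pulse_response(12, pulse_times={2, 6, 10}, omega=100.0, delta=0.5)
print(effect)
```

Each campaign produces a spike followed by a geometric decline, and a later campaign adds to whatever is left of the earlier one.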

Proc Surveyfreq: Why Do a Three Way Table in SAS When We Want Two Way Table Information?

Hemalkumar B. Mehta* and Michael L Johnson - Department of Clinical Sciences and Administration, College of Pharmacy, University of Houston

The Proc Surveyfreq procedure in SAS® has an advantage over Proc Freq in that it incorporates a multi-stage probability sampling design into the analyses. Several nationally representative data sets have a multi-stage probability sampling design. Most of the time we need two way table information for the group of interest, e.g., patients with a certain disease. There are two ways to get group specific results in Proc Surveyfreq: (i) use a “by statement” or (ii) do a “three way tabulation”. A “by statement” will provide group specific results, but it will not give a valid domain analysis and it will not preserve the sampling design. Hence, the results will not be generalizable to the population level. “Three way tables” will provide group specific results with a valid domain analysis while preserving the sampling design. In the current paper, using Medical Expenditure Panel Survey (MEPS) data, we show that three way tables should be used when we need two way table information, primarily for valid domain analysis and for extrapolating results to the population level. This paper can serve as a guide to researchers who deal with single stage or multi-stage probability survey data that use clustering, stratification and weighting.

Advice to Health Services Researchers: Be Cautious Using the “Where” Statement in SAS Programs for Nationally Representative Complex Survey Data

Hemalkumar B. Mehta* and Michael L Johnson - Department of Clinical Sciences and Administration, College of Pharmacy, University of Houston

Health services researchers often conduct research with nationally representative survey data in which participants or patients are not sampled randomly but are sampled using complex stratified multistage probability designs. Such data sets include cluster, strata and weight information, which is essential for extrapolating results to a national level. Several Proc Survey procedures are available in SAS® 9.2 that enable analysis of such data while preserving the complex sampling design and the extrapolation of results. The first step researchers often perform is selection of a population of interest, i.e., selection of participants meeting certain inclusion criteria, from the main data set. This can be accomplished in SAS® using the “where” statement in data steps. However, using the where statement to select a population of interest can defeat the purpose of the sampling design of such data and limits the researcher’s ability to generalize results. In the current paper, using Medical Expenditure Panel Survey (MEPS) data, a nationally representative multistage probability survey data set, we show how to analyze such data while preserving the sampling design and without using the where statement. The principles and techniques explained in this paper can be extended to any other discipline in which the researcher has to deal with complex survey data that involve cluster, strata and weight information in the sampling design.

An Introductory Look at the Situational Context Inherent in the RBI Using Two Modeling Approaches

Ryan Sides - Baylor University

The RBI (run batted in) is a popular statistic in Major League Baseball that is extremely dependent on the situational context (i.e., which bases are occupied by runners along with the number of outs in an inning) experienced by the hitter. This presentation offers insight into how much this situational context affects the RBI, providing two related modeling approaches that account for this information and, thus, an approach for improving a player’s evaluation. The first model used to accomplish this goal is a standard multiple regression model, while the second is an intuitive approach based on the author’s years of experience as a player. Various statistical tools are utilized to check assumptions and to compare the models; further, the resulting statistics are compared to those frequently used in baseball. A discussion of the use of SAS to do this modeling and analysis along with a demonstration of the developed GUI is included.

When It's Not Random Chance: Creating Propensity Scores Using SAS EG

Josie Brunner - Austin Independent School District

While randomized samples are ideal for hypothesis testing, they are not always possible, especially when evaluating programs in which participants select themselves into the treatment or control group. One quasi-experimental design approach is to use propensity scores to match treatment and control units to reduce selection bias in observable pre-treatment characteristics. This presentation will focus on why and when propensity score analysis (PSA) should be included in a research design and will demonstrate how a propensity score can be created very simply using SAS EG 4.3.

A Simulation Study for Power Analysis in a Longitudinal Study Using SAS

Fenghsiu Su, MSBA, Ravi Lingineni, Philamer Atienza, MS, Subash Aryal, PhD, Karan P. Singh, PhD, Sejong Bae, PhD - University of North Texas Health Science Center

Background: In a longitudinal study, it is unlikely that every subject’s information can be obtained at each time point. To study the incomplete data across time and the influence of subjects on their repeated observations as a random effect, mixed-effect regression models (MRMs) can be used. The purpose of this research is to study the power characteristics of the likelihood ratio test for hierarchical correlated data.
Methods: We conducted a simulation study based upon 10,000 replicates of data. The MRM is constructed for different sample sizes in 3 situations: (1) with five time points as a fixed factor and the intercept and trend as random variables, (2) with various time points and fixed variances for large and small sample sizes, and (3) for correlation between time points. By using the likelihood ratio test, we determined an appropriate model to estimate the parameters. Data were based on a random normal distribution with mean 0 and a pre-specified error variance. To simulate realistic missing data, we assumed a 20% drop out rate.
Results: Holding all factors other than the parameter of interest constant in the scenarios above, we observed that the power increases: (1) as the number of time points increases, (2) as the sample size increases, (3) as the variance decreases, and (4) as the correlation between the time points decreases. All results were consistent with previous studies on statistical power characteristics.

Using Proc Logistic, SAS Macros and ODS Output to evaluate the consistency of independent variables during the development of logistic regression models. An example from the retail banking industry

Alexandros Vidras* and David Tysinger - Merkle Inc

Predictive models are used extensively in customer relationship management analytics and data mining to increase the effectiveness of marketing campaigns. Logistic regression remains at the forefront in analytics as the most popular technique used to predict customer behavior. Particularly with direct mail marketing, logistic regression models are built using previous campaigns that span several months in length, posing a major challenge to statisticians: devising a way not only to capture seasonality across these campaigns but also to evaluate the stability of these models. Millions of dollars are spent annually on marketing activities that utilize logistic regression models. Therefore the predictive ability and robustness of logistic models are essential for executing a successful direct mail campaign. This paper shows how Proc Logistic, ODS Output and SAS macros can be used to proactively identify structures in the input data that may affect the stability of logistic regression models and allow for well-informed preemptive adjustments when necessary. We thus introduce a standardized process that industry analysts can use to formally evaluate the impact and statistical significance of predictors within logistic regression models across multiple campaigns and forecasting cycles.

“Let’s Get SAS® To Do It!” Getting Data from SUDAAN® to SAS® to EXCEL®

Anna Vincent - Texas Department of State Health Services (DSHS)

This paper will show that a person can be new to SAS and still create beneficial reports. Armed with the SAS Basic Programming Essentials Training Manual and using ODS and the ExcelXP tagset, data results will be moved from SUDAAN to SAS and finally into EXCEL without touching the data, hard-coding the names of the risk factor variables, or knowing how many values those variables had. The jumbled SUDAAN output will be transformed into something that is informative and easy to read. And when my supervisor wanted charts with confidence intervals, the simple solution was incorporating a chart template into the syntax!