2006 Knoware

Transcript of 2006 Knoware

Page 1: 2006 Knoware

The road to Knoware: 25 years of SAS

Clare Somerville

Mark Houliston

Colin Harris

Brian Bee

Page 2: 2006 Knoware

Knoware

• 5 directors (+ other SAS consultants)

• 4 SAS technologists
  – Brian Bee
  – Clare Somerville
  – Colin Harris
  – Mark Houliston
  – 91 years of SAS experience

• 3 ex-SAS employees
  – 46 years at SAS Institute

Page 3: 2006 Knoware

Programme

• Time line
  – 1980s
  – 1990s
  – 2000s

• Projects
• Trends
• Data Warehousing
• Business Intelligence
• Issues
• Tips and techniques
• Questions and prizes

Page 4: 2006 Knoware

Tips and Questions

Page 5: 2006 Knoware

Did you know…

To print the full day-of-week or month name, use a length of 9 on the WEEKDATE or WORDDATE format

e.g.

data _null_;
  d = '08may2006'd;
  put d weekdate9.;   /* will yield Monday */
  put d worddate9.;   /* will yield May */
run;

Page 6: 2006 Knoware

What is today’s date in SAS format?

Page 7: 2006 Knoware

And the correct answer is…..

17113
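
A SAS date value is the number of days since 1 January 1960, so 17113 corresponds to 8 November 2006 (presumably the day of the talk). A minimal sketch for checking this in the log; the variable names are illustrative only:

```sas
data _null_;
  d = '08nov2006'd;   /* date literal; stored as days since 01JAN1960 */
  put d=;             /* writes d=17113 to the log */
  put d= date9.;      /* writes d=08NOV2006 to the log */
  t = today();        /* the same kind of value for the current date */
  put t= date9.;
run;
```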

Page 8: 2006 Knoware

1970s

Page 9: 2006 Knoware

1970s and Before

1943: I think there is a world market for maybe five computers. (?Thomas Watson of IBM)

1968: What the hell is [a microprocessor] good for? (Robert Lloyd of IBM's Advanced Computing Systems Division)

1977: There is no reason for any individual to have a computer in his home. (Ken Olsen of Digital Equipment)

Page 10: 2006 Knoware

1970s - IT

• Punched cards!

• Mainframes

• Minicomputers

• Introduction of Microcomputers
  – Apple, Commodore

Page 11: 2006 Knoware

1970s - SAS

• 1972 – SAS software developed!

• 1976 – SAS Institute formed
  – Only 1 product – SAS
  – 7 employees

• 1979 – First license outside US
  – Databank in NZ

Page 12: 2006 Knoware

Tips and Questions

Page 13: 2006 Knoware

Did you know…

You can create your own date formats using the picture statement and format directives

Page 14: 2006 Knoware

e.g.

proc format;
  /* directives are case-sensitive: %I hour, %M minute, %S second, %Y 4-digit year */
  picture mydate low-high = '%I:%M:%S:%Y%b%d' (datatype=datetime);
run;

data _null_;
  d = '08nov2006:09:09:34'dt;
  put d mydate.;
run;

will print the date/time as 9:9:34:2006nov8

Page 15: 2006 Knoware

When was the Gregorian calendar system first introduced?

Page 16: 2006 Knoware

And the correct answer is…..

1582 (GB and USA in 1752)

Page 17: 2006 Knoware

1980s

Page 18: 2006 Knoware

1980s

1981: 640K ought to be enough for anybody. (Bill Gates)
  – 1996: I've said some stupid things and some wrong things, but not that. (Bill Gates)

1984: The Macintosh uses an experimental pointing device called a mouse. There is no evidence that people want to use these things. (John Dvorak)

1988: I believe OS/2 is destined to be the most important operating system, and possibly program, of all times. (Bill Gates)

Page 19: 2006 Knoware

1980s - IT

• Mainframes were king
  – Centralised computing
  – Good control, security, backup, performance
  – But truckloads of $

• Government Computer Services (GCS)
  – CCC (Cumberland), VCC (Vogel)

• “Minis” start to feature
  – Digital VAX, Data General, Prime

• IBM PC arrives!

Page 20: 2006 Knoware

1980s - IT

• No PCs – until later on
• No Internet
• No email
• Pen plotters
• Spreadsheets arrive
  – Visicalc -> Lotus 123 -> Excel
  – Even SAS/Calc!
• OS/2 arrives in 1987
• LANs start to appear
• Acoustic couplers for comms

Page 21: 2006 Knoware

1980s - SAS

• 1980
  – New SAS products – SAS/Graph and SAS/ETS
• 1981
  – Brian joins NZ SAS distributor
  – 39 staff world-wide
  – 3 in NZ
  – 1 SAS manual 2.5 cm thick
• 1982
  – Full SAS office
  – First SUNZ conference - 50 attendees
  – Proc GSLIDE for presentations - coding
  – First SAS course
    • Cut-and-paste text and pictures on OHP

Page 22: 2006 Knoware

1980s - SAS

• NZ office first in Asia Pacific – second in world
• Called for people to start SAS Australia, Singapore and Hong Kong
• 1983
  – Colin joins SAS NZ
  – Mark joins SAS Singapore
  – Communications to US is telex
    • Setinits – translate ; to =
  – First non mainframe version – “mini” SAS on VAX
• Weird new concept of windows – to become Display Manager

Page 23: 2006 Knoware

1980s - SAS

• 1985
  – SAS Version 5 was a big deal
  – PC SAS – wow
  – Multi-coloured windows
    • Purple, blue, cyan
    • Well before MS-Windows!
• Proc format
• Weird new Macro concept
• End 80’s – SAS manual now 5 cm thick

Page 24: 2006 Knoware

1980s - BI

• Large OLTP systems
  – Lots of users
  – Lots of small requests
  – Large database of current data

• Not suited to reporting

• Long delays for reports and systems

• Belief - not to duplicate data

• Reporting off source systems

• Batch windows

Page 25: 2006 Knoware

1980s - BI

• Information Dictatorship

• Senior management only

• Executive Information Systems

• Decision Support Systems

Page 26: 2006 Knoware

Project: Treasury Reporting 1982

• Novel approach!

• Typically provide reports to Departments

• Instead, make data and SAS available

• DIY reporting for Departments

• All Departments could access CCC

Page 27: 2006 Knoware

Tips and Questions

Page 28: 2006 Knoware

Did you know…

If you want to control the orientation of your column headings in PROC PRINT, there is a heading= option

e.g.

proc print data=thisfile heading=vertical;

Page 29: 2006 Knoware

How many characters long can a variable label be?

Page 30: 2006 Knoware

And the correct answer is…..

256

Page 31: 2006 Knoware

1990s

Page 32: 2006 Knoware

1990s

1994: I see little commercial potential for the Internet for at least ten years. (Bill Gates)

Page 33: 2006 Knoware

1990s - IT

• More distributed
  – Lots offloaded to minis
  – PCs become practical
  – UNIX starts to grow
  – Backup? Performance? Management?

• Windows 3.1
  – Available in 1992

• Web applications

Page 34: 2006 Knoware

1990s - SAS

• SAS decides OS/2 the way to go – Windows won’t last

• 1990: Client-server with SAS/Connect

• 1993: SAS/EIS

• 1995: Cubes, MDDBs

Page 35: 2006 Knoware

1990s - SAS

• Continues to grow

• SAS UK
  – 22 staff to 250 in 10 years

• Formal data warehouse projects
  – RAF LITs - £600 million!

• SAS Manuals
  – Proc 1&2, Lang, Stats 1&2, Ref
  – 30 cm +
  – 12 boxes floppy disks

Page 36: 2006 Knoware

1990s - BI

• Turn around from the 80s

• Dynamic business environment

• Growing importance of information to support business

• Timely accurate information

• Rise of the Spreadsh*t

Page 37: 2006 Knoware

1990s - BI

• Data warehouses (Exploration)
  – Corporate needs, effort
  – Centrally owned
  – Transaction level data, history

• Data marts
  – Smaller, quicker, cheaper
  – Redundancy
  – Reconciliation
  – Consistency
  – Anarchy

Page 38: 2006 Knoware

“You can catch all the minnows in the ocean and stack them together but they still do not make a whale”

Bill Inmon

Page 39: 2006 Knoware

Project: Health 1990

• CICS, COBOL, DB2

• Performance Indicators
  – 6 data sources

• 100 MB per year 1 system!
  – 2 PCs
  – 3 days to run
  – SAS macros

• SUGI: The Use of SAS Formats and Macros in an Information Delivery System

Page 40: 2006 Knoware

Project: Ministry of Social Development 1995

• 300 GB data

• 3,000 tape cartridges

• Mainframe to Unix

• 60 GB disk!
  – 2 GB file systems

Page 41: 2006 Knoware

Project: Ministry of Social Development 1995

• Standardise the data
  – Examples:
    • 49 files, 4 years, 40 changes
    • 29 files, 162 tapes, 400 million records, 88 GB
  – Reduced to 29 million records, 6 GB

• Proc compare

• SUGI: Loading a Data Warehouse with Legacy Data

Page 42: 2006 Knoware

Tips and Questions

Page 43: 2006 Knoware

Did you know…

The macro function %sysfunc allows you to execute many of the functions that don’t have a macro equivalent, and apply a format to the result

e.g.

title "Result for %sysfunc(today(),worddate9.) %sysfunc(today(),year.)";

Will generate: Result for November 2006

Page 44: 2006 Knoware

Given the following macro code:

%let word1=SUNZ Conference;
%let word2=2006;
%let word3=Intercontinental;
%let &word3=&word2;
%let &word2=&word3;

What will this produce:

%put &word1 &intercontinental;

Page 45: 2006 Knoware

And the correct answer is…..

SUNZ Conference 2006
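
The answer follows from how the macro processor resolves references before it executes each %let. A minimal sketch for tracing this, assuming a fresh SAS session; the SYMBOLGEN option writes each resolution to the log:

```sas
options symbolgen;        /* log how each macro reference resolves */

%let word1=SUNZ Conference;
%let word2=2006;
%let word3=Intercontinental;

/* &word3 resolves first, so this runs as: %let Intercontinental=2006; */
%let &word3=&word2;

%put &word1 &intercontinental;   /* SUNZ Conference 2006 */

options nosymbolgen;
```

The quiz's last statement, %let &word2=&word3;, is left out of the sketch. It resolves to %let 2006=Intercontinental;, which should be rejected because 2006 is not a valid macro variable name, so it does not change the result.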

Page 46: 2006 Knoware

2000s

Page 47: 2006 Knoware

2000s - IT

• Trend to UNIX and Windows servers

• Centralising processing on servers
  – Good control, security, backup, performance!
  – The wheel turns full circle

• Minis start to vanish

• Big swing to Web capability

Page 48: 2006 Knoware

2000s - SAS

• 2000 – Enterprise Guide
  – Move from programming to guided wizards

• 2003 – Version 9
  – Intelligent, scalable
  – Multi-tier architecture
  – Portal approach

• No manuals!
  – Online

Page 49: 2006 Knoware

Project: Ministry of Social Development

• 11 TB online storage

• Web delivery
  – Unlocking information for self-service

• Portal approach
  – Secure delivery of information
  – Customised and flexible

• SEUGI: Lessons Learnt Web Enabling a Large Data Warehouse

Page 50: 2006 Knoware

Projects: Analytical Servers

• Analytical Servers - drivers
  – Historically reporting from operational system or PCs
  – Need to grow from reporting to analytics
  – Need better managed, shared environment

• Analytical environment
  – Unpredictable
  – Resource intensive
  – No disruption to regular system usage

Page 51: 2006 Knoware

Projects: BI and DW Solutions

• Full end to end data warehouse/business intelligence environments
  – ETL or data integration
  – Data storage and management
  – Information delivery

• Integrated toolset
• Operational reporting
• OLAP slice-n-dice
• Portals

Page 52: 2006 Knoware

2000s - BI

• 2001 META Group
  – One third of DW projects = do-overs

• Data warehouse problems
  – Performance, ETL

• Storage needs double annually
• Move away from pushing reports
  – Easy to use
  – Self service
  – Flexible

Page 53: 2006 Knoware

2006 and Beyond

• Gartner CIO Survey
  – BI top technology priority

• ETL to Data Integration
• DW Appliances
• Finer granularity
• Data mining, text mining
• Unstructured data
• Actionable BI
• More data, more information, more quickly

Page 54: 2006 Knoware

Tips and Questions

Page 55: 2006 Knoware

Who is listed in USA Today, October 2006, as #4 in the list of Imaginary Luminaries: the 101 most influential people who never lived?

Page 56: 2006 Knoware

And the correct answer is…..

Page 57: 2006 Knoware

The 101 most influential people who never lived:
IMAGINARY LUMINARIES: famous, yet fictional

1. The Marlboro Man
2. Big Brother
3. King Arthur
4. Santa Claus (St. Nick)
5. Hamlet
6. Dr. Frankenstein's Monster
7. Siegfried
8. Sherlock Holmes
9. Romeo and Juliet
10. Dr. Jekyll and Mr. Hyde

Notable Exception: The Tooth Fairy
Source: USA Today, 17th October 2006

Page 58: 2006 Knoware

Questions

Page 59: 2006 Knoware

Did you know…

When using Proc Tabulate, the default width of the 1st column (the row title column, identifying each row) is ¼ of the linesize. The RTS= option allows you to set your own width.

RTS stands for Row Title Space.

Continued….

Page 60: 2006 Knoware

e.g.

Linesize is 100 therefore the default first column will be 25 chars wide.

PROC TABULATE data=mytabfile;

table region,amount / RTS=12;

etc…

This will reset the width to 12 chars. Note that the 12 includes the two characters used to define the left and right box borders, so only 10 characters remain for your value label!

Page 61: 2006 Knoware

In Proc Tabulate, which option allows you to define text for the cell in the top left corner of the table?

Page 62: 2006 Knoware

And the correct answer is…..

Box=
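
A minimal sketch of BOX= in use, combined with RTS=. It assumes the SASHELP.CLASS sample data set is available; the box text 'Sex' is just an illustrative label:

```sas
proc tabulate data=sashelp.class;
  class sex;                 /* row dimension */
  var height;                /* analysis variable */
  table sex, height*mean
        / box='Sex'          /* text for the top left corner cell */
          rts=12;            /* row title space, including the borders */
run;
```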

Page 63: 2006 Knoware

Did you know…

The FORMAT procedure and the PUT function can combine to make a very efficient filter

e.g.

proc format;
  value filter 14-high='old';
run;

proc print data=sashelp.class;
  where put(age,filter.)='old';
run;

Page 64: 2006 Knoware

Which function is used to convert a character value to numeric?

Page 65: 2006 Knoware

And the correct answer is…..

INPUT
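
A minimal sketch of the conversion, with illustrative variable names. INPUT reads the character value using an informat, here the standard numeric informat 8.:

```sas
data _null_;
  c = '1234.5';          /* character value */
  n = input(c, 8.);      /* numeric value read with the 8. informat */
  put n=;                /* writes n=1234.5 to the log */
run;
```

PUT goes the other way: it writes a value using a format, and its result is always character.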

Page 66: 2006 Knoware

Bonus Question:

Page 67: 2006 Knoware

What was Clare doing in 1982?

Page 68: 2006 Knoware

The road to Knoware: 25 years of SAS

Clare Somerville

Mark Houliston

Colin Harris

Brian Bee