CS-495/595 Big Data processing concepts (part 2)

Lecture #2

Dr. Chuck Cartledge

28 Jan. 2015

Table of contents

1 Concepts

2 More

3 Messy

4 Correlation

5 Datafication

6 Value

7 Implications

8 Break

9 Risks

10 Control

11 Assignment

12 Conclusion

13 References

Another view of Big Data

A nexus

Things that can only be done at large scale (can’t do them at small scale)

Extract new insights

Create new forms of value

Change relationships between markets, organizations, people, and governments

Coles vs. Woolworths in Australia: an analytics firm changed everything [10].

Another view of Big Data

A nexus

Data munging, data wrangling

Cleansing, normalization

Various sources

Various formats

Various qualities

Another view of Big Data

Programming

Amdahl’s law

Partition: C/C++ fork(), Java thread(), C# fork(), MPI_Init(), MPI_Comm_size(), MPI . . .
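Amdahl's law bounds the speedup available from partitioning work across processors. The sketch below is a minimal illustration, assuming a fixed fraction of the runtime parallelizes perfectly and the remainder stays serial; the function name and the sample values are illustrative and do not come from the lecture.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the runtime
    is spread evenly over `workers` and the rest stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Illustrative assumption: 95% of the work can be parallelized.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

With these assumed numbers the speedup can never reach 1/0.05 = 20, however many workers are added, because the serial 5% comes to dominate.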

Another view of Big Data

Distillation of data into information

Intended user

Data visualization

Intended message

Medium

Edward Tufte

Figure: Napoleon’s march into and out of Russia. Charles Joseph Minard (1781–1870) [13]

Another view of Big Data

A Good Read

“Big Data,” by Mayer-Schonberger and Cukier [5]

National bestseller

Popular non-fiction

Easy read

No pictures

No math

Just prose

The basis of today’s lecture. A good book to refer to; not a required text.

Another view of Big Data

Rituals. We all have them.

How many samples is enough?

n=1,000 vs. n=all

Classical (frequentism)

Bayesian

Difference between conventional digital cameras and a Lytro camera

1 Focusing on one plane, vs.

2 Capturing it all and post-processing

Collect all the data and see where it takes you.
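The n=1,000 convention comes from classical sampling theory: for a simple random sample, the margin of error on an estimated proportion depends on the sample size, not on the population size. The sketch below illustrates this under stated assumptions (normal approximation, worst-case p = 0.5, z = 1.96 for roughly 95% confidence); the function name is hypothetical.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a
    proportion estimated from a simple random sample of size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 1_000_000):
        points = 100 * margin_of_error(n)
        print(f"n={n:>9,}: +/- {points:.2f} percentage points")
```

Under these assumptions a sample of 1,000 gives roughly plus or minus 3 percentage points for the whole population, but each subgroup examined has a smaller n and a wider margin, which is one argument for collecting everything.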

How many samples is enough?

Albert-Laszlo Barabasi and cell phones

Looking at how cell phones connect to each other

n ≈ 20% of a European country’s population, over a 4-month period

From a graph theory perspective, the connections created a “small world” graph

Image from [7]

Unexpected insights found by following the data.

Precision vs. accuracy

Precision – closeness of the measurements to each other

Accuracy – closeness to the actual underlying value

With small datasets, the measurements have to be both accurate and precise. Large datasets are more accurate because they contain all of life’s messiness.
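The two definitions above can be made concrete with a small numeric sketch. It assumes accuracy is measured as the distance of the average measurement from the true value, and precision as the spread of the measurements around their own average; the function names and sample readings are hypothetical.

```python
import statistics


def accuracy_error(measurements, true_value):
    """Distance of the average measurement from the true value."""
    return abs(statistics.mean(measurements) - true_value)


def precision_spread(measurements):
    """Spread of the measurements around their own average
    (population standard deviation)."""
    return statistics.pstdev(measurements)


if __name__ == "__main__":
    true_value = 10.0  # hypothetical quantity being measured
    readings = {
        "precise but inaccurate": [12.0, 12.1, 11.9, 12.0],
        "accurate but imprecise": [8.0, 12.0, 9.0, 11.0],
    }
    for name, data in readings.items():
        print(name, accuracy_error(data, true_value), precision_spread(data))
```

The first set of readings is tightly clustered but centered on the wrong value; the second is scattered but averages out to the true value.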

Precision vs. accuracy

Natural language processing

Brown Corpus – 1960 (≈ 1,000,000 curated words)

Google words – 2006 (≈ 1,000,000,000,000 uncurated words)

“Data in the wild.” [3]

“. . . simple models and a lot of data trump more elaborate models based on less data.” [3]

Precision vs. accuracy

Neat and tidy vs. messy

SQL databases (only 5% of data is structured) – neat and tidy

noSQL databases (data modeled other than in a table) – messy

1 Graph

2 Column

3 Document

4 Persistent key-value pair

5 Volatile key-value pair

The world does not live by SQL databases alone. There are some data that are more efficiently modeled by things other than tables.

Predictive analytics

Correlations show what, not why.

How New York City got a handle on exploding manhole covers.

94,000 miles of underground cable

5% laid before 1930

51,000 manhole covers and service boxes

Records from 1880s (messy, messy data)

Look for correlations – 106 found

Trained with data up to 2009

Tested with data from 2009 – 10% of manholes accounted for 44% of problems

Primary indicators – (1) age of the cables, (2) previous problems with that manhole [9].

Predictive analytics

Is this the end of theory as we know it?

“There is now a better way. Petabytes allow us to say: ‘Correlation is enough.’ We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.” [1]

Figure: The Iowa agriculture landscape: Green areas are more productive for soy, corn, and wheat; red are least.

Correlation will tell us what is. Theories will tell us why.

Quantifying the world

When words become data.

Google starts to scan all the books it can (2004)

Scans are turned into words

Google scanned approx. 20 million books (15% of all books) by 2012

Words over time (https://books.google.com/ngrams)

New science of “culturomics” [6]

Undefined words

Some companies have been able to capitalize on the value of words, others haven’t.

Quantifying the world

When location becomes data.

A few lessons from UPS [12].

Fitted vehicles with sensors, GPS, and wireless modules

Applied analytics to the data

Reduced drivers’ routes by 30,000,000 miles

Reduced fuel by 3,000,000 gallons

Reduced carbon-dioxide emissions by 30,000 metric tons

Gave preference to right-hand turns over turns that cross traffic

Improved efficiency and safety

Application of location data and graph theory to compute shortest paths.
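The lecture does not say which routing algorithm UPS uses, so the following is only a sketch of one standard way to compute shortest paths on a graph: Dijkstra's algorithm. The road graph, node names, and mileages are hypothetical.

```python
import heapq


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm.

    `graph` maps each node to a dict of {neighbor: non-negative cost}.
    Returns the cheapest known cost from `source` to every reachable node.
    """
    distances = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale entry: a cheaper route was already found
        for neighbor, cost in graph.get(node, {}).items():
            candidate = dist + cost
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


if __name__ == "__main__":
    # Hypothetical directed road graph; edge costs are miles.
    roads = {
        "depot": {"a": 2.0, "b": 5.0},
        "a": {"b": 1.0, "c": 4.0},
        "b": {"c": 1.0},
        "c": {},
    }
    print(shortest_path_lengths(roads, "depot"))
```

Because the graph is directed, a preference such as favoring right-hand turns could in principle be expressed by giving less desirable maneuvers a higher edge cost; that modeling choice is an assumption here, not something stated in the lecture.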

Luis von Ahn and killing spambots

Wanted to thwart spambots, but allow humans

Something that was HARD for computers, but easy for humans

At 22 came up with the idea for the Completely Automated Public Turing Test to Tell Computers and Humans Apart (CAPTCHA)

At 27, as a PhD, awarded a MacArthur Foundation “genius” award

Designed reCAPTCHA – crowd-sourced character recognition

Minimal human cost, aggregated to very large worth ($750,000,000 per year)

One man’s trash is another man’s treasure.

The “option value” of data.

Information generated for one purpose can be reused. Data moves from primary to secondary uses.

Reuse of search terms – predictive search hints

Reuse of 1890s electrical cable information in NYC

Ways to unleash data’s value:

1 Basic reuse

2 Merging datasets

3 Finding “twofers”

Figure: Image from http://www.cognizant.com/latest-thinking/perspectives/dealing-with-big-data

The ultimate value of data is what one can gain from all the possible ways it can be employed.

The value of open data.

Big companies have the resources and need to address big data. Government is the biggest company, and it can compel others to give it data.

US Government – President Obama directed government to provide data: http://www.data.gov

UK Open Data Initiative – http://data.gov.uk/

European Union – EU Open Data Portal – https://open-data.europa.eu/

Other governmental levels as well

With government acting as the data broker, others can act as visionaries and scientists to provide services, provide products, and create wealth.

What happens when we start to use the crystal ball?

Too few data scientists

We need more of them now [4]

Data is widely available and vitally important

Rise of data brokers, data visionaries, data scientists

Figure: Image from http://blog.sqlauthority.com/2013/10/25/big-data-how-to-become-a-data-scientist-and-learn-data-science-day-19-of-21/

Demise of the “expert” because we let the data lead us. Contradiction.

What happens when we start to use the crystal ball?

Unexpected places where Big Data is used:

Baseball – The book Moneyball

Online education – Coursera reviews where students replay the most

Neonatal care – predictions based on what worked in the past

Ship tracking – MarineTraffic website

Car tracking – Inrix geo-location, 100 million cars in US

We are awash in data. People are using it for all sorts of things.

Break time.

Take about 10 minutes.

Big Brother(s) is watching you.

Each of us has a penumbra of data.

Memphis, TN – Blue Crime Reduction Utilizing Statistical History (CRUSH) [8]

Richmond, VA – correlates types of arrests with when and where [11]

DHS – Future Attribute Screening Technology (FAST)/Passive Methods for Precision Behavioral Screening “. . . develop physiological and behavioral screening technologies . . . ”

“I’m placing you under arrest for the future murder of Sarah Marks, that was to take place today . . . ” [2]

Big Brother(s) is watching you.

The dark side of Big Data.

“. . . big data allows for more surveillance of our lives while it makes some of the legal means for protecting privacy largely obsolete. It also renders ineffective the core technical method of preserving anonymity. . . . ” [5]

Big data is powerful. It is seductive. “. . . the possession of great power necessarily implies great responsibility.”

. . . what’s past is prologue . . . – Shakespeare, “The Tempest”

What is past?

Gutenberg is credited with creating movable type

Created an explosion of books and literacy

Gave rise to copyright and protections

Big Data is an explosion

Need to create protections

America Online released “anonymized” search logs

Data available 4 Aug 2006, removed 7 Aug 2006

Big Data techniques (not by that name) identified users

A single thread is weak. A rope made from many threads is strong.

. . . what’s past is prologue . . . – Shakespeare, “The Tempest”

Protections

The internet “never forgets”

Interest in the “right to forget”

How to mandate forgetfulness?

How to ensure forgetfulness?

People evolve and change. They should be judged by what they do, not by what their past predicts they will do.

A “Hello World” level problem.

With a little license.

A simply stated problem: Count the number of unique words in Shakespeare’s Macbeth.

A few Java classes

A Hadoop environment

Process strings from a file

Summarize the results

Grad students have a little more to do.
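The assignment itself calls for Java classes in a Hadoop environment. As a language-neutral illustration of the underlying idea only, the sketch below mimics the map and reduce steps of a word count in plain Python. It does not use any Hadoop API, and the tokenization rule and function names are simplifying assumptions.

```python
import re
from collections import Counter
from itertools import chain

# Assumed tokenization: lowercase letters, with internal apostrophes allowed.
WORD = re.compile(r"[a-z]+(?:'[a-z]+)*")


def map_line(line):
    """Map step: emit a (word, 1) pair for each word in a line."""
    return [(word, 1) for word in WORD.findall(line.lower())]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def count_words(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    return reduce_counts(chain.from_iterable(map_line(line) for line in lines))


if __name__ == "__main__":
    sample = [
        "Tomorrow, and tomorrow, and tomorrow,",
        "Creeps in this petty pace from day to day,",
    ]
    totals = count_words(sample)
    print(len(totals), "unique words")
    print(totals.most_common(3))
```

The number of unique words is the number of distinct keys after the reduce step. In a real Hadoop job the framework, rather than a single in-memory counter, groups the emitted pairs by key across machines.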

What have we covered?

“Big data is about what, not why. We don’t always need to know the cause of a phenomenon; rather, we can let the data speak for itself.”

“Yet the most important reason for the program’s success was that it dispensed with reliance on causation in favor of correlation.”

Let the data guide you, rather than “knowing” what the answer is and finding data to support your assumptions.

Next lecture: Hadoop book, Chapters 1, 2, and 5

References I

[1] Chris Anderson, The end of theory, Wired magazine 16 (2008), no. 7, 16–07.

[2] Steven Spielberg (director), The minority report, Twentieth Century Fox Film Corporation, 2002.

[3] Alon Halevy, Peter Norvig, and Fernando Pereira, The unreasonable effectiveness of data, Intelligent Systems, IEEE 24 (2009), no. 2, 8–12.

[4] James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela H Byers, Big data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute (2011).

References II

[5] Viktor Mayer-Schonberger and Kenneth Cukier, Big data: A revolution that will transform how we live, work, and think, Houghton Mifflin Harcourt, 2013.

[6] Jean-Baptiste Michel, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K Gray, Joseph P Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, et al., Quantitative analysis of culture using millions of digitized books, Science 331 (2011), no. 6014, 176–182.

[7] J-P Onnela, Jari Saramaki, Jorkki Hyvonen, Gyorgy Szabo, David Lazer, Kimmo Kaski, Janos Kertesz, and A-L Barabasi, Structure and tie strengths in mobile communication networks, Proceedings of the National Academy of Sciences 104 (2007), no. 18, 7332–7336.

References III

[8] Chris Peck, Blue crush controversy, http://www.commercialappeal.com/opinion/blue-crush-controversy, 2013.

[9] Cynthia Rudin, David Waltz, Roger N Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Maggie Chow, Haimonti Dutta, Philip N Gross, Bert Huang, Steve Ierome, et al., Machine learning for the New York City power grid, Pattern Analysis and Machine Intelligence, IEEE Transactions on 34 (2012), no. 2, 328–345.

[10] Mercedes Ruehl, Coles, woolies and the big data arms race, http://www.brw.com.au/p/tech-gadgets/coles_woolies_and_the_big_data_arms_4I2P2oieDKZGdev5aY778H, 2013.

References IV

[11] Informs Staff, Police department wins Gartner’s 2007 BI excellence award, http://www.informationweek.com/software/information-management/police-department-wins-gartners-2007-bi-excellence-award/d/d-id/1053178?, 2007.

[12] Informs Staff, UPS wins Gartner BI excellence award, https://www.informs.org/Announcements/UPS-wins-Gartner-BI-Excellence-Award, 2011.

[13] Wikipedia, Charles Joseph Minard, Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Charles_Joseph_Minard, 2014.