Data Alchemy: Turn your Data into Gold

Data Alchemy Turn your Data into Gold CEO, dkd Internet Service GmbH Søren Schaffstein

Transcript of Data Alchemy: Turn your Data into Gold

Page 1: Data Alchemy: Turn your Data into Gold

Data Alchemy: Turn your Data into Gold

Søren Schaffstein, CEO, dkd Internet Service GmbH

Page 2: Data Alchemy: Turn your Data into Gold

Disclaimer

Page 3: Data Alchemy: Turn your Data into Gold

417 Expectation Failed

The server cannot meet the requirements of the Expect request-header field.

Page 4: Data Alchemy: Turn your Data into Gold

Expectation Failed

• Statistics lesson

• Cat pictures

• Bullet lists

• Overview of useful tools for data analysis and manipulation

• Very easy code examples

• Real world applications

Page 5: Data Alchemy: Turn your Data into Gold

Slideshare

Page 6: Data Alchemy: Turn your Data into Gold

References & Credits

Page 7: Data Alchemy: Turn your Data into Gold

Me, myself and I

Søren Schaffstein, CEO of dkd Internet Service GmbH

[email protected]

Page 8: Data Alchemy: Turn your Data into Gold
Page 9: Data Alchemy: Turn your Data into Gold
Page 10: Data Alchemy: Turn your Data into Gold
Page 11: Data Alchemy: Turn your Data into Gold
Page 12: Data Alchemy: Turn your Data into Gold
Page 13: Data Alchemy: Turn your Data into Gold

Where’s the Data?

Close your eyes and imagine…

Page 14: Data Alchemy: Turn your Data into Gold

Accounting

Page 15: Data Alchemy: Turn your Data into Gold

Time Records

Page 16: Data Alchemy: Turn your Data into Gold

Estimations

Page 17: Data Alchemy: Turn your Data into Gold

Product Specifications

Page 18: Data Alchemy: Turn your Data into Gold
Page 19: Data Alchemy: Turn your Data into Gold

Which tools do you use?

Getting data into shape

Page 20: Data Alchemy: Turn your Data into Gold

There’s always

Page 21: Data Alchemy: Turn your Data into Gold

Excel has its limitations

Worksheet size limit: 1,048,576 rows by 16,384 columns

Microsoft keeps a dedicated page with the limits of Excel. This page has nearly 90 entries.

Page 22: Data Alchemy: Turn your Data into Gold

What is R?

18th letter of the modern English alphabet

Page 23: Data Alchemy: Turn your Data into Gold

TO ERR IS HUMAN,

TO ARR IS PIRATE

Page 24: Data Alchemy: Turn your Data into Gold

R

Page 25: Data Alchemy: Turn your Data into Gold

»R is a language and environment for statistical computing and graphics.«

– https://www.r-project.org/about.html

Page 26: Data Alchemy: Turn your Data into Gold

Starting up R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

Page 27: Data Alchemy: Turn your Data into Gold

Without the welcome message

>

Page 28: Data Alchemy: Turn your Data into Gold

Arithmetics

> 25 + 17
[1] 42

Page 29: Data Alchemy: Turn your Data into Gold

Fun starts with vectors

> x <- c(1, 2, 3)

> y <- c(10, 20, 30)

> x + y
[1] 11 22 33

> sum(x)
[1] 6

Page 30: Data Alchemy: Turn your Data into Gold

Let’s work with more data

> elements
  element symbol protons weight density
1    Gold     Au      79 196.97   19300
2  Silver     Ag      47 107.87   10490
3 Mercury     Hg      80 200.59   13534
4    Lead     Pb      82 207.20   11340

> max(elements$density)
[1] 19300

> elements$density2 <- elements$density / 1000

density2
  19.300
  10.490
  13.534
  11.340
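The slide prints `elements` without showing where it comes from. A minimal sketch of one way to build it by hand from the values shown above: the name `densest` is introduced here only for illustration, and reading `density` as kg/m³ (so that `density2` would be g/cm³) is an assumption, since the slide gives no units.

```r
# Rebuild the table from the slide; stringsAsFactors = FALSE keeps names as text
elements <- data.frame(
  element = c("Gold", "Silver", "Mercury", "Lead"),
  symbol  = c("Au", "Ag", "Hg", "Pb"),
  protons = c(79, 47, 80, 82),
  weight  = c(196.97, 107.87, 200.59, 207.20),
  density = c(19300, 10490, 13534, 11340),
  stringsAsFactors = FALSE
)

# Name of the element with the highest density (illustrative helper variable)
densest <- elements$element[which.max(elements$density)]

# Derived column, computed for all rows at once as on the slide
elements$density2 <- elements$density / 1000
```

Because arithmetic on a column is vectorised, the division applies to every row without an explicit loop.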

Page 31: Data Alchemy: Turn your Data into Gold

Geiger Counter

Page 32: Data Alchemy: Turn your Data into Gold

Libelium Geiger Counter Shield

Page 33: Data Alchemy: Turn your Data into Gold

Libelium Geiger Counter Shield

• Counts pulses detected in tube

• Calculates “counts per minute”

• Calculates radiation dose in μSv/h

• Logs reading every 10 seconds

• 8,640 value pairs per 24h

Page 34: Data Alchemy: Turn your Data into Gold

Some Geiger Counter records

> tail(geiger)
     Timestamp CPM    uSv            DateTime
432 1477302659  48 0.3898 2016-10-24 11:50:59
433 1477302669  24 0.1949 2016-10-24 11:51:09
434 1477302679  30 0.2436 2016-10-24 11:51:19
435 1477302689  36 0.2923 2016-10-24 11:51:29
436 1477302699  30 0.2436 2016-10-24 11:51:39
437 1477302709  18 0.1462 2016-10-24 11:51:49

> mean(geiger$CPM)
[1] 30.94737
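The columns in these records are related in ways the slide does not spell out, so here is a hedged sketch using only the six rows shown. The time zone `Europe/Berlin` and the factor `0.00812` µSv/h per CPM are assumptions inferred from the printed values, not something the talk states. The names `geiger_tail`, `usv_per_cpm`, `dose`, `readings_per_day` and `days_to_fill_sheet` are made up for this example.

```r
# The six records printed by tail(geiger) on the slide
geiger_tail <- data.frame(
  Timestamp = c(1477302659, 1477302669, 1477302679,
                1477302689, 1477302699, 1477302709),
  CPM       = c(48, 24, 30, 36, 30, 18),
  uSv       = c(0.3898, 0.1949, 0.2436, 0.2923, 0.2436, 0.1462),
  row.names = 432:437
)

# Unix timestamp -> local date and time (time zone is an assumption)
geiger_tail$DateTime <- as.POSIXct(geiger_tail$Timestamp,
                                   origin = "1970-01-01",
                                   tz = "Europe/Berlin")

# Dose derived from counts per minute (factor inferred from the records)
usv_per_cpm <- 0.00812
dose <- round(geiger_tail$CPM * usv_per_cpm, 4)

# One reading every 10 seconds gives the 8,640 value pairs per 24h
readings_per_day <- 24 * 60 * 60 / 10

# Days until one Excel worksheet (1,048,576 rows) would be full
days_to_fill_sheet <- 1048576 / readings_per_day
```

At this logging rate a single sensor would fill an Excel worksheet in roughly four months, which is one way to read the earlier slide on Excel's limits.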

Page 35: Data Alchemy: Turn your Data into Gold

Select your preferred records

> subset(geiger, geiger$CPM > 60)
     Timestamp CPM    uSv            DateTime
15  1477298489  66 0.5359 2016-10-24 10:41:29
207 1477300409  66 0.5359 2016-10-24 11:13:29
379 1477302129  66 0.5359 2016-10-24 11:42:09
431 1477302649  66 0.5359 2016-10-24 11:50:49
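The same selection can be written in a few equivalent ways. The sketch below uses only the ten records visible on the two slides, not the full data set, and the names `geiger_sample`, `high_a`, `high_b` and `high_cpm` are invented for illustration.

```r
# Ten records copied from the slides (four from subset(), six from tail())
geiger_sample <- data.frame(
  Timestamp = c(1477298489, 1477300409, 1477302129, 1477302649,
                1477302659, 1477302669, 1477302679, 1477302689,
                1477302699, 1477302709),
  CPM       = c(66, 66, 66, 66, 48, 24, 30, 36, 30, 18),
  row.names = c(15L, 207L, 379L, 431:437)
)

# Inside subset() a column can be referred to by its bare name
high_a <- subset(geiger_sample, CPM > 60)

# Equivalent selection with a logical vector in bracket indexing
high_b <- geiger_sample[geiger_sample$CPM > 60, ]

# subset() can also restrict the columns in the same call
high_cpm <- subset(geiger_sample, CPM > 60, select = CPM)
```

The bare-name form is convenient at the console; bracket indexing is often preferred inside functions, where the non-standard evaluation of `subset()` can be surprising.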

Page 36: Data Alchemy: Turn your Data into Gold

Graphics, of course

Page 37: Data Alchemy: Turn your Data into Gold

Graphics, of course

Page 38: Data Alchemy: Turn your Data into Gold

Cool in R

Page 39: Data Alchemy: Turn your Data into Gold

Package Repository

Page 40: Data Alchemy: Turn your Data into Gold

It’s quite fast

Page 41: Data Alchemy: Turn your Data into Gold

Can process a lot of data

Page 42: Data Alchemy: Turn your Data into Gold
Page 43: Data Alchemy: Turn your Data into Gold

Comparing Excel to R

Excel R

Page 44: Data Alchemy: Turn your Data into Gold

But this is only the beginning…

Page 45: Data Alchemy: Turn your Data into Gold

Let’s add more Tools

Did anyone say “Swiss Army Knife”?

Page 46: Data Alchemy: Turn your Data into Gold

RStudio

Page 47: Data Alchemy: Turn your Data into Gold

Debugging Tools

Console

Syntax-highlighting editor

Direct code execution in editor

File Browser

Package Manager

Page 48: Data Alchemy: Turn your Data into Gold

OpenCPU

Page 49: Data Alchemy: Turn your Data into Gold

»The OpenCPU server provides a reliable and interoperable HTTP API for data analysis based on R.«

– https://www.opencpu.org/

Page 50: Data Alchemy: Turn your Data into Gold

Access R via JavaScript

// perform request to OpenCPU
var req = ocpu.call("getOvertime", {
    username : username
}, function(output) {
    // show the message returned by the R function
    $("#output").text(output.message);
});
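The JavaScript above calls an R function named `getOvertime` with a `username` argument and reads a `message` field from the result, but the R side is not shown in the slides. A purely hypothetical sketch of what such a function could look like follows. The lookup table and its values are made up for illustration and stand in for whatever time-tracking source the real function reads.

```r
# Hypothetical R function of the kind an OpenCPU-served package might export.
# The overtime table below is invented sample data, not taken from the talk.
getOvertime <- function(username) {
  overtime <- c(alice = 12.5, bob = -3)

  # Fail loudly for unknown users so the caller receives an error
  if (!username %in% names(overtime)) {
    stop("unknown user: ", username)
  }

  hours <- overtime[[username]]

  # A named list serialises to a JSON object with a `message` field
  list(message = sprintf("%s has %.1f hours of overtime", username, hours))
}
```

Called directly in R, `getOvertime("alice")$message` yields the text that the frontend would place into `#output`.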

Page 51: Data Alchemy: Turn your Data into Gold
Page 52: Data Alchemy: Turn your Data into Gold
Page 53: Data Alchemy: Turn your Data into Gold

LaTeX

Page 54: Data Alchemy: Turn your Data into Gold

pronounced: /ˈlɑːtɛk/ LAH-tek

Page 55: Data Alchemy: Turn your Data into Gold

LaTeX – A document preparation system

LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation.

– www.latex-project.org

Page 56: Data Alchemy: Turn your Data into Gold

A very short introduction to LaTeX

Like in a nutshell

Page 57: Data Alchemy: Turn your Data into Gold

How to create a PDF with LaTeX?

Page 58: Data Alchemy: Turn your Data into Gold

A really simple LaTeX document

\documentclass{report}

\begin{document}

A Long-expected Party

\end{document}

Page 59: Data Alchemy: Turn your Data into Gold

A simple LaTeX document

\documentclass{report}

\begin{document}

\chapter{A Long-expected Party}

When Mr. Bilbo Baggins of Bag End announced that he would shortly be celebrating his eleventy-first birthday with a party of special magnificence, there was much talk and excitement in Hobbiton.

\end{document}

Page 60: Data Alchemy: Turn your Data into Gold

A still simple LaTeX document

\documentclass{dkd-technical-documentation}

\begin{document}

\chapter{A Long-expected Party}

When Mr. Bilbo Baggins of Bag End announced that he would shortly be celebrating his eleventy-first birthday with a party of special magnificence, there was much talk and excitement in Hobbiton.

\end{document}

Page 61: Data Alchemy: Turn your Data into Gold

Why LaTeX and R?

• LaTeX seamlessly integrates with R

• Place R-Variables directly in LaTeX-Templates

• Write R-Code in LaTeX-Templates
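One mechanism that matches these bullets is Sweave, which ships with R in the `utils` package; knitr is another option. Whether the talk used either is an assumption. The sketch below writes a tiny template to a temporary directory, with an R chunk between `<<>>=` and `@` and an inline `\Sexpr{}` for a variable, then weaves it into a `.tex` file. The file name `report` and the variable `gold_density` are invented for this example.

```r
# Paths for the template and the generated LaTeX file
rnw <- file.path(tempdir(), "report.Rnw")
tex <- file.path(tempdir(), "report.tex")

# A minimal Sweave template: LaTeX plus one R chunk and one inline expression
writeLines(c(
  "\\documentclass{report}",
  "\\begin{document}",
  "<<echo=FALSE>>=",
  "gold_density <- 19300",
  "@",
  "Gold has a density of \\Sexpr{gold_density}.",
  "\\end{document}"
), rnw)

# Run the R code in the template and substitute the inline expression
utils::Sweave(rnw, output = tex, quiet = TRUE)

# The woven file is plain LaTeX, ready for a LaTeX compiler
tex_lines <- readLines(tex)
```

The resulting `report.tex` contains the sentence with the number filled in and can be turned into a PDF with any LaTeX toolchain.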

Page 62: Data Alchemy: Turn your Data into Gold

Freeboard

All on one dashboard

Page 63: Data Alchemy: Turn your Data into Gold
Page 64: Data Alchemy: Turn your Data into Gold

The Big Picture

How everything clicks together

Page 65: Data Alchemy: Turn your Data into Gold

[Diagram: Data Sources (CSV, XML, Excel, JSON); R, with RStudio; OpenCPU REST-API; Web Frontend; Target Format (CSV, XML, Excel, JSON, LaTeX, PDF); Other tools]

Page 66: Data Alchemy: Turn your Data into Gold

Real Stuff

OK, until now that was quite some theoretical stuff

Page 67: Data Alchemy: Turn your Data into Gold

Client Report

Overview of a client’s revenues and projects

Page 68: Data Alchemy: Turn your Data into Gold
Page 69: Data Alchemy: Turn your Data into Gold
Page 70: Data Alchemy: Turn your Data into Gold
Page 71: Data Alchemy: Turn your Data into Gold
Page 72: Data Alchemy: Turn your Data into Gold
Page 73: Data Alchemy: Turn your Data into Gold

Timesheet

Special timesheet for an EU project

Page 74: Data Alchemy: Turn your Data into Gold
Page 75: Data Alchemy: Turn your Data into Gold

Uncharged Services

An automated monthly report

Page 76: Data Alchemy: Turn your Data into Gold
Page 77: Data Alchemy: Turn your Data into Gold
Page 78: Data Alchemy: Turn your Data into Gold

Summing it up

Page 79: Data Alchemy: Turn your Data into Gold

What to take home

Page 80: Data Alchemy: Turn your Data into Gold

Versatile Tool

Page 81: Data Alchemy: Turn your Data into Gold

Connecting Tools

Page 82: Data Alchemy: Turn your Data into Gold

dkd offers R Alchemy

Page 83: Data Alchemy: Turn your Data into Gold

Questions

Page 84: Data Alchemy: Turn your Data into Gold

The one question

Can you turn lead into gold?

Page 85: Data Alchemy: Turn your Data into Gold

Chrysopoeia: yes, we can

Chrysopoeia, the artificial production of gold, is the symbolic goal of alchemists. Such transmutation is possible in particle accelerators or nuclear reactors, although the production cost is currently many times the market price of gold.

Page 86: Data Alchemy: Turn your Data into Gold

Thank you for listening

Page 87: Data Alchemy: Turn your Data into Gold

www.dkd.de

dkd Internet Service GmbH . Kaiserstraße 73 . 60329 Frankfurt am Main

References

“Data Alchemy: Turn your Data into Gold”

Author: Søren Schaffstein

Date: 2016-10-27

References

[8] Dynamic Periodic Table. URL: http://www.ptable.com/ (visited on 10/24/2016).

[9] Excel specifications and limits - Excel. URL: https://support.office.com/en-US/article/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3 (visited on 10/19/2016).

[13] freeboard - Dashboards For the Internet Of Things. URL: https://freeboard.io/ (visited on 10/24/2016).

[19] Libelium. Geiger Counter - Radiation Sensor Board for Arduino and Raspberry Pi. URL: https://www.cooking-hacks.com/documentation/tutorials/geiger-counter-radiation-sensor-board-arduino-raspberry-pi-tutorial (visited on 10/24/2016).

[25] R (programming language). Page Version ID: 744362604. Oct. 14, 2016. URL: https://en.wikipedia.org/w/index.php?title=R_(programming_language)&oldid=744362604 (visited on 10/24/2016).

[41] Synthesis of precious metals. Page Version ID: 744728054. Oct. 17, 2016. URL: https://en.wikipedia.org/w/index.php?title=Synthesis_of_precious_metals&oldid=744728054 (visited on 10/18/2016).

Images

[1] Alan Levine. Network | Electric junctions on the Pine AZ Senior Thrift Ce… (Slide: Connecting Tools). URL: https://www.flickr.com/photos/cogdog/4317096083/ (visited on 10/26/2016).

[2] Alexander. pocket army knife (Slide: Versatile Tool). URL: https://de.fotolia.com/id/956509.

[3] Andras Kovacs. Freepik | gold Photo (Slide: Chrysopoeia: yes, we can). URL: http://www.freepik.com/index.php?goto=41&idd=39995&url=aHR0cDovL3d3dy5zeGMuaHUvcGhvdG8vMjI3OTg3# (visited on 10/24/2016).


Page 88: Data Alchemy: Turn your Data into Gold

[4] Andreia. 700 years (Slide: Package Repository). URL: https://www.flickr.com/photos/iseecat/15045038073/ (visited on 10/24/2016).

[5] Brian Shamblen. 24 Hours of LeMons (Slide: It’s quite fast). URL: https://www.flickr.com/photos/23972840@N04/11283383556/ (visited on 10/24/2016).

[6] Brooke’s Bargains. Fisher-Price Handy Manny Ripp Chain Saw (Slide: Comparing Excel to R). URL: http://www.brookesbargains.com/handy-manny-rip-chainsaw-and-flicker-flashlight-only-4-99-each/.

[7] Dan Backman. Apothecary | Aria, North Beach, SF (Slide: dkd offers R Alchemy). URL: https://www.flickr.com/photos/dbackmansfo/4716004445/ (visited on 10/26/2016).

[10] eynermedia. Chemistry | Alchemist laboratory interior (Slide: Presentation Title). URL: https://www.flickr.com/photos/89228431@N06/11080396405/ (visited on 10/24/2016).

[11] frau-Vogel. jemand hatte ... (Slide: The one question). URL: https://www.flickr.com/photos/frau-vogel/3773399470 (visited on 10/24/2016).

[12] Freeboard. freeboard - Dashboards For the Internet Of Things (Slide: Freeboard). URL: https://freeboard.io/ (visited on 10/24/2016).

[14] Freepik. Blue alchemy symbols badges (Alchemy symbols in background). URL: http://www.freepik.com/free-vector/blue-alchemy-symbols-badges_850849.htm (visited on 10/12/2016).

[15] Freepik. Hand drawn alchemy elements (Alchemy symbols in background). URL: http://www.freepik.com/free-vector/hand-drawn-alchemy-elements_849702.htm (visited on 10/24/2016).

[16] Henry Burrows. Ulsan Express (Slide: Can process a lot of data). URL: https://www.flickr.com/photos/foilman/15587276942/ (visited on 10/24/2016).

[17] Joan Kimball. Tasche Uhren in einem Bündel Stockfoto (Slide: Time Records). URL: http://www.istockphoto.com/de/foto/tasche-uhren-in-einem-b%C3%BCndel-gm110908324-1782503?st=_p_1782503.

[18] Joshua Lyon. Microsoft Excel Error Message (Slide: Excel has its limitations). URL: http://boshdirect.com/Blogs/Tech/microsoft-excel-error-message.html (visited on 10/19/2016).

[20] Libelium. Geiger Counter - Radiation Sensor Board for Arduino (Slide: Libelium Geiger Counter Shield). URL: https://www.cooking-hacks.com/documentation/tutorials/geiger-counter-radiation-sensor-board-arduino-raspberry-pi-tutorial (visited on 10/24/2016).

[21] Mathias Pastwa. social network hub (Slide: Slideshare). URL: https://www.flickr.com/photos/mpastwa/2671066786/ (visited on 10/12/2016).


Page 89: Data Alchemy: Turn your Data into Gold

[22] MS880 - Top-Modell: Imposante 6,4kW-Hochleistungssäge (Slide: Comparing Excel to R). URL: http://www.stihl.de/produkt.aspx?idModel=658&idMarketingGroup=1582&realurl=/STIHL-Produkte/Motors%C3%A4gen-und-Kettens%C3%A4gen/S%C3%A4gen-f%C3%BCr-die-Forstwirtschaft/2658-1582/MS-880.aspx (visited on 10/19/2016).

[23] NASA Goddard Space Flight Center. Antares Rocket With Cygnus Spacecraft Launches (Slide: This is only the beginning!). URL: https://www.flickr.com/photos/gsfc/9807812154 (visited on 11/27/2014).

[24] Quinn Dombrowski. Beer sampler (Slide: Questions). URL: https://www.flickr.com/photos/quinndombrowski/5200218267/ (visited on 10/24/2016).

[26] Rafael Araujo. Rafael Araujo - Calculation 20 (Slide: Product Specifications). Sept. 10, 2014. URL: https://www.flickr.com/photos/eager/15009713669/ (visited on 10/19/2016).

[27] Rob Shenk. Tools of the Trade (Slide: RStudio). URL: https://www.flickr.com/photos/rcsj/8060829057/ (visited on 10/24/2016).

[28] Søren Schaffstein. Climbing (Slide: Climbing).

[29] Søren Schaffstein. Donut (Slide: Photography).

[30] Søren Schaffstein. Glass of Sweets (Slide: Estimations).

[31] Søren Schaffstein. Gold Coins (Slide: Accounting).

[32] Søren Schaffstein. Half-Pint Heroes (Slide: Board Game Author).

[33] Søren Schaffstein. Illuminated Bicycle (Slide: Travelling).

[34] Søren Schaffstein. Pizza (Slide: About dkd).

[35] Søren Schaffstein. Protected Place (Slide: Disclaimer).

[36] Sebastiaan ter Burg. Left channel mono - Stereo - right channel mono selector p… (Slide: OpenCPU). URL: https://www.flickr.com/photos/ter-burg/14831362160/ (visited on 10/24/2016).

[37] Staffan Scherz. Art of Transportation (Slide: What to take home). URL: https://www.flickr.com/photos/staffanscherz/6161284551/ (visited on 10/22/2016).

[38] Steve Jurvetson. Civil Defense (Slide: Geiger Counter). URL: https://www.flickr.com/photos/jurvetson/7599588998/ (visited on 10/12/2016).

[39] Sweet Chili Arts. Open source free culture creative commons culture pioneers (Slide: Open Source). Jan. 18, 2012. URL: https://www.flickr.com/photos/74611013@N02/6721910825 (visited on 11/26/2014).

[40] sxc. Pot of gold (Slide: Thank you for listening). URL: http://www.freepik.com/free-photo/pot-of-gold_633785.htm (visited on 10/24/2016).

[42] Tara Schmidt. Ice (Slide: Cool in R). URL: https://www.flickr.com/photos/taramarie/16012526920/ (visited on 10/21/2016).


Page 90: Data Alchemy: Turn your Data into Gold

[43] taymtaym. Girl in Latex Suit (Slide: LaTeX). URL: https://www.flickr.com/photos/taymtaym/13663386063.

[44] twitter.com/mattwi1s0n. Internet Email (Slide: E-Mail). URL: https://www.flickr.com/photos/piccadillywilson/68766132 (visited on 10/26/2016).

All trademarks, trade names, product names and logos appearing in this presentation are the property of their respective owners, including in some instances dkd. Any rights not expressly granted herein are reserved.
