Data Alchemy: Turn your Data into Gold

Post on 19-Jan-2017

Data Alchemy Turn your Data into Gold

Søren Schaffstein, CEO, dkd Internet Service GmbH

Disclaimer

The server cannot meet the requirements of the Expect request-header field.

417 Expectation Failed

• Statistics lesson

• Cat pictures

• Bullet lists

• Overview of useful tools for data analysis and manipulation

• Very easy code examples

• Real world applications

Expectation Failed

Slideshare

References & Credits

Me, myself and I

Søren Schaffstein, CEO of dkd Internet Service GmbH

soeren.schaffstein@dkd.de

Where’s the Data?

Close your eyes and imagine…

Accounting

Time Records

Estimations

Product Specifications

Which tools do you use?

Getting data into shape

There’s always Excel

Excel has its limitations

Worksheet size limit: 1,048,576 rows by 16,384 columns

Microsoft keeps a dedicated page with the limits of Excel. This page has nearly 90 entries.

What is R?

18th letter of the modern English alphabet

TO ERR IS HUMAN,

TO ARR IS PIRATE

R

https://www.r-project.org/about.html

»R is a language and environment for statistical computing and graphics.«

Starting up R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

Without the welcome message

>

Arithmetic

> 25 + 17
[1] 42

Fun starts with vectors

> x <- c(1, 2, 3)
> y <- c(10, 20, 30)
> x + y
[1] 11 22 33
> sum(x)
[1] 6

Let’s work with more data

> elements
  element symbol protons weight density
1    Gold     Au      79 196.97   19300
2  Silver     Ag      47 107.87   10490
3 Mercury     Hg      80 200.59   13534
4    Lead     Pb      82 207.20   11340

> max(elements$density)
[1] 19300

> elements$density2 <- elements$density / 1000

density2
  19.300
  10.490
  13.534
  11.340
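A table like `elements` can be typed in by hand. The sketch below rebuilds it from the values printed on the slide. Building it with `data.frame()` is an assumption, since the slides do not show how the data was loaded, and the density unit is not stated either.

```r
# Rebuild the 'elements' table from the values printed on the slide.
# How the original data was loaded is not shown; data.frame() is an assumption.
elements <- data.frame(
  element = c("Gold", "Silver", "Mercury", "Lead"),
  symbol  = c("Au", "Ag", "Hg", "Pb"),
  protons = c(79, 47, 80, 82),
  weight  = c(196.97, 107.87, 200.59, 207.20),
  density = c(19300, 10490, 13534, 11340),  # unit not given; presumably kg/m^3
  stringsAsFactors = FALSE
)

# Derived column, computed for all rows at once, as on the slide.
elements$density2 <- elements$density / 1000

# Name of the element with the highest density.
densest <- elements$element[which.max(elements$density)]
print(densest)
```

The division works on the whole column at once, which is the same vector arithmetic shown for `x + y`.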

Geiger Counter

Libelium Geiger Counter Shield

• Counts pulses detected in tube

• Calculates “counts per minute”

• Calculates radiation dose in μSv/h

• Logs reading every 10 seconds

• 8,640 value pairs per 24h
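The figure of 8,640 value pairs follows directly from the 10-second logging interval. As a rough sketch, the same arithmetic also shows how long it would take to fill one Excel worksheet at the row limit quoted earlier; the variable names are illustrative only.

```r
# One reading every 10 seconds, as stated on the slide.
interval_s <- 10
readings_per_day <- 24 * 60 * 60 / interval_s
print(readings_per_day)   # 8640

# Whole days of readings that fit into one Excel worksheet
# (1,048,576 rows), ignoring any header row.
excel_row_limit <- 1048576
days_until_full <- floor(excel_row_limit / readings_per_day)
print(days_until_full)
```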

Some Geiger Counter records

> tail(geiger)
     Timestamp CPM    uSv            DateTime
432 1477302659  48 0.3898 2016-10-24 11:50:59
433 1477302669  24 0.1949 2016-10-24 11:51:09
434 1477302679  30 0.2436 2016-10-24 11:51:19
435 1477302689  36 0.2923 2016-10-24 11:51:29
436 1477302699  30 0.2436 2016-10-24 11:51:39
437 1477302709  18 0.1462 2016-10-24 11:51:49

> mean(geiger$CPM)
[1] 30.94737

Select your preferred records

> subset(geiger, geiger$CPM > 60)
     Timestamp CPM    uSv            DateTime
15  1477298489  66 0.5359 2016-10-24 10:41:29
207 1477300409  66 0.5359 2016-10-24 11:13:29
379 1477302129  66 0.5359 2016-10-24 11:42:09
431 1477302649  66 0.5359 2016-10-24 11:50:49
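The slides do not show how the `uSv` and `DateTime` columns were produced. The sketch below retypes the six rows printed by `tail(geiger)` and derives both columns. The conversion factor 0.00812 is inferred from the printed values (every row is consistent with it) and the time zone is a guess; neither is stated on the slides.

```r
# The six rows printed by tail(geiger), retyped by hand.
geiger <- data.frame(
  Timestamp = c(1477302659, 1477302669, 1477302679,
                1477302689, 1477302699, 1477302709),
  CPM       = c(48, 24, 30, 36, 30, 18)
)

# Conversion factor inferred from the printed rows (assumption).
usv_per_cpm <- 0.00812
geiger$uSv <- round(geiger$CPM * usv_per_cpm, 4)

# Unix timestamps to date-times; the time zone is an assumption.
geiger$DateTime <- as.POSIXct(geiger$Timestamp, origin = "1970-01-01",
                              tz = "Europe/Berlin")

# Same kind of filter as on the slide, with a lower threshold because
# none of these six rows exceeds 60 CPM.
busy <- subset(geiger, CPM > 30)
print(busy)
```

Inside `subset()` the column can be named directly (`CPM > 30`); the slide's `geiger$CPM > 60` selects rows in the same way.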

Graphics, of course

Cool in R

Package Repository

It’s quite fast

Can process a lot of data

Comparing Excel to R

Excel R

But this is only the beginning…

Let’s add more Tools

Did anyone say “Swiss Army Knife”?

RStudio

Debugging Tools

Console

Syntax-highlighting editor

Direct code execution in editor

File Browser

Package Manager

OpenCPU

»The OpenCPU server provides a reliable and interoperable HTTP API for data analysis based on R.«

– https://www.opencpu.org/

Access R via JavaScript

// perform request to OpenCPU
var req = ocpu.call("getOvertime", {
  username : username
}, function(output) {
  // show the returned message in the page element with id "output"
  $("#output").text(output.message);
});

LaTeX

pronounced: /ˈlɑːtɛk/ LAH-tek

LaTeX – A document preparation system

LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation.

– www.latex-project.org

A very short introduction into LaTeX

Like in a nutshell

How to create a PDF with LaTeX?

A really simple LaTeX document

\documentclass{report}

\begin{document}

A Long-expected Party

\end{document}

A simple LaTeX document

\documentclass{report}

\begin{document}

\chapter{A Long-expected Party}

When Mr. Bilbo Baggins of Bag End announced that he would shortly be celebrating his eleventy-first birthday with a party of special magnificence, there was much talk and excitement in Hobbiton.

\end{document}

A still simple LaTeX document

\documentclass{dkd-technical-documentation}

\begin{document}

\chapter{A Long-expected Party}

When Mr. Bilbo Baggins of Bag End announced that he would shortly be celebrating his eleventy-first birthday with a party of special magnificence, there was much talk and excitement in Hobbiton.

\end{document}

Why LaTeX and R?

• LaTeX seamlessly integrates with R

• Place R variables directly in LaTeX templates

• Write R code in LaTeX templates
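The slides do not name the mechanism used to combine the two (Sweave and knitr are common choices). As a minimal sketch of the idea, the base R code below fills a placeholder in a LaTeX template with a computed value. The `@mean_cpm@` placeholder syntax and the report text are hypothetical, invented for this illustration.

```r
# Hypothetical LaTeX template; "@mean_cpm@" is a made-up placeholder,
# not a LaTeX, Sweave or knitr convention.
template <- c(
  "\\documentclass{report}",
  "\\begin{document}",
  "\\chapter{Geiger Counter Report}",
  "The mean count rate was @mean_cpm@ counts per minute.",
  "\\end{document}"
)

# Value computed in R; 30.94737 is the mean shown on the Geiger slide.
mean_cpm <- 30.94737

# Replace the placeholder with a formatted number.
tex <- gsub("@mean_cpm@", sprintf("%.1f", mean_cpm), template, fixed = TRUE)

# Write the filled-in document; turning it into a PDF would be a
# separate LaTeX run outside this sketch.
out_file <- file.path(tempdir(), "report.tex")
writeLines(tex, out_file)
```

Keeping the template separate from the numbers means the same layout can be reused for every report run.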

Freeboard

All on one dashboard

The Big Picture

How everything clicks together

[Diagram: Data Sources (CSV, XML, Excel, JSON); R with RStudio, OpenCPU, REST-API and Web Frontend; Target Format (CSV, XML, Excel, JSON, LaTeX PDF, Other tools)]
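One path through this picture, from a CSV data source through R to a CSV target, could look like the sketch below. The time records, column names and file name are hypothetical and only stand in for real data.

```r
# Hypothetical time records, as they might arrive in a CSV export.
csv_in <- "project,hours
Alpha,2.5
Beta,4.0
Alpha,1.5"

# Data source -> R
records <- read.csv(text = csv_in, stringsAsFactors = FALSE)

# Total hours per project.
totals <- aggregate(hours ~ project, data = records, FUN = sum)

# R -> target format (CSV here; other formats need other writers).
out_file <- file.path(tempdir(), "totals.csv")
write.csv(totals, out_file, row.names = FALSE)
print(totals)
```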

Real Stuff

OK, so far this has been fairly theoretical

Client Report

Overview of a client’s revenues and projects

Timesheet

Special timesheet for an EU project

Uncharged Services

An automated monthly report

Summing it up

What to take home

Versatile Tool

Connecting Tools

dkd offers R Alchemy

Questions

The one question

Can you turn lead into gold?

Chrysopoeia: yes, we can

Chrysopoeia, the artificial production of gold, is the symbolic goal of alchemists. Such transmutation is possible in particle accelerators or nuclear reactors, although the production cost is currently many times the market price of gold.

Thank you for listening

www.dkd.de

dkd Internet Service GmbH . Kaiserstraße 73 . 60329 Frankfurt am Main

References

“Data Alchemy: Turn your Data into Gold”

Author: Søren Schaffstein

Date: 2016-10-27

References

[8] Dynamic Periodic Table. URL: http://www.ptable.com/ (visited on 10/24/2016).

[9] Excel specifications and limits - Excel. URL: https://support.office.com/en-US/article/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3 (visited on 10/19/2016).

[13] freeboard - Dashboards For the Internet Of Things. URL: https://freeboard.io/ (visited on 10/24/2016).

[19] Libelium. Geiger Counter - Radiation Sensor Board for Arduino and Raspberry Pi. URL: https://www.cooking-hacks.com/documentation/tutorials/geiger-counter-radiation-sensor-board-arduino-raspberry-pi-tutorial (visited on 10/24/2016).

[25] R (programming language). Page Version ID: 744362604. Oct. 14, 2016. URL: https://en.wikipedia.org/w/index.php?title=R_(programming_language)&oldid=744362604 (visited on 10/24/2016).

[41] Synthesis of precious metals. Page Version ID: 744728054. Oct. 17, 2016. URL: https://en.wikipedia.org/w/index.php?title=Synthesis_of_precious_metals&oldid=744728054 (visited on 10/18/2016).

Images

[1] Alan Levine. Network | Electric junctions on the Pine AZ Senior Thrift Ce… (Slide: Connecting Tools). URL: https://www.flickr.com/photos/cogdog/4317096083/ (visited on 10/26/2016).

[2] Alexander. pocket army knife (Slide: Versatile Tool). URL: https://de.fotolia.com/id/956509.

[3] Andras Kovacs. Freepik | gold Photo (Slide: Chrysopoeia: yes, we can). URL: http://www.freepik.com/index.php?goto=41&idd=39995&url=aHR0cDovL3d3dy5zeGMuaHUvcGhvdG8vMjI3OTg3# (visited on 10/24/2016).

[4] Andreia. 700 years (Slide: Package Repository). URL: https://www.flickr.com/photos/iseecat/15045038073/ (visited on 10/24/2016).

[5] Brian Shamblen. 24 Hours of LeMons (Slide: It’s quite fast). URL: https://www.flickr.com/photos/23972840@N04/11283383556/ (visited on 10/24/2016).

[6] Brooke’s Bargains. Fisher-Price HandyManny Ripp Chain Saw (Slide: Comparing Excel to R). URL: http://www.brookesbargains.com/handy-manny-rip-chainsaw-and-flicker-flashlight-only-4-99-each/.

[7] Dan Backman. Apothecary | Aria, North Beach, SF (Slide: dkd offers R Alchemy). URL: https://www.flickr.com/photos/dbackmansfo/4716004445/ (visited on 10/26/2016).

[10] eynermedia. Chemistry | Alchemist laboratory interior (Slide: Presentation Title). URL: https://www.flickr.com/photos/89228431@N06/11080396405/ (visited on 10/24/2016).

[11] frau-Vogel. jemand hatte ... (Slide: The one question). URL: https://www.flickr.com/photos/frau-vogel/3773399470 (visited on 10/24/2016).

[12] Freeboard. freeboard - Dashboards For the Internet Of Things (Slide: Freeboard). URL: https://freeboard.io/ (visited on 10/24/2016).

[14] Freepik. Blue alchemy symbols badges (Alchemy symbols in background). URL: http://www.freepik.com/free-vector/blue-alchemy-symbols-badges_850849.htm (visited on 10/12/2016).

[15] Freepik. Hand drawn alchemy elements (Alchemy symbols in background). URL: http://www.freepik.com/free-vector/hand-drawn-alchemy-elements_849702.htm (visited on 10/24/2016).

[16] Henry Burrows. Ulsan Express (Slide: Can process a lot of data). URL: https://www.flickr.com/photos/foilman/15587276942/ (visited on 10/24/2016).

[17] Joan Kimball. Tasche Uhren in einem Bündel Stockfoto (Slide: Time Records). URL: http://www.istockphoto.com/de/foto/tasche-uhren-in-einem-b%C3%BCndel-gm110908324-1782503?st=_p_1782503.

[18] Joshua Lyon. Microsoft Excel Error Message (Slide: Excel has its limitations). URL: http://boshdirect.com/Blogs/Tech/microsoft-excel-error-message.html (visited on 10/19/2016).

[20] Libelium. Geiger Counter - Radiation Sensor Board for Arduino (Slide: Libelium Geiger Counter Shield). URL: https://www.cooking-hacks.com/documentation/tutorials/geiger-counter-radiation-sensor-board-arduino-raspberry-pi-tutorial (visited on 10/24/2016).

[21] Mathias Pastwa. social network hub (Slide: Slideshare). URL: https://www.flickr.com/photos/mpastwa/2671066786/ (visited on 10/12/2016).

[22] MS880 - Top-Modell: Imposante 6,4kW-Hochleistungssäge (Slide: Comparing Excel to R). URL: http://www.stihl.de/produkt.aspx?idModel=658&idMarketingGroup=1582&realurl=/STIHL-Produkte/Motors%C3%A4gen-und-Kettens%C3%A4gen/S%C3%A4gen-f%C3%BCr-die-Forstwirtschaft/2658-1582/MS-880.aspx (visited on 10/19/2016).

[23] NASA Goddard Space Flight Center. Antares Rocket With Cygnus Spacecraft Launches (Slide: This is only the beginning!). URL: https://www.flickr.com/photos/gsfc/9807812154 (visited on 11/27/2014).

[24] Quinn Dombrowski. Beer sampler (Slide: Questions). URL: https://www.flickr.com/photos/quinndombrowski/5200218267/ (visited on 10/24/2016).

[26] Rafael Araujo. Rafael Araujo - Calculation 20 (Slide: Product Specifications). Sept. 10, 2014. URL: https://www.flickr.com/photos/eager/15009713669/ (visited on 10/19/2016).

[27] Rob Shenk. Tools of the Trade (Slide: RStudio). URL: https://www.flickr.com/photos/rcsj/8060829057/ (visited on 10/24/2016).

[28] Søren Schaffstein. Climbing (Slide: Climbing).

[29] Søren Schaffstein. Donut (Slide: Photography).

[30] Søren Schaffstein. Glass of Sweets (Slide: Estimations).

[31] Søren Schaffstein. Gold Coins (Slide: Accounting).

[32] Søren Schaffstein. Half-Pint Heroes (Slide: Board Game Author).

[33] Søren Schaffstein. Illuminated Bicycle (Slide: Travelling).

[34] Søren Schaffstein. Pizza (Slide: About dkd).

[35] Søren Schaffstein. Protected Place (Slide: Disclaimer).

[36] Sebastiaan ter Burg. Left channel mono - Stereo - right channel mono selector p… (Slide: OpenCPU). URL: https://www.flickr.com/photos/ter-burg/14831362160/ (visited on 10/24/2016).

[37] Staffan Scherz. Art of Transportation (Slide: What to take home). URL: https://www.flickr.com/photos/staffanscherz/6161284551/ (visited on 10/22/2016).

[38] Steve Jurvetson. Civil Defense (Slide: Geiger Counter). URL: https://www.flickr.com/photos/jurvetson/7599588998/ (visited on 10/12/2016).

[39] Sweet Chili Arts. Open source free culture creative commons culture pioneers (Slide: Open Source). Jan. 18, 2012. URL: https://www.flickr.com/photos/74611013@N02/6721910825 (visited on 11/26/2014).

[40] sxc. Pot of gold (Slide: Thank you for listening). URL: http://www.freepik.com/free-photo/pot-of-gold_633785.htm (visited on 10/24/2016).

[42] Tara Schmidt. Ice (Slide: Cool in R). URL: https://www.flickr.com/photos/taramarie/16012526920/ (visited on 10/21/2016).

[43] taymtaym. Girl in Latex Suit (Slide: LaTeX). URL: https://www.flickr.com/photos/taymtaym/13663386063.

[44] twitter.com/mattwi1s0n. Internet Email (Slide: E-Mail). URL: https://www.flickr.com/photos/piccadillywilson/68766132 (visited on 10/26/2016).

All trademarks, trade names, product names and logos appearing in this presentation are the property of their respective owners, including in some instances dkd. Any rights not expressly granted herein are reserved.
