A survey of "2013 Data Science Salary Survey"

Post on 19-Aug-2014

I'm Japanese. I wrote a survey of O'Reilly's "2013 Data Science Salary Survey" and of the differences between the US and Japan.

A survey of "2013 Data Science Salary Survey"

2014/04/26 Tokyo Webmining / @showyou

2013 Data Science Salary Survey
http://strata.oreilly.com/2014/01/2013-data-science-salary-survey.html

+My Comments

Most figures come from this survey.

Abstract

Agenda

About me
Survey
Comments / my opinions

About me

Data mining engineer
Hadoop (Pig, Hive, Cloudera Hue)
BI tool, JIRA, Confluence, git
Python, machine learning, NLP (Lang), (R, js, highchart, C++, Java)

Status: Looking for a new job
Previous: Hikarie

Not a job consultant or recruiter
http://about.me/showyou41

Summary of this paper

OSS (Python, R) > Traditional tools (SAS, Excel)
Traditional tools are used in relative isolation
Wider variety of tools, higher salary
Big data = higher salary

Respondents

Attendees of two Strata conferences (New York 2012 and Santa Clara 2013)
Respondents and their age range (US) -> Most respondents are in their 30s or 40s

The jobs of respondents(1)

Top 10 industries -> Startup: 1/5
Median salary: Startup > Public > Private > Gov

The jobs of respondents(2)

Most respondents (56%) describe themselves as data scientists / analysts.

Tool Usage

Tool usage

SQL/RDB is top
R, Python > Excel

Tool correlations

Orange: Group "Hadoop"
Blue: Group "SQL/Excel"
Red: Neither

Tools (Hadoop)

Tools (SQL/Excel)

Not correlated

Median Salary vs Tools

Salary vs Hadoop or SQL/Excel

Salary & Tools

Comment or my opinion

Questions about the categorization

Orange vs Blue seems correct, but Red is doubtful
e.g. JavaScript vs D3.js, VBA vs C#, Python vs Ruby, Pentaho vs Tableau, ...

What is data scientist?

What are the differences between a data scientist and an analyst?
The definitions of data scientist in the U.S. and JP are different.
U.S.: O'Reilly http://radar.oreilly.com/2010/06/what-is-data-science.html

JP: Nikkei http://itpro.nikkeibp.co.jp/article/Keyword/20130614/485142/

The Japanese definition often drops the engineering side

Indeed search (US)

Keyword          Low        High        Mean
Hadoop           $60,000+   $140,000+   $81,300
Hive             $60,000+   $140,000+   $80,400
SAS              $50,000+   $130,000+   $72,100
Data scientist   $50,000+   $130,000+   $72,000
Excel            $30,000+   $110,000+   $51,200

San Francisco Bay Area

Strata survey: over 50% are tech leads or executives

Indeed search (Tokyo, JP), in million yen

Keyword          Low     High     Mean
Hadoop           5.00+   13.00+   6.27
Hive             5.00+   13.00+   6.68
SAS              4.00+   12.00+   6.13
Data scientist   4.00+   12.00+   5.81
Excel            3.00+   11.00+   4.40

Salary US vs JP (1$ = 102.5 Yen)

Keyword          US($)    JP(m Yen)   JP($)    US/JP
Hadoop           81,300   6.27        61,200   1.33
Hive             80,400   6.68        65,200   1.23
SAS              72,100   6.13        59,800   1.21
Data scientist   72,000   5.81        56,700   1.27
Excel            51,200   4.40        42,900   1.19
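The JP($) and US/JP columns follow from the two Indeed tables and the exchange rate in the slide title. A minimal Python sketch of that arithmetic, assuming the JP($) values were rounded to the nearest $100 (the slide does not state the rounding); the names `YEN_PER_USD`, `MEAN_SALARIES`, and `compare` are illustrative only:

```python
# Exchange rate taken from the slide title (1 USD = 102.5 JPY).
YEN_PER_USD = 102.5

# Mean salaries from the two Indeed tables:
# keyword -> (US mean in USD, Tokyo mean in million yen).
MEAN_SALARIES = {
    "Hadoop": (81_300, 6.27),
    "Hive": (80_400, 6.68),
    "SAS": (72_100, 6.13),
    "Data scientist": (72_000, 5.81),
    "Excel": (51_200, 4.40),
}


def compare(us_usd, jp_million_yen, yen_per_usd=YEN_PER_USD):
    """Return (JP mean in USD, rounded to the nearest 100; US/JP ratio)."""
    # Assumption: rounding to the nearest $100, which matches the slide's figures.
    jp_usd = round(jp_million_yen * 1_000_000 / yen_per_usd, -2)
    return jp_usd, round(us_usd / jp_usd, 2)


rows = {keyword: compare(*means) for keyword, means in MEAN_SALARIES.items()}
for keyword, (jp_usd, ratio) in rows.items():
    print(f"{keyword:<16}{jp_usd:>8,.0f}{ratio:>7.2f}")
```

Changing `YEN_PER_USD` shows how sensitive the US/JP ratio is to the exchange rate chosen for the comparison.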

Costs US vs JP

U.S.
House (California, Bay Area, 1 bedroom, Sep 2013): $2192~$2800
Food: $8~$12~ + 15% tip

JP
House (Tokyo, 1 room under 30 m^2, Apr 2014): 20k~150k yen
Food: 500~1500 yen

US = JP * 1.2 or 1.5

Appendix

Hive

http://hive.apache.org/
SQL-like language for Hadoop
Converts HiveQL to MapReduce when you execute a Hive query
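To give a feel for what turning a query into MapReduce means, here is a rough Python sketch of the map, shuffle, and reduce steps for a query shaped like SELECT tool, COUNT(*) FROM responses GROUP BY tool. The `responses` table and its rows are hypothetical, and this is only an illustration of the idea, not the plan Hive actually generates:

```python
from collections import defaultdict

# Hypothetical rows of a "responses" table: (respondent_id, tool).
responses = [
    (1, "SQL"), (2, "R"), (3, "SQL"),
    (4, "Python"), (5, "R"), (6, "SQL"),
]


def map_phase(rows):
    """Emit one (tool, 1) pair per row, as a mapper for COUNT(*) would."""
    for _, tool in rows:
        yield tool, 1


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


counts = reduce_phase(shuffle(map_phase(responses)))
print(counts)
```

On a real cluster the map and reduce steps run in parallel across many machines; here everything runs in one process to keep the sketch short.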

http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-live.html

R language
R is a free software environment for statistical computing and graphics
http://www.r-project.org/

e.g.
$ R
> demo(graphics)

Tableau

http://www.tableausoftware.com/
BI tool (commercial)

Pentaho

http://www.pentaho.com/
http://www.pentaho-partner.jp/
BI tool (free/commercial)

SAS

http://www.sas.com/en_us/software/sas9.html
Analytics tool
cf. SPSS

http://www.sas.com/offices/NA/canada/en/resources/screenshot/sas-marketing-optimization-2-full.jpg

D3(.js)

http://d3js.org/
http://ja.d3js.node.ws/
Rendering library for JavaScript

backbone.js