A survey of "2013 data science salary survey"

  • Date posted

    19-Aug-2014
  • Category

    Engineering

description

I'm Japanese. I wrote a survey of O'Reilly's "2013 Data Science Salary Survey" and of the differences between the US and JP.

Transcript of A survey of "2013 data science salary survey"

  • A survey of "2013 Data Science Salary Survey" 2014/04/26 Tokyo Webmining / @showyou 1/32
  • Abstract: 2013 Data Science Salary Survey (http://strata.oreilly.com/2014/01/2013-data-science-salary-survey.html); my comments. Most figures come from this survey. 2/32
  • Agenda: About me; Survey; Comments / my opinions 3/32
  • About me: Data mining engineer. Hadoop (Pig, Hive, Cloudera Hue); BI tool, JIRA, Confluence, git; Python, machine learning, NLP (Lang), (R, js, Highcharts, C++, Java). Status: looking for a new job. Previous: Hikarie. Not a job consultant or recruiter. http://about.me/showyou41 4/32
  • Summary of this paper: OSS (Python, R) > traditional tools (SAS, Excel). Traditional tools are used in relative isolation. Wider variety of tools, higher salary. Big data = higher salary. 5/32
  • Respondents: attendees of two Strata conferences (New York 2012 and Santa Clara 2013). Members & range of ages in the US -> most respondents are in their 30s or 40s. 6/32
  • The jobs of respondents (1): Top 10 industries -> Startup 1/5. Median salary: Startup > Public > Private > gov. 7/32
  • The jobs of respondents (2): Most respondents (56%) describe themselves as data scientists/analysts. 8/32
  • Tool Usage 9/32
  • Tool usage: SQL/RDB is top; R, Python > Excel. 10/32
  • Tool correlations: Orange: Hadoop group; Blue: SQL/Excel group; Red: neither. 11/32
  • Tools(hadoop) 12/32
  • Tools (SQL/Excel): not correlated 13/32
  • Median Salary vs Tools 14/32
  • Salary vs Hadoop or SQL/Excel 15/32
  • Salary & Tools 16/32
  • Comment or my opinion 17/32
  • Questions about the categorization: Orange vs Blue seems correct, but Red is doubtful, e.g. JavaScript vs D3.js, VBA vs C#, Python vs Ruby, Pentaho vs Tableau,... 18/32
  • What is a data scientist? What are the differences between a data scientist and an analyst? The definitions of data scientist in the U.S. and JP are different. U.S.: O'Reilly http://radar.oreilly.com/2010/06/what-is-data-science.html JP: Nikkei http://itpro.nikkeibp.co.jp/article/Keyword/20130614/485142/ The Japanese definition often drops the engineering side. 19/32
  • Indeed search (US) 20/32
    Keyword         Low       High       Mean
    Hadoop          $60,000+  $140,000+  $81,300
    Hive            $60,000+  $140,000+  $80,400
    SAS             $50,000+  $130,000+  $72,100
    Data scientist  $50,000+  $130,000+  $72,000
    Excel           $30,000+  $110,000+  $51,200
    San Francisco Bay Area. Strata survey: over 50% are tech leads or executives.
  • Indeed search (Tokyo, JP), in million yen 21/32
    Keyword         Low    High    Mean
    Hadoop          5.00+  13.00+  6.27
    Hive            5.00+  13.00+  6.68
    SAS             4.00+  12.00+  6.13
    Data scientist  4.00+  12.00+  5.81
    Excel           3.00+  11.00+  4.40
  • Salary US vs JP (1$ = 102.5 Yen) 22/32
    Keyword         US($)   JP(m Yen)  JP($)   US/JP
    Hadoop          81,300  6.27       61,200  1.33
    Hive            80,400  6.68       65,200  1.23
    SAS             72,100  6.13       59,800  1.21
    Data scientist  72,000  5.81       56,700  1.27
    Excel           51,200  4.40       42,900  1.19
  • Costs US vs JP 23/32
    U.S.: Housing (California, Bay Area, 1 bedroom, Sep 2013) $2,192 ~ $2,800; Food $8 ~ $12~ + tip 15%
    JP: Housing (Tokyo, 1 room under 30 m^2, Apr 2014) 20k ~ 150k Yen; Food 500 ~ 1,500 Yen
    US = JP * 1.2 or 1.5
  • References 24/32
    http://strata.oreilly.com/2014/01/2013-data-science-salary-survey.html
    http://radar.oreilly.com/2010/06/what-is-data-science.html
    http://www.datascientist.or.jp/
    http://priceonomics.com/the-rise-of-bay-area-rent-prices/
    http://www.indeed.com/
    http://jp.indeed.com/
  • Appendix 25/32
  • Tool usage in Tokyo Webmining #35: Everyone uses Excel (but I don't know whether for data mining or not). JavaScript / SAS / SPSS usage is higher, Hadoop is lower. A few Hadoop developers joined Tokyo Webmining (they often join Hadoop Code Reading). 26/32
  • Hive http://hive.apache.org/ SQL-like language for Hadoop. Converts HiveQL to MapReduce when you execute a Hive query. http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-live.html 27/32
  • R language R is a free software environment for statistical computing and graphics http://www.r-project.org/ e.g. $ R > demo(graphics) 28/32
  • Tableau http://www.tableausoftware.com/ BI Tool (Commercial) 29/32
  • Pentaho http://www.pentaho.com/ http://www.pentaho-partner.jp/ BI Tool (Free/Commercial) 30/32
  • SAS http://www.sas.com/en_us/software/sas9.html Analytics tool cf. SPSS http://www.sas.com/offices/NA/canada/en/resources/screenshot/sas-marketing-optimization-2-full.jpg 31/32
  • D3(.js) http://d3js.org/ http://ja.d3js.node.ws/ Rendering Library for JavaScript backbone.js 32/32
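
The US vs JP salary comparison on slide 22 is plain arithmetic, so it can be rechecked with a few lines of Python. The sketch below is not part of the slides. It assumes the mean salaries from the two Indeed searches (slides 20 and 21) and the 1$ = 102.5 Yen rate from slide 22. The names `YEN_PER_USD`, `jp_salary_in_usd` and `us_jp_ratio` are made up for illustration, and rounding the dollar value to the nearest $100 is an assumption chosen because it agrees with the figures on the slide.

```python
# Minimal sketch (not from the slides): recompute the JP($) and US/JP columns.
# All names here are illustrative; the numbers are copied from slides 20-22.

YEN_PER_USD = 102.5  # exchange rate stated on slide 22

# Mean salaries from the Indeed searches: US in dollars, Tokyo in million yen.
US_MEAN_USD = {
    "Hadoop": 81_300,
    "Hive": 80_400,
    "SAS": 72_100,
    "Data scientist": 72_000,
    "Excel": 51_200,
}
JP_MEAN_MILLION_YEN = {
    "Hadoop": 6.27,
    "Hive": 6.68,
    "SAS": 6.13,
    "Data scientist": 5.81,
    "Excel": 4.40,
}


def jp_salary_in_usd(million_yen, yen_per_usd=YEN_PER_USD):
    """Convert a salary in million yen to dollars, rounded to the nearest $100."""
    return int(round(million_yen * 1_000_000 / yen_per_usd, -2))


def us_jp_ratio(us_usd, million_yen, yen_per_usd=YEN_PER_USD):
    """Ratio of the US mean to the converted JP mean, rounded to 2 decimals."""
    return round(us_usd / jp_salary_in_usd(million_yen, yen_per_usd), 2)


if __name__ == "__main__":
    for keyword, us_usd in US_MEAN_USD.items():
        jp_m_yen = JP_MEAN_MILLION_YEN[keyword]
        print(keyword, us_usd, jp_m_yen,
              jp_salary_in_usd(jp_m_yen), us_jp_ratio(us_usd, jp_m_yen))
```

Changing `YEN_PER_USD` shows how sensitive the US/JP ratio is to the exchange rate, which matters when setting it against the cost-of-living estimate on slide 23.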