Megatrend and Intervention

Impact Analyzer for Jobs

Toomas Kirt

On behalf of Estonian hackathon team

Outline

Background

Data

Tools

Solution

Toomas Kirt 15 June, 2017

Background

The European Big Data Hackathon took place in Brussels

from 13 to 15 March 2017 and was organised by the

European Commission (Eurostat).

The policy question for the hackathon: How would you

support the design of policies for reducing mismatch

between jobs and skills at regional level in the EU through

the use of data?


Contributions

Innar Liiv – key ideas and presentation

Rain Öpik – PostgreSQL and programming the visualization tool

Toomas Kirt – Hadoop and organization


Sources of this presentation


https://github.com/rainopik/eubdhack-megatrend

https://rainopik.github.io/eubdhack-megatrend/

Motivation – changes in the job market

It is found that across

the OECD countries,

on average 9 % of jobs

are automatable (Arntz,

Gregory and Zierahn

2016).


Not all jobs are lost

As Lerman and Schmidt (2005) found, between the appearance of the first personal computers in the mid-seventies and 1983, computer industry jobs in the United States grew almost 80 percent, while total U.S. manufacturing employment increased by only 4 percent.

But the new jobs need new skills.


https://www.wired.com/2012/12/ff-robots-will-take-our-jobs/

The main questions

What is the impact of a megatrend or an intervention on the labour market?

Which parts of the labour market, and in which country, are most vulnerable to an approaching megatrend or a planned intervention?

The main contributions

The development of a method to represent the complex internal structure of the labour market from the perspective of occupations sharing skills.

Developing and presenting the prototype.


Solutions - Graph tools

The Real Difference

Between Google And

Apple


https://www.fastcodesign.com/3068474/the-real-difference-between-google-and-apple

Datasets

EURES CV and job vacancy dataset (see http://eures.europa.eu/);

ESCO RDF, converted to a relational structure suitable for SQL (European Commission 2013);

List of Jobs Susceptible to Automation / Computerization (Frey and Osborne 2017);

Occupation classifications mapping table from Occupation classifications crosswalks - from O*NET-SOC to ISCO (Wojciech, Autor and Acemoglu 2016).


EURES

To develop our prototype, we used CV and job vacancy data from the EURES portal together with the ESCO datasets. ESCO is the multilingual classification of European Skills, Competences, Qualifications and Occupations. The EURES data consist of two datasets: one of curricula vitae (4.7 million lines) posted by jobseekers and another of job vacancies (35 million lines) published by potential employers.


ESCO database

The ESCO system provides occupational profiles showing the relationships between occupations, skills, competences and qualifications (European Commission 2013). The ESCO dataset provided 65814 relationships between skills and occupations; it contained 619 ISCO and 2950 ESCO occupations.


https://www.slideshare.net/lod2project/esco-a-tool-to-facilitate-online-skills-matching-throughout-europe-2612011-brussels-belgium

Occupation classifications mapping

To demonstrate the visualization of this megatrend on the labour market, the list of jobs susceptible to automation was extracted from scientific articles. The O*NET-SOC standard was then mapped to ISCO in order to link the job data with the datasets provided by the European Big Data Hackathon.


Tools


The data processing pipeline


The occupation graph was built with PostgreSQL. Data was stored in two denormalized tables: g_link, which links similar occupations together, and g_node, which annotates occupations with supply and demand data.

The number of unique job seekers and vacancies by occupation and country was counted with Hive.

The ESCO classifier was originally provided in RDF format, as a list of semantic triples in subject-predicate-object form, and was converted to a relational structure suitable for SQL.

The visualizer was designed to work without a server, so all the data was converted to csv files.
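The conversion from subject-predicate-object triples to relational rows and then to csv can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the predicate names `hasLabel` and `requiresSkill`, the column names and the identifiers are all hypothetical and do not come from the ESCO vocabulary.

```python
import csv
import io

# Hypothetical predicate names; the real ESCO RDF vocabulary is not shown here.
HAS_LABEL = "hasLabel"
REQUIRES_SKILL = "requiresSkill"


def triples_to_tables(triples):
    """Split subject-predicate-object triples into two relational tables."""
    labels = {}       # concept id -> label
    requires = set()  # (occupation id, skill id) pairs, de-duplicated
    for subject, predicate, obj in triples:
        if predicate == HAS_LABEL:
            labels[subject] = obj
        elif predicate == REQUIRES_SKILL:
            requires.add((subject, obj))
    return sorted(labels.items()), sorted(requires)


def to_csv(header, rows):
    """Serialize rows as csv text so that a static page can load them."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up identifiers, for illustration only.
triples = [
    ("occ:1", HAS_LABEL, "truck driver"),
    ("skill:7", HAS_LABEL, "drive vehicles"),
    ("occ:1", REQUIRES_SKILL, "skill:7"),
    ("occ:1", REQUIRES_SKILL, "skill:7"),  # duplicate triple
]
label_rows, skill_rows = triples_to_tables(triples)
print(to_csv(["occupation_id", "skill_id"], skill_rows))
```

Each predicate of interest becomes one table, which is what makes the data queryable with plain SQL joins.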


Occupation graph

A graph is defined by two entities:

Node - denotes an ESCO occupation. Each occupation may have additional data attributes attached to it.

Link - two nodes (occupations) are connected when they are similar to each other.
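The two entities can be modelled with a pair of small record types. The sketch below is only an illustration under assumed names (`Node`, `Link`, `OccupationGraph` and their fields are hypothetical); the prototype itself stores the graph in the PostgreSQL tables g_node and g_link.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An ESCO occupation with optional data attributes."""
    occupation_id: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Link:
    """A connection between two similar occupations."""
    source: str
    target: str
    similarity: float


@dataclass
class OccupationGraph:
    nodes: dict = field(default_factory=dict)  # occupation_id -> Node
    links: list = field(default_factory=list)

    def add_node(self, occupation_id, **attributes):
        self.nodes[occupation_id] = Node(occupation_id, dict(attributes))

    def add_link(self, source, target, similarity):
        # Both endpoints must already exist as nodes.
        for occupation_id in (source, target):
            if occupation_id not in self.nodes:
                raise KeyError(occupation_id)
        self.links.append(Link(source, target, similarity))

    def neighbours(self, occupation_id):
        """Occupations that the given occupation links to."""
        return [link.target for link in self.links
                if link.source == occupation_id]


# Made-up occupations and attribute names, for illustration only.
graph = OccupationGraph()
graph.add_node("A", vacancies=120, jobseekers=40)
graph.add_node("B")
graph.add_link("A", "B", 0.6)
print(graph.neighbours("A"))
```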



Occupation similarity

The similarity between two occupations is the proportion of matching skills: the number of distinct skills required by both occupations (22 in this example) divided by the number of distinct skills required by the first occupation (35).

Only the 3 most similar occupations were kept for every occupation.
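The similarity measure and the top-3 selection can be sketched as below. The function names are hypothetical, and dropping candidates with no shared skills and breaking ties by occupation id are assumptions made for the illustration, not details stated in the slides. Note that the measure is asymmetric, because the denominator is the skill count of the first occupation only.

```python
def similarity(skills_first, skills_second):
    """Share of the first occupation's distinct skills also required by the second."""
    first = set(skills_first)
    if not first:
        return 0.0
    return len(first & set(skills_second)) / len(first)


def top_similar(skills_by_occupation, k=3):
    """For every occupation, keep the k most similar other occupations."""
    result = {}
    for occupation, skills in skills_by_occupation.items():
        scored = []
        for other, other_skills in skills_by_occupation.items():
            if other == occupation:
                continue
            score = similarity(skills, other_skills)
            if score > 0:  # assumption: unrelated occupations are not linked
                scored.append((score, other))
        # Highest score first; ties broken by occupation id (an assumption).
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[occupation] = [(other, score) for score, other in scored[:k]]
    return result


# Synthetic skill ids reproducing the 22-of-35 example from the slide.
first = set(range(35))       # 35 distinct skills
second = set(range(13, 50))  # shares skills 13..34 with the first, i.e. 22 skills
print(similarity(first, second))  # 22 / 35
```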



Annotating occupations with supply and demand data

Each node in the occupation graph denotes an ESCO occupation.

We want to show how each occupation will be affected by automation or computerization, but the list of Jobs Susceptible to Automation originally uses SOC occupation codes.

Mapping ISCO to SOC is one-to-many, which means that some ISCO occupations (e.g. 8332 - Heavy truck and lorry drivers) are associated with several SOC occupations (53-1031 - Driver/Sales Workers and 53-3032 - Heavy and Tractor-Trailer Truck Drivers) that may have differing probabilities of automation (0.98 and 0.79 respectively). To resolve this ambiguity, two probabilities were calculated: the maximum and the average.
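The aggregation over a one-to-many mapping can be sketched as follows, using the truck driver example from the slide. The function name, the dictionary layout and the choice to skip SOC codes without a known probability are assumptions made for the illustration.

```python
from statistics import mean


def aggregate_probabilities(isco_to_soc, soc_probability):
    """Collapse the SOC-level automation probabilities of each ISCO code."""
    result = {}
    for isco, soc_codes in isco_to_soc.items():
        # Assumption: SOC codes without a known probability are skipped.
        known = [soc_probability[code] for code in soc_codes
                 if code in soc_probability]
        if known:
            result[isco] = {"max": max(known), "avg": mean(known)}
    return result


# Values taken from the example on the slide.
isco_to_soc = {"8332": ["53-1031", "53-3032"]}
soc_probability = {"53-1031": 0.98, "53-3032": 0.79}
probabilities = aggregate_probabilities(isco_to_soc, soc_probability)
print(probabilities["8332"])
```

For ISCO 8332 the maximum is 0.98 and the average is (0.98 + 0.79) / 2 = 0.885.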



Graph visualization

The graph visualizer is built with d3.js.

Real-time calculation of the graph layout (the position of every node) with d3.js may be slow for graphs with non-trivial structure. Therefore, for the occupation graph with 2950 nodes and 8838 links, the positions of the graph nodes were pre-calculated.
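The idea of pre-calculating node positions offline can be illustrated with a small spring-embedder simulation. This is a plain-Python stand-in, not d3.js's force simulation and not the method used in the prototype; the function name, parameters and force formulas are assumptions chosen for the sketch. The output could then be saved next to the other csv files so that the browser only has to draw the nodes.

```python
import math
import random


def precompute_layout(node_ids, links, iterations=50, seed=0):
    """Return {node_id: (x, y)} from a simple spring-embedder simulation."""
    node_ids = list(node_ids)
    rng = random.Random(seed)  # fixed seed makes the layout reproducible
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in node_ids}
    k = 1.0 / math.sqrt(max(len(node_ids), 1))  # preferred link length
    step = 0.1
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in node_ids}
        # Every pair of nodes repels (O(n^2), acceptable for an offline run).
        for i, a in enumerate(node_ids):
            for b in node_ids[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Linked nodes attract each other.
        for a, b in links:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node, capped by a step size that shrinks over time.
        for n in node_ids:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            move = min(length, step)
            pos[n][0] += dx / length * move
            pos[n][1] += dy / length * move
        step *= 0.95
    return {n: (round(x, 4), round(y, 4)) for n, (x, y) in pos.items()}


# Made-up node ids, for illustration only.
positions = precompute_layout(["a", "b", "c", "d"], [("a", "b"), ("c", "d")])
print(positions)
```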


https://d3js.org/

Solution

The source code of the prototype is released as open source at https://github.com/rainopik/eubdhack-megatrend

The prototype is available online at https://rainopik.github.io/eubdhack-megatrend/

Google Chrome is the preferred browser.


A close-up of the occupation graph

https://rainopik.github.io/eubdhack-megatrend/

Demand & supply imbalance

The default mode (Show imbalance unchecked) calculates the saturation (“brightness” of the red colour) of the left and the right half of the node on separate scales.

Enabling the Show imbalance mode normalizes both colours on the same scale. This visualizes imbalance: when the left half of the node is brighter red than the right, the job has unsatisfied demand. Conversely, a brighter right half marks jobs with an excess of job seekers.
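One way to read the two colouring modes is sketched below. It rests on several assumptions that are not confirmed by the slides: that the left half encodes demand (vacancies) and the right half supply (job seekers), that saturation is a linear share of a maximum, and that the default mode scales each half against its own maximum while Show imbalance scales both against a common maximum. The function name and data layout are hypothetical.

```python
def saturations(nodes, show_imbalance):
    """Map {occupation: (demand, supply)} to {occupation: (left, right)} in [0, 1]."""
    max_demand = max((demand for demand, _ in nodes.values()), default=0)
    max_supply = max((supply for _, supply in nodes.values()), default=0)
    if show_imbalance:
        # Assumption: one common scale makes the two halves directly comparable.
        max_demand = max_supply = max(max_demand, max_supply)

    def scale(value, maximum):
        return value / maximum if maximum else 0.0

    return {occupation: (scale(demand, max_demand), scale(supply, max_supply))
            for occupation, (demand, supply) in nodes.items()}


# Made-up counts, for illustration only.
counts = {"a": (100, 10), "b": (50, 20)}
print(saturations(counts, show_imbalance=False))
print(saturations(counts, show_imbalance=True))
```

With a common scale, occupation "a" gets a much brighter left half than right half, which is how unsatisfied demand would stand out.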


https://rainopik.github.io/eubdhack-megatrend/

Conclusions

Changes in society in the information age and the creation of huge quantities of data are also creating challenges for national statistics offices.

There are first attempts to use big data sources and generate new types of statistics.

With our tool we provide a new way to foresee changes in the labour market.

To reduce the negative impact of these changes, we need new tools and data sources that let us react accurately and in a timely manner.


THANK YOU!

