WP7 Multi domains - General work plan 2017/2018

Table of Contents

Planned dates of our milestones and deliverable under SGA-2

Work plan of SGA-1

Work plan of SGA-2

  Population - Everyday citizen satisfaction

    Aim and description

    Detailed work plan for SGA-2

  AGRICULTURE – Estimation of Agricultural statistics – pilot case study on crop types based on satellite data

    Aim and description

    Detailed work plan

  TOURISM/BORDER CROSSING – Border movement

    Aim and description

    Detailed work plan

Planned dates of our milestones and deliverable under SGA-2

Deliverable:

The general report for each case study/domain including recommendation on legal aspects, availability and sustainability, methodology, quality and technical requirements – (month 17) – 31.05.2018.

- 1st May 2018 – First submission to Review Board (RB)
- 1st-15th May 2018 – Document analysis by RB
- 15th-22nd May 2018 – Overview of comments and document analysis
- 22nd May 2018 – Sending improved document to RB
- 22nd-31st May 2018 – Time to finish work on deliverable

Milestones:

Progress and technical report of first internal meeting – (month 7) – 31.07.2017

- 30th June 2017 – First submission to RB
- 30th June-17th July 2017 – Document analysis by RB
- 17th-24th July 2017 – Overview of comments and document analysis
- 24th July 2017 – Sending improved document to RB
- 24th-31st July 2017 – Time to finish work on milestone

List of potential pilots and domains with successful implementation potential for further elaboration in the second wave of pilots in 2018 – (month 15) – 30.03.2018

- 1st March 2018 – First submission to RB
- 1st-15th March 2018 – Document analysis by RB
- 15th-22nd March 2018 – Overview of comments and document analysis
- 22nd March 2018 – Sending improved document to RB
- 22nd-30th March 2018 – Time to finish work on milestone

Work plan of SGA-1

Task 1. Data availability/Data inventory

TASK FROM TO

1. Identify Big Data sources taking into account sustainability and availability in several countries. 2016-03-01 2016-07-15

1.1. Establishing an inventory of these sources by: 2016-03-01 2016-06-08

1.1.1. Brainstorming - a review of potential sources, including the 2015 UNSD Survey on the Use of Big Data for Official Statistics 2016-03-01 2016-05-19

1.1.2. Preparation of a questionnaire with questions about the sources used by the project participants 2016-04-29 2016-05-30

1.1.3. Sending the questionnaire to participants 2016-05-30 2016-05-31

1.1.4. Gathering answers and preparation for analysis 2016-05-31 2016-06-08

MILESTONE 1: Progress and technical report of internal WP-meeting 2016-05-24 2016-05-31

1.2. Assessment of the possibility of using sources for Big Data analysis in the domains of population, tourism/border crossings, agriculture 2016-06-01 2016-06-24

Face to face meeting Tallinn 2016-06-13 2016-06-15

WP6&WP7 internal meeting in Warsaw 2016-06-28 2016-06-30

1.3. Build the list of potential sources 2016-06-24 2016-07-15

2. Identify which results or new products from the source-oriented pilots may contribute to these domains 2016-07-15 2016-08-26

2.1. Match the sources from the list of potential sources to each domain 2016-07-15 2016-07-22

2.2. Preliminary analysis of the possibility of using the sources in each domain (legal aspects, availability, methodology, IT, quality) 2016-07-22 2016-08-12

2.3. Build the list of exploitable sources for each domain 2016-08-12 2016-08-26

3. Describe the added value of delivered linkage between these sources to current statistics 2016-08-26 2016-09-30

3.1. Analyze the list of exploitable sources for each domain 2016-08-26 2016-09-09

3.2. Prepare the map of linkages between Big Data sources (e.g. which aspect of one data source can be used in several domains) 2016-09-09 2016-09-20

3.3. Describe the added value for each domain 2016-09-20 2016-09-30

MILESTONE 2: List of available Big Data sources in the domain(s) 2016-09-30 2016-09-30

Task 2. Data feasibility

TASK FROM TO

1. Carry out explorative analyses in order to apply Big Data sources in the domain of population, tourism / border crossings or agriculture 2016-10-01 2016-12-30

1.1. Selection of the most valuable Big Data sources for each domain (evaluation of the legal aspects, availability, methodology, IT, quality) 2016-10-01 2016-10-31

1.2. Analyzing results. 2016-11-02 2016-11-30

Face to face meeting Brussels 2016-11-17 2016-11-18

1.3. Preliminary assessment of the usefulness - developing the assessment factors. 2016-12-01 2016-12-30

2. Selection and recommendation of two or three Big Data sources for use in the domains of population, tourism / border crossings and agriculture. 2017-01-02 2017-02-28

2.1. Preparing the SWOT analysis (positive and negative factors of using several sources) 2017-01-02 2017-01-16

2.2. Recommendation of the most important and useful sources. 2017-01-16 2017-01-31

MILESTONE 3: Recommendation for using two or three Big Data sources in the domain(s) 2017-01-31 2017-01-31

SGA-1 deliverables - report for each domain 2017-02-01 2017-02-28

Submission to the Review Board 2017-02-01 2017-02-01

Document analysis by the Review Board 2017-02-01 2017-02-15

Overview of comments and document analysis 2017-02-15 2017-02-28

Send the document to the Coordinator 2017-02-28 2017-02-28

Face to face meeting Sofia + workshop 2017-02-23 2017-02-24

Work plan of SGA-2

Description of the work package 7

The aim of this work package is to investigate how a combination of Big Data sources and existing official statistical data can be used to improve current statistics and create new statistics in statistical domains. The work package focuses on three statistical domains: Population, Tourism/border crossing and Agriculture. The work package team will describe the data collection, data linking, data processing and methodological aspects of combining data in these domains. The challenges ahead are representativity issues, linking datasets, metadata, international comparability, and long-lasting solutions with sustainable cost.

The main WP7 activities under SGA-2 will concern lessons learned from three use cases and work on concrete and practical examples of Big Data sources. An additional output is the identification and sharing of good practices on using Big Data sources and combining different types of data sources.

In SGA-1, WP7 carried out the following tasks, which provide important input to SGA-2:

Task 1. Data availability/Data inventory (phase 1)

- Identify Big Data sources taking into account sustainability and availability in various countries;

- Identify which results or new products from the source-oriented pilots may contribute to these domains;

- Describe the value added of delivered linkage between these sources to current statistics.

Task 2. Data feasibility (phase 2)

- Carry out explorative analyses in order to apply two or three Big Data sources in the domain of population, tourism/border crossing or agriculture;

- Select and recommend two or three Big Data sources for use in the domains of population, tourism/border crossing and agriculture.

To achieve its main goals under SGA-2, WP7 starts experimental work. For this reason WP7 plans to carry out the following three case studies:

I. Population - Everyday citizen satisfaction.

II. Estimation of Agricultural statistics – pilot case study on crop types based on satellite data.

III. TOURISM/BORDER CROSSING – Border movement.

The pilots should give input to complete the following tasks:

Task 3. Data combination (phases 3 and 4)

1. The experimental work: Data collection; Data preparation; Data analysis.

2. Describe practical, technical and methodological aspects of combining Big Data outputs within the statistical system, for example differences in definitions, populations and volatility.

3. Provide first answers on quality issues when combining Big Data with traditional outputs.

4. Provide answers to the question of whether microdata have to be used when combining Big Data estimates with traditional outputs, or whether data at an aggregated level can be considered.

- Analysis of advantages and disadvantages of combining various datasets;

- Preparing the list of criteria for combining data.

Description Date

Finalizing pilots from SGA-1 – extending the scope of data sources 14 April 2017

Identification of all potential data sources to combine within domain 21 April 2017

Applying pilot survey to combine data (2-3 data sources) May - August 2017

Writing conclusions 30 May 2017

Meeting in Warsaw – ideas of inter-domain data combination June 2017

Task 4. Summary plus future perspectives (phase 5)

Suggest pilots and domains with successful implementation potential for further elaboration in the second wave of pilots in 2018.

- Recommendation on legal aspects;

- Recommendation on availability and sustainability;

- Recommendation on methodology;

- Recommendation on quality;

- Recommendation on technical requirements.

Description Date

Identification of all potential data sources to combine inter-domain 31 July 2017

Applying pilots for inter-domain data combination 30 September 2017

Preparing the methodology for future perspectives 30 November 2017

Writing conclusions for the milestone 28 February 2018

Joint meeting WP6 and WP7 – sharing ideas Q1 2018

Writing conclusions for the report 30 April 2018

Population - Everyday citizen satisfaction.

Aim and description:

1.1. Responsibility: PL – coordinator, supported by UK, PT

1.2. Data sources: Social media/Blogs/Internet portals

1.3. Methodology: Webscraping, Data/Text/Web mining, Machine learning. As the data sources are selective, i.e. only cover units that put text on social media and the internet, the methodology will aim at yielding valid information for the population as a whole. Use will be made of methods described in the literature (such as research on the use of public social media messages done in the Netherlands).

1.4. The goal of the case study:
- to examine the level of daily satisfaction of the population by analyzing the content of messages for the presence of defined expressions describing emotional states, e.g., happiness, joy, sadness, fear, anger;
- to present the moods of the population associated with various public events;
- to observe morbidity areas, e.g., flu.
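The message analysis described in 1.4 can be illustrated with a minimal lexicon-based sketch. The emotion lexicon, the sample messages and the function name `emotion_shares` are hypothetical and serve only as an illustration. An actual pilot would likely need a curated lexicon or a trained classifier, and would also have to address the selectivity of the sources noted in 1.3.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: each emotion maps to expressions that signal it.
EMOTION_LEXICON = {
    "happiness": {"happy", "glad", "great"},
    "sadness": {"sad", "unhappy", "depressed"},
    "fear": {"afraid", "scared", "worried"},
    "anger": {"angry", "furious", "annoyed"},
}


def emotion_shares(messages):
    """Return, per emotion, the share of messages containing at least one of its expressions."""
    counts = Counter()
    for text in messages:
        # Lowercase and split into word tokens so matching is case-insensitive.
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        for emotion, expressions in EMOTION_LEXICON.items():
            if tokens & expressions:
                counts[emotion] += 1
    total = len(messages)
    return {e: (counts[e] / total if total else 0.0) for e in EMOTION_LEXICON}


# Illustrative messages, not real data.
sample = [
    "So happy about the concert tonight!",
    "I am worried and a bit sad about the news.",
    "Traffic again, I am furious.",
    "Nothing special today.",
]
shares = emotion_shares(sample)
print(shares)
```

Counting messages rather than individual words keeps one long message from dominating the result. The shares describe only the people who post, not the population as a whole.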

1.5. Plan of Combining Datasets: combine the selected data from all Big Data sources in one repository; compare them with the results of social studies to add more detailed information; supplement the information gained in social studies.

1.6. Main benefits and value added for official statistics: support for the traditional European Social Survey; a supplement to the research methodology for phenomena that are difficult to measure through traditional polls.

Detailed work plan for SGA-2

Description Date

Finalizing pilots from SGA-1 – extending the scope of data sources 14th of April 2017

Identification of all potential data sources to combine within domain 21st of April 2017

Applying pilot survey to combine data (2-3 data sources) 12th of May 2017

Writing conclusions and report 30th of May 2017

Meeting in Warsaw – ideas of inter-domain data combination June 2017 (exact date in the next 2 weeks)

Identification of all potential data sources to combine inter-domain 31st of July 2017

Applying pilots for inter-domain data combination 30th of September 2017

Preparing the methodology for future perspectives 30th of November 2017

Writing conclusions for the milestone 28th of February 2018

Joint meeting WP6 and WP7 – sharing ideas Q1 2018

Writing conclusions for the report 30th of April 2018

AGRICULTURE – Estimation of Agricultural statistics – pilot case study on crop types based on satellite data.

Aim and description:

2.1. Responsibility: PL – coordinator, supported by IE.

2.2. Data sources: Satellite images, administrative data, in situ surveys.

2.3. Methodology:
- combining data – data fusion on radar and optical remote sensing data;
- data comparison with traditional surveys e.g. FSS;
- combining data – administrative data sources with satellite data.

2.4. The goal of the case study: Crop type: look at the types of crops being grown and see if we can tell this accurately from the imagery; analysis of possibilities of using satellite images.

2.5. Plan of Combining Datasets: Data fusion – combining data sources by spatial reference.
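Combining sources by spatial reference can be sketched as a join on a shared grid cell. Everything in the example is an assumption made for illustration: the 100 m grid, the projected coordinates in metres, the record layouts, the crop labels and the function names. A real pilot would more likely work with parcel geometries and a GIS library.

```python
from collections import Counter, defaultdict

CELL_SIZE = 100  # hypothetical grid resolution in metres


def cell_id(x, y, size=CELL_SIZE):
    """Return the index of the grid cell containing a projected coordinate."""
    return (int(x // size), int(y // size))


def majority_crop_by_cell(pixels):
    """Aggregate classified satellite pixels to the most frequent crop per grid cell."""
    votes = defaultdict(Counter)
    for x, y, crop in pixels:
        votes[cell_id(x, y)][crop] += 1
    return {cell: counter.most_common(1)[0][0] for cell, counter in votes.items()}


def link_parcels(parcels, pixels):
    """Join administrative parcels to satellite results via the shared grid cell."""
    satellite = majority_crop_by_cell(pixels)
    linked = []
    for parcel_id, x, y, declared in parcels:
        observed = satellite.get(cell_id(x, y))  # None when no imagery covers the cell
        linked.append((parcel_id, declared, observed))
    return linked


# Illustrative records, not real data.
pixels = [
    (120.0, 40.0, "wheat"),
    (150.0, 60.0, "wheat"),
    (180.0, 20.0, "maize"),
    (310.0, 250.0, "rapeseed"),
]
parcels = [
    ("P1", 130.0, 55.0, "wheat"),
    ("P2", 320.0, 240.0, "maize"),
    ("P3", 900.0, 900.0, "wheat"),
]
linked = link_parcels(parcels, pixels)
matched = [(declared, observed) for _, declared, observed in linked if observed is not None]
agreement = sum(declared == observed for declared, observed in matched) / len(matched)
print(linked, agreement)
```

The agreement rate between declared and observed crops is one simple way to compare the administrative source with the imagery. Parcels without coverage are kept with `None` so that gaps stay visible.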

2.6. Main benefits and value added for official statistics: increase in the quality of the agricultural surveys; decrease in respondent burden; more detailed data published by official statistics; potential decrease in the cost of conducting surveys.

Detailed work plan

Description Date

Finalizing pilots from SGA-1 – extending the scope of data sources 14th of April 2017

Identification of all potential data sources to combine within domain 21st of April 2017

Applying pilot survey to combine data (2-3 data sources) May - August 2017

Writing conclusions and report 30th of May 2017

Meeting in Warsaw – ideas of inter-domain data combination 12-13 June 2017

Identification of all potential data sources to combine inter-domain 25th of August 2017

Applying pilots for inter-domain data combination 30th of October 2017

Preparing the methodology for future perspectives 30th of November 2017

Writing conclusions for the milestone 28th of February 2018

Joint meeting WP6 and WP7 – sharing ideas Q1 2018

Writing conclusions for the report 30th of April 2018

TOURISM/BORDER CROSSING – Border movement.

Aim and description:

3.1. Responsibility: PL – coordinator, supported by NL and PT.

3.2. Data sources: Traffic sensors (data already acquired from Polish, Lithuanian, Slovak and German data owners), traditional surveys on tourism, flight statistics such as origin, destination and estimated number of passengers from the Civil Aviation Authority of the Republic of Poland, and webscraping. Depending on availability, Mobile Call Data will also be used, building on the results of WP 5.

3.3. Methodology:
- spatial-temporal models, spatial and graph interpolation methods;
- cross-entropy econometrics for combining data sets.

3.4. The goal of the case study: to estimate border traffic across the internal borders of the EU (Polish-German, Polish-Slovakian, Polish-Czech and Polish-Lithuanian borders), also with regard to mirror statistics. Partial estimation of domestic traffic may be an extra result. Selected data sources from national authorities show the scale of border movement that is regarded as tourism in terms of statistical surveys.

3.5. Plan of Combining Datasets:
- Unifying the structure of data sets;
- Collecting exogenous variables (road class, number of registered vehicles by type etc.);
- Preparing distance and graph matrices;
- Quantifying the reliability of each data source (expected standard error and possible bias);
- Combining traffic data from different sources with cross-entropy econometrics.

3.6. Main benefits and value added for official statistics: decreased burden on interviewers, more detailed results than from the survey alone, data consistent with mirror statistics, reduced time of data production.
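The role of the reliability measures in 3.5 (expected standard error and possible bias) can be illustrated with a small sketch. The work plan names cross-entropy econometrics as the combination method, and the example below does not implement it. It uses bias-corrected inverse-variance weighting as a simpler stand-in and assumes that the errors of the sources are independent. The function name and all figures are hypothetical.

```python
from math import sqrt


def combine_estimates(sources):
    """Combine estimates with inverse-variance weights after subtracting the assumed bias.

    Each source is a tuple (estimate, assumed_bias, standard_error).
    Returns the combined estimate and its standard error under independent errors.
    """
    weights = [1.0 / se ** 2 for _, _, se in sources]
    corrected = [estimate - bias for estimate, bias, _ in sources]
    total_weight = sum(weights)
    combined = sum(w * c for w, c in zip(weights, corrected)) / total_weight
    return combined, sqrt(1.0 / total_weight)


# Hypothetical daily vehicle counts for one border section, not real data.
sources = [
    (10500.0, 500.0, 200.0),  # traffic sensor: assumed to overcount by 500
    (9800.0, 0.0, 400.0),     # survey-based estimate: assumed unbiased but noisier
]
combined, standard_error = combine_estimates(sources)
print(round(combined, 1), round(standard_error, 1))
```

More reliable sources receive larger weights, so the combined standard error is smaller than that of the best single source. This is also why the reliability of each source has to be quantified before the data are combined.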

Detailed work plan

Description Date

Finalizing pilots from SGA-1 – extending the scope of data sources 7th April 2017

Identification of all potential data sources to combine within domain March 2017 – April 2017

Applying pilot survey to combine data (2-3 data sources) March 2017 – December 2017

Writing conclusions and report May - June 2017

Meeting in Warsaw – ideas of inter-domain data combination 12-13 June 2017

Identification of all potential data sources to combine inter-domain 31 July 2017

Applying pilots for inter-domain data combination 30 September 2017

Preparing the methodology for future perspectives 30 November 2017

Writing conclusions for the milestone 28 February 2018

Joint meeting WP6 and WP7 – sharing ideas Q1 2018

Writing conclusions for the report 30 April 2018
