Indonesia Open Data Initiative - Kofera Technology


Opportunity

• The Indonesian public's enthusiasm for technology research is on the rise.

• There are many researchers in Indonesia, especially in the field of technology.

• Indonesia has a wide variety of cultures and languages that can be used as research material.

Background: Mars Rover by Wiki Images

Problems

● Completed research is used only within a limited circle, such as companies and institutions.

● Results are published only as papers, which have little direct use in everyday life.

● Data and algorithms are not open to the public, so other researchers cannot continue previous studies.

● No data standardization for machine learning and artificial intelligence research.

● Lack of a good national publication center for data science.

● Lack of financial support for applied research that has been done.

Goals & Objectives

Goals

1. Build a research environment and culture in Indonesia, especially in the field of technology

2. Build standardization of research data

3. Build Indonesia’s center of data science & publication

Objectives

1. Build Indonesia’s center of open data that individuals, communities, institutions and companies can use to get and share research data

2. Build training data standardization

3. Open-source access to data and modules/algorithms

4. Community and study groups for researchers

Resources

➔ Researchers

➔ Data

➔ Algorithms/codes/models

➔ Community and study groups

➔ Financial support

Contributor Role

Institution

Corporate

Government

Community

Data provider, Reviewer

Data provider, Consumer

Donator, Researcher

Researcher, Consumer, Standardization

Legalizator, Consumer

Protector, Donator, Data provider

Consumer, Data provider, Researcher

Lifecycle

[Diagram labels: Institution, Corporate, Government, Community; Data, Algorithm, Library, Model; Standardization; Pre-Publication; Review Forum; Publication; Consumption by Institutions]

Activity Plan

❏ Data acquisition (crowdsourcing)

❏ Data standardization (crowdsourcing)

❏ Continuous model research & development

❏ Continuous model standardization

❏ Weekly or monthly meetup

❏ Forum discussion

❏ Collaborative research & papers

❏ Events: Industry gathering, new research presentation, etc.

Data Acquisition

TEXT

1. WordNet Bahasa Indonesia
2. Corpus Bahasa Indonesia
3. Stopword Bahasa Indonesia
4. Translation data (Bahasa - other languages)
5. Annotation data for Part of Speech (POS) Tagging
6. Sentiment Analysis data
7. Question-Answering (QA) data for some domains (medical domain, etc.)
8. Etc.

SOUND

1. Text to Speech data (Bahasa Indonesia)
2. Speech translation data (Bahasa - other languages)
3. Music data (traditional & modern)
4. Traditional musical instrument sound data
5. Tone emotion recognition data
6. Audio classification data
7. Etc.

IMAGES

1. Image-to-text database with labels (concept-based image processing)
2. Indonesian cultural image data (such as batik, traditional dress, etc.)
3. Indonesian herbal plant image data
4. Handwriting image data in traditional languages
5. Indonesia spatial data
6. Biomedical data
7. Etc.
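As a rough illustration of how data standardization might apply to crowdsourced contributions in these categories, here is a minimal sketch of a shared metadata record with a basic validity check. The deck does not define any schema; the `DatasetRecord` fields, the `validate` helper and the allowed values are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical modality values, mirroring the TEXT / SOUND / IMAGES headings.
ALLOWED_MODALITIES = {"text", "sound", "images"}


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one contributed dataset."""
    name: str
    modality: str          # expected to be one of ALLOWED_MODALITIES
    task: str              # e.g. "sentiment-analysis", "pos-tagging"
    language: str          # e.g. "id" for Bahasa Indonesia
    license: str           # e.g. "CC-BY-4.0"
    contributor_role: str  # e.g. "institution", "corporate", "government", "community"


def validate(record: DatasetRecord) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if record.modality not in ALLOWED_MODALITIES:
        problems.append(f"unknown modality: {record.modality}")
    # Required free-text fields must not be empty or whitespace only.
    for field_name in ("name", "task", "language", "license"):
        if not getattr(record, field_name).strip():
            problems.append(f"missing {field_name}")
    return problems


if __name__ == "__main__":
    record = DatasetRecord(
        name="stopwords-id",
        modality="text",
        task="stopword-list",
        language="id",
        license="CC-BY-4.0",
        contributor_role="community",
    )
    print(validate(record))
```

A check like this could run when a contributor submits data, so that reviewers only see records that already carry the agreed fields.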

Thank You

KOFERA RESEARCH & DATA SCIENCE TEAM

https://kofera.com