Baobab, Inc. · Baobab conducts interviews with clients to understand the type of model they have...
Baobab, Inc. https://baobab-trees.com/en/
Quality, speed, cost effectiveness. We were born in Japan.
Our Services
Clean, non-contaminated data collection
• We collect data on demand through our Baopart network (a network of trained workers) in Japan, South Asia, the USA and Europe.
High-quality manual annotation and translation
• We continuously improve annotation/translation quality based on feedback from clients and Baoparts.
• Independent reviewers check all annotations/translations.
Speedy turnaround with outstanding output
• We optimize workflows using our proprietary tools.
Our In-House Tools
Tools are used by Baoparts to provide quality output
Moringa-i
• Image collection (JPEG format; image collection available for most countries)
• Adding tags/captions to collected images
• Adding tags/captions to images submitted by clients
• App is available in Japanese, English and Chinese (simplified and traditional) *Tags in English only
• Collected image data in JPEG format
• Tag/caption data delivered in CSV format
Mobile app that allows Baoparts to capture, tag and caption images on the go
Screenshot of project list in the app
Moringa-v
• Sound collection (WAV format; sound collection available for most countries)
• Adding tags to collected sounds
• Adding tags/captions to images submitted by clients
• App is available in various languages, such as Japanese, English and Chinese *Tags are in English only
• Sound data is in WAV format
• Tag data is delivered in CSV format
Mobile app to collect and tag sounds on the go
Screenshot of project list in the app
Image and Video Captioning
• Up to 8 Baoparts may simultaneously work on the tool to caption a photo or video, resulting in up to 8 captions per photo or video.
• Can create captions in multiple languages, such as English and Chinese, in addition to Japanese.
• Data is delivered in CSV format.
Web-based image and video captioning tool
Pose Annotation
Web-based tool to bound objects (box or polygon), add keypoints and tag them.
Pose Annotation work screen
An example of an object in an image being marked up, enclosed in a rectangle region and transcribed.
Pose Annotation Examples
Keypoints + Segmentation
Keypoints + Bounding Boxes
Keypoints only
Bin Box Annotation
Web-based tool to enclose objects of interest and annotate picking points
Bin Box Annotation Tool Work Screen
Semantic Segmentation
Web-based tool to segment objects and classify them into client-defined categories
We have experience annotating a wide variety of objects:
- Traffic (road, sign, vehicle, human, ...)
- Vehicle parts (front door, rear door, front bumper, ...)
- Meteorological objects (cloud, sky, …)
- Terrestrial objects (mountains, rivers, sea, ...)
Client Support and Quality Control
For Better Quality Data
Factors to consider when building learning data
1 The data must be in line with the model
2 The type of learning data to be constructed must be clarified
Both are important for constructing highly accurate data.
Baobab conducts interviews with clients to understand the type of model they have in mind, and assists in deciding the data specifications.
Work Management
Annotation work is done by teams residing in Japan, Vietnam, Thailand, China, Indonesia, Macedonia, and English-speaking countries.
Each team is led by a Baopart Leader, who manages and trains Baoparts.
These teams are supervised by a Baopart Captain, who manages the entire project.
The Baopart Captain coordinates with Baopart Leaders for smooth execution of the project.
The Baopart Captain manages progress and Baopart training, may also check Baoparts’ work directly, and fields questions regarding the project.
Baoparts in 15+ countries
900+ registered Baoparts
Work Management
Our Workflow
BAOPART CAPTAIN: Sets Guidelines, Manages Progress, Trains Baoparts, Checks Data
BAOPART LEADER (one per team): Translates Guidelines, Trains Baoparts, Checks Data
BAOPART (several per Leader)
Baoparts achieve overwhelming efficiency compared to our competitors.
 | Competitor C (tag) | Baobab (tag & annotation) | Baobab (tag)
Number of images | 9,500 | 75,657 | 6,000
Number of tasks | 14,000 tags | 514,468 tags & annotations | 39,219 tags
Work period | 150 days | 40 days | 3 days
Speed | 93 tags/day | 12,862 tags & annotations/day | 13,073 tags/day
Number of workers | 730 | 45 | 3
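The speed figures in the comparison above appear to be the number of tasks divided by the work period in days, rounded to a whole number. The following is a minimal sketch that reproduces that arithmetic from the figures on the slide. The function names are hypothetical, and the per-worker figure assumes that every listed worker was active for the whole work period, which the source does not state.

```python
# Figures copied from the comparison above; the dictionary layout and
# function names are illustrative, not part of any Baobab tool.
projects = {
    "Competitor C (tag)": {"tasks": 14_000, "days": 150, "workers": 730},
    "Baobab (tag & annotation)": {"tasks": 514_468, "days": 40, "workers": 45},
    "Baobab (tag)": {"tasks": 39_219, "days": 3, "workers": 3},
}


def tasks_per_day(tasks: int, days: int) -> int:
    """Throughput for the whole team, rounded to the nearest whole task."""
    return round(tasks / days)


def tasks_per_worker_day(tasks: int, days: int, workers: int) -> float:
    """Throughput per worker per day.

    Assumption: all workers were active on every day of the work period.
    """
    return tasks / (days * workers)


for name, p in projects.items():
    daily = tasks_per_day(p["tasks"], p["days"])
    per_worker = tasks_per_worker_day(p["tasks"], p["days"], p["workers"])
    print(f"{name}: {daily:,} tasks/day, {per_worker:.2f} tasks/worker/day")
```

The per-worker number is only a rough normalization: the three projects involve different kinds of tasks, so it should be read as an indication rather than a like-for-like benchmark.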
Baopart Work Process
Track Record - Our Image Annotation Work
Work | Tool | Work load | Work duration (excluding guideline prep and double-checking period)
People and Vehicle with Bounding Box + Labelling | Baobab Pose Annotation Tool | I: 80,000; B+L: 513,191 | 35 days
Street signs with Bounding Box | Baobab Pose Annotation Tool | I: 9,908; B: 12,681 | 3 days
Automobile Parts with Image Collection | Moringa-i | I: 1,000 | 24 days
Food Items with Labelling | Excel | I: 1,322 | 5 days
Monkeys with Bounding Box (Polyline) + Keypoint | Baobab Pose Annotation Tool | B+K: 10,000 | 9 days
Video (6 sec) with Captioning | supplied by client | 400,000 | 120 days
☆ Duration is subject to work difficulty (precision of target, visibility of image, amount of information regarding rules).
I: Image B: Bounding Box L: Label K: Keypoint
Text and NLP Related Data
Multilingual Shopping Scenario Creation | Creation of conversations in a shopping situation between a salesperson and tourists from abroad, and verification testing of the scenarios | 120 scenes
Emotion Annotation | Emotion annotation on sentences and paragraphs | 6,000 annotations
Captioning | Japanese captioning on 6-second videos; the largest such library in the world, exceeding Microsoft’s 260,000 captions, and the first in the Japanese language | 400,000 captions
Speech Data Annotation | Annotation on speech retrieval intention data | 4,750 annotations
Casual Conversation Data Set | Sales of domestic life conversations data set | 5,000 sentences
Tweet Labelling | Labelling Japanese tweets | 10,000 tweets
Dialogue Data Creation and Annotation | Creating dialogue scenarios as data, then annotating and labelling them | 100,000 sentences
Dialogue Scenario Creation | Japanese-English dialogue scenario creation for machine translation | 30,000 sentences
We also compile large-scale corpora for machine translation for NTT, NICT, the University of Tokyo and many other clients.
Founded in 2010, we created mass translation data for machine translation engines and carried out machine translation engine evaluations. We now provide many text annotations and multilingual scenario creations for our clients.
Baobab’s Strength (1)
1 Thorough Client Communication and Need Assessment
▷ We communicate with you to assess precise needs.
・ We interview clients directly to assess needs. This is indispensable for high-precision data creation.
・ This allows us to create tailored specifications for learning data and to compile work rules.
・ We are flexible when requests change. We update guidelines and rules as needed during work.
2 High Quality Workforce - Baoparts
▷ Baoparts understand task specifications.
・ Baoparts must read and understand the rules before starting work.
・ We run trials to assess Baoparts’ understanding, and give feedback to home in on the client’s needs.
・ They are trained experts in annotation work.
▷ Baoparts receive instant help from Captains.
・ A Baopart Captain is on hand to liaise between clients and Baoparts and to answer any questions.
・ Work-related Q&As are shared with all Baoparts to ensure consistent quality.
▷ Baoparts are global: Japan, Vietnam, Thailand, China, Taiwan, US, UK and other countries.
▷ We review all data before delivery.
・ Baopart Leaders and Baopart Captains are responsible for this review.
Baobab’s Strength (2)
3 Speed, Accuracy and Volume Output Leading to Cost Effectiveness
▷ We manage the entire process, allowing for efficient work and volume output.
・ Our clients can focus on what they need. We take care of how they get it.
▷ We manage Baoparts, rather than just assigning tasks and hoping it will work.
・ We carefully control the process and give Baoparts incentives and rewards.
・ We believe high motivation leads to high quality and lower cost.
4 Investment in In-House Tools
▷ We listen to you and Baoparts.
・ We offer tools that match your desired data compilation.
・ We use feedback from Baoparts to improve the in-house tools continuously.
5 Accept Small Volume Orders, Client Tailoring
▷ We cater to requests such as testing with a small amount of data before creating large-scale data.
Diversity and Inclusion in Baobab
Baopart Diversity and Inclusion
Disability Employment Support Organization in Aomori, Japan
“The atmosphere changed drastically since we started annotation work. The workplace is much more positive and cheerful.” - Testimonial from instructors
Our Clients
"I have asked Baobab to create data for my research many times, and I really appreciate their willingness and flexibility in responding even to slightly unusual requests. I thoroughly recommend them."
Graham Neubig, Assistant Professor, Language Technology Institute, Carnegie Mellon University
What our client is saying
Our Clients
• Toyota Motor Corporation
• Carnegie Mellon University
• Panasonic System Networks Co., Ltd.
• Hitachi Ltd.
• Preferred Networks, Inc.
• The University of Tokyo
• NTT Communication Science Laboratories
• LIXIL
• National Institute of Information and Communications Technology (NICT)
• Yahoo Japan Corporation
• Chiba Institute of Technology
… and many more!
Miori Sagara, President and CEO, Baobab, Inc.
Ms. Sagara founded Baobab in 2010. She developed a unique cloud translation platform utilizing machine translation to support translation work.
Ms. Sagara was affiliated with the National Institute of Information and Communications Technology (NICT) from 2011. This institute is the largest research center in Japan that focuses on Universal Communication, including machine translation research. At NICT, Ms. Sagara led a team to develop “Koetora”, an application that supports communication between the hearing impaired and the able-bodied. The app uses voice recognition / speech synthesis technologies. She also worked on implementing research findings into corporations such as DoCoMo and Narita Airport. During her tenure with NICT, she was awarded the Nagao Prize in 2013 by the Asia-Pacific Association for Machine Translation (AAMT), and received a “performance excellence award as an individual” in 2012 from NICT.
Active in the communications industry, Ms. Sagara is a delegate of the Association for Natural Language Processing, a member of the Global Communications Development Promotion Council (part of NICT), a past committee member for Multi-lingual Voice Translation Technology Research, Development and Implementation (part of the Ministry of Internal Affairs and Communications) and a past working group member of Next Generation Artificial Intelligence implementation (part of the Ministry of Internal Affairs and Communications).
With our extensive experience, Baobab has the ability to tailor to your specific needs. We are accommodating to unconventional requests and are flexible in meeting the output you require.
We are happy to start a dialogue with you to assess your needs.
Please contact us at [email protected]!