Lantea platform

Introduction to Lantea .NET Open Source Big Data Solution


What is Lantea

• Open source big data platform

• Rich ETL (Extract-Transform-Load) features

• A platform that helps data scientists collect and process data easily

• Importing data from different sources is extremely easy

Highlighted features of Lantea

• Many different data sources on different media

• Query aggregated data via SQL

• Very easy to collect data from websites, local file systems, emails and databases

• Export data in many formats and through APIs
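
As a rough illustration of querying aggregated data via SQL, the sketch below loads a few collected records into an in-memory SQLite table and totals them per region. It is written in Python with the standard library purely for illustration. The table name, column names, and sample rows are hypothetical and do not reflect Lantea's actual schema or API.

```python
import sqlite3

# Hypothetical records collected from different sources (not real Lantea data).
SAMPLE_ROWS = [
    ("website", "north", 120.0),
    ("email", "north", 80.0),
    ("database", "south", 200.0),
    ("website", "south", 50.0),
]


def aggregate_by_region(rows):
    """Load rows into an in-memory table and total the amounts per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE collected (source TEXT, region TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO collected VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, COUNT(*), SUM(amount) "
            "FROM collected GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, count, total in aggregate_by_region(SAMPLE_ROWS):
        print(region, count, total)
```

The point of the sketch is only that, once data from websites, files, emails, and databases lands in one tabular store, a single GROUP BY query can summarize it regardless of where each row came from.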

Target Users of Lantea

• Data Scientists

• Marketing Analysts

• Managers who need BI

• Researchers

• Big data/BI Developers

• Deep Learning and Machine Learning Developers

Non-Commercial and Commercial target users (diagram): Researchers, Data Scientists, Big data/BI Developers, Marketing Analysts, Open source developers, Managers who need BI

Essential Elements of a Big Data Platform

• Data/File Extraction

• Data Cleaning and Filtering

• Different Ways of Analyzing Data

• Real-time Processing

• Data Collection from Different Sources

• Connect to Different Database Types

• Analysis Result Rendering

• Advanced Parameter Adjustment
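
The first three elements above (extraction, cleaning and filtering, analysis) can be sketched as a tiny pipeline. This is a minimal Python illustration under stated assumptions, not Lantea code: the CSV layout, the `region` and `amount` fields, and the function names `extract`, `clean`, and `analyze` are all hypothetical.

```python
import csv
import io
from statistics import mean

# Hypothetical raw input with some dirty rows (missing and non-numeric amounts).
RAW_CSV = """region,amount
north, 120
south,
north,80
south,abc
south,200
"""


def extract(text):
    """Extraction: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning and filtering: drop rows whose amount is missing or not numeric."""
    cleaned = []
    for row in rows:
        raw_amount = (row.get("amount") or "").strip()
        try:
            amount = float(raw_amount)
        except ValueError:
            continue  # skip rows that cannot be interpreted as a number
        cleaned.append({"region": row["region"].strip(), "amount": amount})
    return cleaned


def analyze(rows):
    """Analysis: average amount per region."""
    by_region = {}
    for row in rows:
        by_region.setdefault(row["region"], []).append(row["amount"])
    return {region: mean(values) for region, values in by_region.items()}


if __name__ == "__main__":
    print(analyze(clean(extract(RAW_CSV))))
```

Keeping each stage as a separate function mirrors the list above: any stage can be swapped (for example, a different extractor per file type) without touching the others.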

Big Data (diagram): Extraction, Cleaning, Analysis, Data Processing, Data Collection, Parameter Adjustment

Introduction to Lantea: Architecture Design and Use Cases

Third-party Projects Included

• Toxy – Data Extraction framework

• Spidey – Web Spider framework

• EQueue – Queue Implementation

• CacheAdapter – Cache Provider

• Irony – Compiler Implementation

• ServiceStack.Redis – Redis Client

• ScrapySharp – HTML Parser and Selector

• Autofac – IoC Container

• log4net – Configurable Logging System

• DataTables – Web Spreadsheet

• Thinktecture IdentityServer – Social Account Integration

• Nepy – Parsers for Natural Language Processing

License Candidates

• LGPL

• Apache 2.0

• MIT

• Custom Open Source license

Architecture Design v1

Key Features

• Web Crawling Service

• Data Extraction Service

• Queue Service

• CQLR (Common Query Language Runtime)

• Rich Output Formats and APIs

• RESTful and OData support
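
One way the crawling, queue, and extraction services listed above might fit together is sketched below. This is a hedged Python illustration using only the standard library. It does not use Spidey, EQueue, Toxy, or ScrapySharp, and the names `FAKE_PAGES`, `TitleParser`, `crawl`, `extract_all`, and `run_pipeline` are hypothetical. Pages are hard-coded so the sketch runs without network access.

```python
from html.parser import HTMLParser
from queue import Queue

# Hypothetical stand-in for fetched pages, so the sketch runs without a network.
FAKE_PAGES = {
    "http://example.test/a": "<html><head><title>Page A</title></head><body>A</body></html>",
    "http://example.test/b": "<html><head><title>Page B</title></head><body>B</body></html>",
}


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(urls, work_queue):
    """Crawling step: 'fetch' each page and enqueue it for extraction."""
    for url in urls:
        work_queue.put((url, FAKE_PAGES[url]))


def extract_all(work_queue):
    """Extraction step: drain the queue and pull a title from each page."""
    results = []
    while not work_queue.empty():
        url, html = work_queue.get()
        parser = TitleParser()
        parser.feed(html)
        results.append({"url": url, "title": parser.title.strip()})
    return results


def run_pipeline(urls):
    """Wire the two steps together through a single in-process queue."""
    work_queue = Queue()
    crawl(urls, work_queue)
    return extract_all(work_queue)


if __name__ == "__main__":
    for record in run_pipeline(sorted(FAKE_PAGES)):
        print(record)
```

The queue between the two steps is what lets crawling and extraction run at different speeds; in a real deployment that role would presumably be played by a distributed queue rather than an in-process one.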

Schedule for Lantea

Use Case 1 – Regional Manager Report Collection

Use Case 1 – Lantea Solution

Use Case 2 – Data Aggregation from Websites

Use Case 2 – Lantea Solution

– the Studio behind Lantea

Our Mission

• Re-create the .NET ecosystem

• Provide .NET-based solutions for clients

• Create things that do not yet exist for the .NET community

• Contribute to the global open source community

• Change the way people live