Mapping Commodity Trading in the 19th Century
Benjamin Bach, INRIA, Paris
Asma Malik, University of Strathclyde, Glasgow
Michael Mauderer, University of St Andrews
Sadiq Sani, Robert Gordon University, Aberdeen
Joe Wandy, University of Glasgow
Outline
● Project Overview
● Data
● Technology
● Demo
● Future Work
Overview
19th Century
Commodities Diseases
Locations Disasters
Process
Tasks
● Retrieve documents mentioning
  ○ Commodities
  ○ Locations
  ○ Time range
● Relations between retrieved terms
  ○ Spatial relations
  ○ Temporal relations
  ○ Co-occurrence relations
Users: Historians
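The retrieval task above can be sketched as a single SQL query. This is only an illustration: the two-table schema (`documents`, `mentions`), the column names, the function names and the toy rows are all assumptions, not the project's real database, and an in-memory SQLite database stands in for the PostgreSQL server so the example runs on its own.

```python
import sqlite3


def find_documents(conn, commodity, location, year_from, year_to):
    """Return ids of documents in the year range that mention both terms."""
    sql = """
        SELECT d.id
        FROM documents d
        JOIN mentions c
          ON c.doc_id = d.id AND c.kind = 'commodity' AND c.term = ?
        JOIN mentions l
          ON l.doc_id = d.id AND l.kind = 'location' AND l.term = ?
        WHERE d.year BETWEEN ? AND ?
        GROUP BY d.id
        ORDER BY d.id
    """
    params = (commodity, location, year_from, year_to)
    return [row[0] for row in conn.execute(sql, params)]


def build_demo_db():
    """Create a tiny in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE mentions (doc_id INTEGER, term TEXT, kind TEXT);
    """)
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?)",
        [(1, 1850), (2, 1875), (3, 1920)],
    )
    conn.executemany(
        "INSERT INTO mentions VALUES (?, ?, ?)",
        [
            (1, "sugar", "commodity"), (1, "Glasgow", "location"),
            (2, "sugar", "commodity"), (2, "Jamaica", "location"),
            (3, "sugar", "commodity"), (3, "Glasgow", "location"),
        ],
    )
    return conn
```

Calling `find_documents(build_demo_db(), "sugar", "Glasgow", 1800, 1899)` selects only the toy document from 1850. Widening the range to 1950 also picks up the one from 1920.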
Data
● Commodities: 1067
● Time: 1600 - 1952 (452 years)
● Documents: 18 580
● Location occurrences: 91 650 469
● Commodity occurrences: 29 020 013
The Data
● PostgreSQL Database in Edinburgh
  ○ Not accessible
● PostgreSQL Database in St Andrews
  ○ Low performance
● PostgreSQL Database Backup
  ○ 2.5GB compressed binary data
  ○ Cannot be imported into Amazon RDS
Solution 1
● Create a more compatible SQL export to import into Amazon RDS
  ○ 24GB raw text file containing SQL statements
  ○ Still incompatible
  ○ Hard to correct errors in a timely manner
Solution 2
● Create EC2 instance running a PostgreSQL database
  ○ Powerful enough
  ○ Enough storage
  ○ Accessible
Big Data Problems
● Simple things take a long time
● Incremental finding of errors/new problems
The Pipeline
● D3 for client-side presentation
● Java+SQL for server-side processing
[Diagram: Client, Web Service, Database. The request is labeled "Commodities, date range"; the response is labeled "data".]
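The request/response step of the pipeline might look roughly like the sketch below. The slides name Java+SQL on the server; Python is used here only to keep the example short. The function name `handle_query`, the `run_query` callback and the JSON field names (`commodities`, `from`, `to`, `rows`) are all hypothetical and are not taken from the project.

```python
import json


def handle_query(request_body, run_query):
    """Parse a client request, run the database query, return a JSON string.

    request_body: JSON text with "commodities" (list), "from" and "to" (years).
    run_query: callable(commodities, year_from, year_to) -> list of rows;
               a stand-in for the server-side SQL layer (assumption).
    """
    request = json.loads(request_body)
    commodities = request["commodities"]
    year_from, year_to = request["from"], request["to"]

    # Reject an inverted date range before touching the database.
    if year_from > year_to:
        return json.dumps({"error": "'from' must not be later than 'to'"})

    rows = run_query(commodities, year_from, year_to)
    return json.dumps({"commodities": commodities, "rows": rows})
```

Passing the database access in as a callback keeps the handler testable without a running PostgreSQL instance. A client such as a D3 page would then only need to parse the returned JSON.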
Initial Sketches
Visualization
- Space and time -> Finding related terms + documents
- Find related documents
- What are documents talking about
- Implicit knowledge:
  - Co-occurrences of terms in documents
For every commodity:
1) Get top 10 documents
2) Limit related terms to 6
3) Sum up co-occurrences
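One possible reading of the three steps above is sketched below. The slides do not say how documents are ranked or how a co-occurrence is counted, so this sketch assumes that documents are ranked by how often they mention the commodity and that a term's co-occurrence score is its total count across the selected documents. The function name `related_terms` and the input shape (document id mapped to term counts) are hypothetical.

```python
from collections import Counter


def related_terms(commodity, doc_term_counts, top_docs=10, top_terms=6):
    """Return up to top_terms (term, score) pairs related to a commodity.

    doc_term_counts: dict mapping a document id to a dict of term -> count.
    """
    # 1) Top documents: those mentioning the commodity most often
    #    (ties broken by document id so the result is deterministic).
    mentioning = [d for d, counts in doc_term_counts.items()
                  if counts.get(commodity, 0) > 0]
    ranked = sorted(mentioning,
                    key=lambda d: (-doc_term_counts[d][commodity], d))[:top_docs]

    # 3) Sum the counts of every other term across the selected documents.
    totals = Counter()
    for doc_id in ranked:
        for term, count in doc_term_counts[doc_id].items():
            if term != commodity:
                totals[term] += count

    # 2) Keep only the strongest related terms.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:top_terms]
```

The two limits bound the result to at most 6 related terms per commodity, drawn from at most 10 documents, which keeps a co-occurrence matrix small enough to draw.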
Demo
Future work
- Query by location
- Time diagrams for term frequency over time
- Encode information in matrix cells (#doc, collection, ...)
- Show and browse documents
- Handle big data: diseases, disasters, ...
- Co-occurrences?
Thank you for listening!