Emerging Technology Or My view from the tea-leaves Leif Hanlen Technology Director, NICTA.
Emerging Technology
Or
My view from the tea-leaves
Leif Hanlen, Technology Director, NICTA
Conclusions
• Information is the new currency
  – Open data, sharing, re-use.
  – Documents are a symptom of data transience, not a solution.
• Cloud = Storage + Smarts + No cap-ex(*)
• New tech driven by new thinking: watch the next generation
  – Wearable computing: your phone is more powerful than Apollo 11.
  – Social: more than friends
  – Analytics: power, unleashed.
Heads up.
• Reliable (digital) storage is cheap.
Backblaze, 135Tb storage $7,384 c/o backblaze.com
Washington Archives blog: archives rescues damaged docs [2013]
The answer is not “we use water-proof cardboard boxes”
What is your value add?
Are you open?
https://18f.gsa.gov/2015/03/19/how-we-built-analytics-usa-gov/
Who is NICTA?
• Australia’s ICT research center of excellence
  – “High impact research excellence”
• National benefit and wealth creation
  – 700 people (300+ research staff)
  – 5 laboratories, 22 partner Australian universities
• Over one-quarter of all the PhD graduates in ICT in Australia are supervised at NICTA.
We’ve launched 15 startups in the last 10 years.
Apollo guidance computer 1966-75
• 2k Memory
• 2MHz x8 processor; 16bit
• 61 x 32 cm
• 32 kilograms
iPhone 6 2014
• 3x4M Memory
• 3x 1.4GHz processor; 64bit
• Wireless internet (300Mb/s)
• 13.1 x 6.7 cm
• 139 grams
40 years
wikipedia
Apple
Gartner hype cycle: words of warning
What future technology?
National Archives Statement:
Stores, protects and makes accessible records of enduring significance which are identified as archival resources of the Commonwealth and selected for retention as 'national archives'.
http://naa.gov.au/records-management/strategic-information/responsibilities/index.aspx
New types of “record”
New types of “access”
New types of “storage”
http://printingtheinternet.tumblr.com/
Meet the new content creator.
What do we know?
• Likely to work for 15+ different organisations in 3+ sectors.
• Highly educated (post-graduate)
• Pre-supposes mobile collaborative technology
  – Has multiple “identities” online
  – Expects preferences to be acknowledged and acted upon
• Values individual capacity, demands support for community
  – Vs. support by community.
Not “Big” Data: Too much Data
Crane and Raymond, The Permanente Journal, Winter 2003, Volume 7 No. 1, Kaiser Permanente Institute for Health Policy
“Current medical practice relies heavily on the unaided mind to recall a great amount of detailed knowledge – a process which, to the detriment of all stakeholders, has repeatedly been shown unreliable”
All data is “big” when your ability to process it is “small”
How much information?
• 2.7 zettabytes of data exist in the digital universe today (1 zettabyte = 1 billion terabytes).
  – 235 terabytes of data collected by the U.S. Library of Congress in April ’11
  – US government is investing $200 million in big data research projects.
• By 2020, there will be 6 Tb of data for every person on the planet
1 Giga → x1000 → 1 Tera → x1000 → 1 Peta → x1000 → 1 Exa → x1000 → 1 Zetta
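The x1000 ladder can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `convert` and the prefix list are assumptions made here, not part of the talk, and it uses decimal (SI) prefixes where each step is a factor of 1000.

```python
# Decimal (SI) prefixes in ascending order; each step is a factor of 1000.
PREFIXES = ["giga", "tera", "peta", "exa", "zetta"]


def convert(value, src, dst):
    """Convert `value` from prefix `src` to prefix `dst` (hypothetical helper)."""
    steps = PREFIXES.index(src) - PREFIXES.index(dst)
    return value * 1000 ** steps


# One zettabyte expressed in terabytes: three steps down the ladder.
tb_per_zb = convert(1, "zetta", "tera")

# The 2.7 zettabytes quoted on the slide, expressed in terabytes.
digital_universe_tb = convert(2.7, "zetta", "tera")

print(f"1 ZB = {tb_per_zb:,} TB")
print(f"2.7 ZB is about {digital_universe_tb:,.0f} TB")
```

Three steps of x1000 separate tera from zetta, so one zettabyte is a billion terabytes and 2.7 zettabytes is roughly 2.7 billion terabytes.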
How fast do we create it?
• “Every 2 days we create as much information as was created from the dawn of civilisation up to 2003.” [E. Schmidt]
  – YouTube users upload 48 hours of new video every minute of the day.
  – 571 new websites are created every minute of the day.
  – 100 terabytes of data uploaded daily to Facebook.
  – Twitter sees 175 million tweets every day, and has more than 465 million accounts.
• Data production will be 44 times greater in 2020 than it was in 2009.
http://wikibon.org/blog/big-data-statistics/
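To get a feel for the "44 times greater" figure, it can be turned into a yearly growth rate. This is a rough sketch that assumes growth is constant from year to year, which the slide does not claim; the function name is made up for illustration.

```python
def implied_annual_growth(total_factor, years):
    """Constant yearly multiplier that compounds to `total_factor` over `years`."""
    return total_factor ** (1 / years)


# 44-fold growth between 2009 and 2020 (figures from the slide).
rate = implied_annual_growth(44, 2020 - 2009)
print(f"Implied growth: about {100 * (rate - 1):.0f}% per year")
```

Under that assumption, a 44-fold increase over 11 years corresponds to data production growing by roughly 41% every year.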
Trends in Information
• Moore’s law
  – Cost of storage halves every 18-24 months
  – Cost of processing halves every 18-24 months
  – Cloud reduces Cap-Ex. (and maybe Op-Ex too!)
    • AWS 100Tb storage is approx. $700k vs $1m for internal. (Src: AWS)
• Personal media revolution
  – Consumers create and share vast amounts of data.
  – The data is created, stored & processed both locally and remotely.
• Knowledge & collaboration through re-use
  – Open data and open-source code allow new analytics services.
  – Think syndication.
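The "halves every 18-24 months" rule of thumb for storage and processing cost is easy to project forward. The sketch below is illustrative only: the starting cost of 1000 is a hypothetical figure, and the function name is an assumption, not something from the talk.

```python
def projected_cost(initial_cost, years, halving_months):
    """Cost after `years` if it halves every `halving_months` months."""
    halvings = (years * 12) / halving_months
    return initial_cost * 0.5 ** halvings


# Hypothetical starting cost of 1000 (any currency) for a fixed capacity.
for months in (18, 24):
    cost = projected_cost(1000, years=6, halving_months=months)
    print(f"halving every {months} months: {cost:.2f} after 6 years")
```

Over six years the two ends of the 18-24 month range differ by a factor of two (four halvings versus three), which is why the exact halving period matters for long-term storage planning.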
Enablers (phones) vs services (apps)
0 to 5 billion phones in 26 years
0 to 25 billion downloads in 26 months
wikipedia
Wikipedia on iPhone. And useable
• Virtual online experience
  – Users expect “look-and-feel” of connectivity
  – Even when it’s not there
More information: google; NICTA mContext
Wearable computing
10~100Mbps
c/ Smart Project Topics
‘Wearable’ data: “Gigabit person days”
Genomics: 10 Gb per person, 1% population
Implants: 1 Gb per person per hour, 0.001% population
Clinical notes analysis: 2 Mb per person per day, 45% population
Wearables: 1 Mb per person per day, 25% population
Proteomics: 10 Tb per person, 0.01% population
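The per-person figures above only become meaningful once multiplied by the share of the population they apply to. The following sketch does that for the three recurring sources. It assumes a hypothetical population of one million, reads "Gb" and "Mb" as gigabits and megabits (following the slide title), and leaves out the one-off genomics and proteomics volumes; all names are illustrative.

```python
from fractions import Fraction

MEGABITS_PER_GIGABIT = 1000  # decimal prefixes

# Recurring sources from the slide: (megabits per person per day, population share).
RECURRING_SOURCES = {
    "implants": (1 * MEGABITS_PER_GIGABIT * 24, Fraction(1, 100_000)),  # 1 Gb/hour, 0.001%
    "clinical notes": (2, Fraction(45, 100)),                           # 2 Mb/day, 45%
    "wearables": (1, Fraction(25, 100)),                                # 1 Mb/day, 25%
}


def daily_gigabits(population):
    """Aggregate gigabits generated per day by each recurring source."""
    totals = {}
    for name, (megabits_per_day, share) in RECURRING_SOURCES.items():
        people = population * share
        totals[name] = people * megabits_per_day / MEGABITS_PER_GIGABIT
    return totals


# Hypothetical population of one million people.
for name, gigabits in daily_gigabits(1_000_000).items():
    print(f"{name}: {float(gigabits):,.0f} Gb per day")
```

Under these assumptions, clinical notes produce the largest daily volume even though each person generates very little, because the population share is so large.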
Aggregate first, ask questions later
Came with install:
Beyond friends: social networking grows up
• Wisdom of crowds
  – Moving beyond “secondary broadcast” (e.g. blog)
  – Crowd-source: primary source of information (vs sharing mechanism)
  – Social as a search (recommendation)
  – Social as a filter for source docs
US National Archives crowd-sourced scanning ’10
Library of Congress uses CS-error checks
OpinionWatch: summarising the crowd
Netflix
Netflix looks at 30 million “plays” a day, including when you pause, rewind and fast forward, four million ratings by Netflix subscribers, three million searches as well as the time of day when shows are watched and on what devices. (NY Times, 2013)
Netflix: the “distributor” or perhaps the “archivist”
Recommendation engines: the new librarians?
Same tech used by
• Amazon (books)
• Netflix (video)
• LinkedIn (jobs + applicants)
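To make the "recommendation engine as librarian" idea concrete, here is a deliberately tiny sketch of one common approach: suggest items that other people with overlapping histories have used. It is not a description of how Amazon, Netflix or LinkedIn actually work; the data, names and scoring rule are all assumptions for illustration.

```python
from collections import Counter


def recommend(history, user, top_n=3):
    """Suggest unseen items, weighted by how much other users overlap with `user`."""
    seen = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user:
            continue
        overlap = len(seen & items)
        if overlap == 0:
            continue  # nothing in common, so this person's choices carry no weight
        for item in items - seen:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical borrowing histories: reader -> set of record identifiers.
history = {
    "ann": {"A", "B"},
    "bob": {"A", "B", "C"},
    "cat": {"B", "D"},
    "dan": {"E"},
}
print(recommend(history, "ann"))
```

The same shape applies to an archive: access logs stand in for "plays", and the output is a ranked list of records a reader has not yet seen.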
Netflix: roll-your-own
Web-trends
The world is here
Digital document archival is about here
REST, JSON
‘print-out’ of the internet conceivable*
?Archive?
Cloud: beyond filing cabinets in the sky
Storage as a service
Ubiquity as a service
Analytics as a service
Amazon: storage + analytics, pay-as-you-go
WebRTC: document & video sharing in browser
Officer End
Web-based video communication with document sharing that integrates with your workflow
• Three-way conferencing: introduce a supervisor, translator or sign language interpreter into a session.
• Secure communication: 100% end-to-end encryption. No specialised software or hardware.
• 100% cloud solution: utilising standard off-the-shelf equipment.
Beyond the Article Archive Concept
• Def. Article: “A piece of data that is presented in a static, two-dimensional form” Geoffrey Bilder, CrossRef, 2007
Toby Green, presentation to OECD
Archive-able digital objects.
What to archive?
supports a mashup
embedded in a report
A live data cube
Connected to live data
What to archive?
Live data-views
Many live data cubes
analytics services
Temporary data sources
aggregators
Policy reports
Summaries
Personal services: real-time policy from evidence
• Speech-to-Text: document generated from voice
  – With tagging automatically
  – And summarisation
• Full natural language processing
• Data mash-up
• Learned ontologies
  – find the missing data sets, and automatically glue them together
Customer churn has increased by 200% over the last decade
23.5% of the population switch service provider each year
Average cost to retain a customer is $500, about equal to the cost to acquire
This costs the services industry in Australia $3bn per year
Customers are more savvy and organisations lack analytics
Why is this happening?
75% of consumers want to customise the experience they purchase
Solution?
This costs the services industry in Australia $3bn per year
And impacts government!
Why I don’t use my library card*:
New types of Access
Building code (think future data objects)
• Old way
  – Modular code
  – Download archive
    • And modify
    • And build your new code
  – Publish your modules
  – One team one product
• Now
  – Code libraries
    • Repositories & build paths
    • Automatic download + install of all dependencies
    • Lots of social support
  – Self-writing book (!)
  – Open sharing
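The "automatic download + install of all dependencies" step is, at its core, a walk over a dependency graph. The sketch below shows one simple way to compute an install order (a depth-first traversal that installs dependencies before dependants). The package names and the `dependencies` mapping are hypothetical, and real build tools add versioning and conflict resolution on top of this.

```python
def install_order(target, dependencies):
    """Return packages needed for `target`, each listed after its own dependencies."""
    order = []
    in_progress = set()
    done = set()

    def visit(package):
        if package in done:
            return
        if package in in_progress:
            raise ValueError(f"dependency cycle involving {package!r}")
        in_progress.add(package)
        for dependency in dependencies.get(package, []):
            visit(dependency)
        in_progress.discard(package)
        done.add(package)
        order.append(package)

    visit(target)
    return order


# Hypothetical repository index: package -> packages it depends on.
dependencies = {
    "app": ["web", "db"],
    "web": ["http"],
    "db": ["http"],
    "http": [],
    "unused": [],
}
print(install_order("app", dependencies))
```

The analogy for future data objects is that an archived record may need its own "dependencies" (schemas, code, source datasets) resolved in the same way before it can be used.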
Archival is the ultimate “long tail”
Access ≠ “I stored it safely, so you can ask me to get it”
Stores, protects and makes accessible records of enduring significance which are identified as archival resources of the Commonwealth and selected for retention as 'national archives'.
Data.gov.au provides an easy way to find, access and reuse public datasets from Government. The main purpose of the site is to encourage public access to and reuse of government data by providing it in useful formats under open licences.
http://naa.gov.au/records-management/strategic-information/responsibilities/index.aspx
NASA data archives: external validation
APIs
School / education
Professional
Technical
Policy
Access & use guaranteed, demonstrated, validated
Programs + people can access
Information Extraction: Existing documents into new
Inside the Archive (safe, not changed)
APIs
To make new docs…
Gets used as input
Doc. Info. Management AS A SERVICE (old way)
Digital object: Doc, email, etc
create (D.Obj) → describe (meta + D.Obj) → manage (meta + D.Obj) → discover (meta + D.Obj)
Actors: Someone; Someone else, later
Archival catalogs
Archival storage
Archival search
Doc. Info. Management AS A SERVICE (old way)
Digital object: Doc, email, etc
create (D.Obj) → describe (meta + D.Obj) → manage (meta + D.Obj) → discover (meta + D.Obj) → re-use (meta + D.Obj)
Actors: Someone; Someone else, later
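The create, describe, manage, discover and re-use steps can be pictured as operations on a digital object that carries its metadata with it. The following is a minimal in-memory sketch under that reading of the diagram; the class names, method names and the `derived_from` field are all assumptions for illustration, not an existing system.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A document, email, etc. together with its metadata."""
    content: str
    meta: dict = field(default_factory=dict)


class Archive:
    """Hypothetical in-memory archive covering the five steps on the slide."""

    def __init__(self):
        self._store = []

    def create(self, content):
        return DigitalObject(content)

    def describe(self, obj, **meta):
        obj.meta.update(meta)
        return obj

    def manage(self, obj):
        self._store.append(obj)
        return obj

    def discover(self, **query):
        # Match objects whose metadata contains every queried key/value pair.
        return [o for o in self._store
                if all(o.meta.get(k) == v for k, v in query.items())]

    def reuse(self, obj, new_content):
        # The derived object copies the metadata and records where it came from.
        derived = DigitalObject(new_content, dict(obj.meta))
        derived.meta["derived_from"] = obj.content
        return derived


archive = Archive()
report = archive.manage(
    archive.describe(archive.create("Budget 2015"), kind="report", year=2015)
)
found = archive.discover(kind="report")
summary = archive.reuse(found[0], "Budget 2015: summary")
print(summary.meta)
```

Note that re-use produces a new object and leaves the archived original untouched, which matches the "safe, not changed" requirement for material inside the archive.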
Doc. Info. Management AS A SERVICE
analytics
Storage + search
“maven”
“plug-in”
Open source
Standards-based
Interchangeable
RESTful application interface
Build open, easy-to-use services: let others build apps
3rd parties
public
Other services
Web
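A RESTful interface that "lets others build apps" can be very small. The sketch below uses only the Python standard library to expose a read-only JSON view of a hypothetical set of records; the URL layout (`/records`, `/records/<id>`), the sample data and the handler names are assumptions, not an existing archive API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory archive: record id -> metadata.
RECORDS = {
    "1": {"title": "Annual report", "format": "pdf"},
    "2": {"title": "Meeting minutes", "format": "txt"},
}


def route(method, path):
    """Map a request to (status code, JSON-serialisable body). Read-only by design."""
    if method != "GET":
        return 405, {"error": "archive is read-only"}
    parts = [p for p in path.split("/") if p]
    if parts == ["records"]:
        return 200, {"records": sorted(RECORDS)}
    if len(parts) == 2 and parts[0] == "records" and parts[1] in RECORDS:
        return 200, RECORDS[parts[1]]
    return 404, {"error": "not found"}


class ArchiveHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around route(); all logic stays in the plain function."""

    def do_GET(self):
        status, body = route("GET", self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To serve locally (not started here):
#   HTTPServer(("127.0.0.1", 8000), ArchiveHandler).serve_forever()
print(route("GET", "/records"))
```

Keeping the routing in a plain function makes the service easy to test and easy for third parties to mirror, which is the point of the open, interchangeable services on the slide.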
Text analytics services
Free SnoMed CT encoding
Hospital infection control
Nursing handover
National Map
open source, open data, open access
nationalmap.research.nicta.com.au
The future?
• New records:
  – not just digital objects, also the build & context information.
• New analytics
• New storage
• Users can do most of these by themselves, what value do you add?
Like this?
Work with us
Change this:
thelivinglab.org.au