Literacy in the Age of Big Data
#IMDAYS // @michael_smit
Literacy in the Age of Big Data
Mike Smit
School of Information Management, Faculty of Management
What is Big Data?
• Volume / Variety / Velocity, or
• Anything more than I can handle, or
• Data too large to be contained by a single computer, or
• Data beyond human scale, or
• Data measured in TB or bigger, or
• Anything I have a better chance of selling you by claiming it is Big Data.
Twitter Example
{"created_at":"Sat Nov 16 12:18:36 +0000 2013","id":401685732185899000,"id_str":"401685732185899008","text":"Spotted this in the Hicks building. At Dal, even the graffiti is academically rigorous. http://t.co/n8jpJSGorN","source":"<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":2182346850,"id_str":"2182346850","name":"Richard Florizone","screen_name":"DalPres","location":"Nova Scotia","url":"http://dal.ca","description":"11th President - Dalhousie University. Online as often as possible.","protected":false,"followers_count":21,"friends_count":15,"listed_count":1,"created_at":"Fri Nov 08 14:49:09 +0000 2013","favourites_count":1,"utc_offset":null,"time_zone":null,"geo_enabled":false,"verified":false,"statuses_count":5,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://a0.twimg.com/profile_background_images/378800000117347877/3f9b5575de267ee12db6c1b4eb6e6332.jpeg","profile_background_image_url_https":"https://si0.twimg.com/profile_background_images/378800000117347877/3f9b5575de267ee12db6c1b4eb6e6332.jpeg","profile_background_tile":false,"profile_image_url":"http://pbs.twimg.com/profile_images/378800000743858713/b7417c514d6e85dd67895f1802b784ae_normal.jpeg","profile_image_url_https":"https://pbs.twimg.com/profile_images/378800000743858713/b7417c514d6e85dd67895f1802b784ae_normal.jpeg","profile_banner_url":"https://pbs.twimg.com/profile_banners/2182346850/1384540163","profile_link_color":"0084B4","profile_sidebar_border_color":"FFFFFF","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"symbols":[],"urls":[],"user_mentions":[],"media":[{"id":401685731921629200,"id_str":"401685731921629184","indices":[88,110],"media_url":"http://pbs.twimg.com/media/BZMTC4KIEAA62wo.jpg","media_url_https":"https://pbs.twimg.com/media/BZMTC4KIEAA62wo.jpg","url":"http://t.co/n8jpJSGorN","display_url":"pic.twitter.com/n8jpJSGorN","expanded_url":"http://twitter.com/DalPres/status/401685732185899008/photo/1","type":"photo","sizes":{"medium":{"w":600,"h":450,"resize":"fit"},"large":{"w":1024,"h":768,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"},"small":{"w":340,"h":255,"resize":"fit"}}}]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"medium","lang":"en"}
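A record like the one above is easier to read once it is parsed. The following is a minimal sketch in Python, assuming a hand-trimmed copy of the tweet; the variable names are illustrative and not from the talk. It also compares the numeric id with id_str: in this transcript the number ends in 000 while the string ends in 008, which is consistent with a 64-bit id having been rounded by a double-precision float somewhere along the way.

```python
import json

# Hand-trimmed copy of the tweet above; only a few fields are kept.
raw = '''{"created_at": "Sat Nov 16 12:18:36 +0000 2013",
 "id": 401685732185899000,
 "id_str": "401685732185899008",
 "user": {"screen_name": "DalPres", "followers_count": 21},
 "entities": {"hashtags": [], "media": [{"type": "photo"}]}}'''

tweet = json.loads(raw)

# Nested objects become nested dictionaries and lists.
screen_name = tweet["user"]["screen_name"]
media_types = [m["type"] for m in tweet["entities"].get("media", [])]

# Difference between the string id and the numeric id as transcribed.
id_drift = int(tweet["id_str"]) - tweet["id"]

print(screen_name, media_types, id_drift)
```

When the exact identifier matters, reading id_str is the safer choice.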
WHY is Big Data?
1. How much would it cost to buy enough hard drives to store all the music in the iTunes store?
2. Same question, but pretend it is 10 years ago.
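The two questions reduce to the same arithmetic with a different price per gigabyte. The sketch below shows only the shape of that calculation. The transcript gives no figures, so every number here (catalogue size, average song size, both prices) is a placeholder assumption, and the function name is hypothetical.

```python
def storage_cost(num_songs, avg_song_mb, cost_per_gb):
    """Estimated cost of raw disk space for a music catalogue."""
    total_gb = num_songs * avg_song_mb / 1000  # decimal gigabytes
    return total_gb * cost_per_gb

# Placeholder inputs for illustration only; substitute real figures.
today = storage_cost(num_songs=1_000_000, avg_song_mb=5.0, cost_per_gb=0.10)
earlier = storage_cost(num_songs=1_000_000, avg_song_mb=5.0, cost_per_gb=1.00)

print(today, earlier)
```

Only cost_per_gb changes between the two calls, so the ratio of the two answers is simply the ratio of the two prices.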
Ds1fFtZx4olD5acndKSToGizuuj2D9Ut9prJlDLPpq35mNVHQghsDpGo13qZKpgF8Qe1xQnjKU0VEDwn3aXNTe4miEwbAq2WqkjWx2NZSH70kdK4x3h7L6E6DxnZrZeOlBZLXlFcCkluiScz0Ei13tqpALVvObQ3BnepwdPUpFSMnqvYaSQ4P3F6We9zXKIZDb9PGl8yyDw6XEAEMUcAq8mR4Z9WOY3XZG8b9QGwINtRMeTdKosHnTobzwf4gFFszjx1E0EJA22up0zg8Ub35gEd8wHc7yTmTZWZMU6hBVsEzhzcTaWx2wlYHstAiYjRAIAoYbuNupw0iWaxweJaCWl9y8J5zZ05YwTlAsh6jAl0Mp2RIkL3F00if8GGt3kaAzT5VQLHZSV1rJSTdMt9g1ldQsPm5U95oZN3Cx9B8sHbsgNINq5yiMjuVlO3rkCd1ShH210wGaIIlLpZ41U2$gK2fCEY5rvKU0p5sQHuIizchKc2zuGGP2FAZ3utFXhXLDIyhdzExe9VKo8DtAQqprlBOrkvdMjVWAs2Jj6H9GW4LrrFZrXY3VC6h6v3pZkDcfwmT6jRwkJrwvbuFCq0t4vOUBjPeSggukZKFAs1IryTkYKTPsJN5Lf5ZXhOqOcc9MB5MnkMAS1yqD5ayDv8kWWW29hLFRiSLF6zkEQA95yer84R91Lt3dfglI2yamX4UDO7j18ocflmcu9zfLklOLbR4Kg63GIvbfafqpv7wcNlBZ3Q3vJsjTmlbR6Is6kIlh3BQIF3W1QWosPhG9oNmR3bzTfK5gACtmgmBTAtKNrtRIK4XAfpRwmUZnBLYWJcjGIgjpD5237WhfZMFSEaMfOSi5SFD1aAq12D0cMh5WW
Reason #6: The lingering hope of finding valuable information
Where Do We Start?
• Admit you have a problem
• Stick together
• Remain Calm! We fear what we don’t understand: data literacy education.
• Analytics (self-serve business intelligence)
• There is no substitute for human attention… but when that’s not feasible, what else you got?
– Idea: Cognitive Computing for improved automation
– Idea: Knowledge Graph for RM
• Records Management
Historic Flood Database: A Big Data Approach
• Automatically processing newspaper articles to produce open datasets describing geo-located floods in Nova Scotia.
• Visual interface
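The slide does not describe how the articles are processed, so the following is only a toy sketch of the general idea, not the project's method. It assumes a keyword match for flood terms and a hypothetical two-entry gazetteer with approximate coordinates. The sample sentence and its date are invented.

```python
import re

# Hypothetical mini-gazetteer: place name -> approximate (lat, lon).
GAZETTEER = {
    "Halifax": (44.65, -63.57),
    "Truro": (45.37, -63.28),
}

FLOOD_PATTERN = re.compile(r"\bflood(s|ed|ing)?\b", re.IGNORECASE)

def extract_flood_records(article_text, article_date):
    """Return one record per known place named in a flood-related sentence."""
    records = []
    for sentence in re.split(r"(?<=[.!?])\s+", article_text):
        if not FLOOD_PATTERN.search(sentence):
            continue  # sentence does not mention flooding
        for place, (lat, lon) in GAZETTEER.items():
            if place in sentence:
                records.append(
                    {"date": article_date, "place": place, "lat": lat, "lon": lon}
                )
    return records

# Invented sample text and date, for illustration only.
sample = "Heavy rain flooded several streets in Truro. Halifax hosted a concert."
records = extract_flood_records(sample, "1923-01-01")
print(records)
```

Each record is a flat dictionary, so the result could be written out as rows of an open dataset.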
Skills Gap
• Predicted for US in 2018 by McKinsey Global Institute
– Deep Analytics Skills: Positions 465k; Workforce 300k
Data Literacy
• The ability to create, comprehend, and communicate data.
• The ability to collect, manage, evaluate, and apply data, in a critical manner.
• Spans disciplines, sectors, universities, …
Appendix 2 - Data Literacy Word Cloud
The following is a word cloud generated from the major definitions of data literacy in the reviewed literature.
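A word cloud is a picture of word frequencies. As a rough sketch of the counting step only, the code below tallies words in the two data literacy definitions quoted in this deck. It is not the reviewed literature used for the original cloud, and the stop-word list is an assumption.

```python
import re
from collections import Counter

# The two definitions from the Data Literacy slide in this deck.
definitions = [
    "The ability to create, comprehend, and communicate data.",
    "The ability to collect, manage, evaluate, and apply data, in a critical manner.",
]

# Small illustrative stop-word list (an assumption, not a standard one).
STOP_WORDS = {"the", "to", "and", "in", "a"}

words = []
for text in definitions:
    words += [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

frequencies = Counter(words)
print(frequencies.most_common(2))
```

A rendering tool would then scale each word by its count.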
Data Literacy Education
Conceptual Framework
• Introduction to Data
– Knowledge and understanding of data
– Knowledge and understanding of the uses and applications of data
Data Collection
• Data Discovery and Collection
– Performs data exploration
– Identifies useful data
– Collects data
• Evaluating and Ensuring Quality of Data and Sources
– Critically assesses sources of data for trustworthiness
– Critically evaluates quality of datasets for errors or problems
Data Management
• Data Organization
– Knowledge of basic data organization methods and tools
– Assesses data organization requirements
– Organizes data
• Data Manipulation
– Assesses methods to clean data
– Identifies outliers and anomalies
– Cleans data
• Data Conversion (from format to format)
– Knowledge of different data types and conversion methods
– Converts data from one format or file type to another
• Metadata Creation and Use
– Creates metadata descriptors
– Assigns appropriate metadata descriptors to original data sets
• Data Curation, Security, and Re-Use
– Assesses data curation requirements (e.g. retention schedule, storage, accessibility, sharing requirements, etc.)
– Assesses data security requirements (e.g. restricted access, protected drives, etc.)
– Curates data
• Data Preservation
– Assesses requirements for preservation
– Assesses methods and tools for data preservation
– Preserves data
Data Evaluation
• Data Tools
– Knowledge of data analysis tools and techniques
– Selects appropriate data analysis tool or technique
– Applies data analysis tools and techniques
• Basic Data Analysis
– Develops analysis plans
– Applies analysis methods and tools
– Conducts exploratory analysis
– Evaluates results of analysis
– Compares results of analysis with other findings
• Data Interpretation (Understanding Data)
– Reads and understands charts, tables, and graphs
– Identifies key take-away points, and integrates this with other important information
– Identifies discrepancies within the data
• Identifying Problems Using Data
– Uses data to identify problems in practical situations (e.g. workplace efficiency)
– Uses data to identify higher level problems (e.g. policy, environment, scientific experimentation, marketing, economics, etc.)
• Data Visualization
– Creates meaningful tables to organize and visually present data
– Creates meaningful graphical representations of data
– Evaluates effectiveness of graphical representations
– Critically assesses graphical representations for accuracy and misrepresentation of data
• Presenting Data (Verbally)
– Assesses the desired outcome(s) for presenting the data
– Assesses audience needs and familiarity with subject(s)
– Plans the appropriate meeting or presentation type
– Utilizes meaningful tables and visualizations to communicate data
– Presents arguments and/or outcomes clearly and coherently
• Data Driven Decision Making (DDDM) (Making decisions based on data)
– Prioritizes information garnered from data
– Converts data into actionable information
– Weighs the merit and impacts of possible solutions/decisions
– Implements decisions/solutions
Data Application
• Critical Thinking
– Aware of high level issues and challenges associated with data
– Thinks critically when working with data
• Data Culture
– Recognizes the importance of data
– Supports an environment that fosters critical use of data for learning, research, and decision-making
• Data Ethics
– Aware of legal and ethical issues associated with data
– Applies and works with data in an ethical manner
• Data Citation
– Knowledge of widely-accepted data citation methods
– Creates correct citations for secondary data sets
• Data Sharing
– Assesses methods and platforms for sharing data
– Shares data legally, and ethically
• Evaluating Decisions Based on Data
– Collects follow-up data to assess effectiveness of decisions or solutions based upon data
– Conducts analysis of follow-up data
– Compares results of analysis with other findings
– Evaluates decisions or solutions based on data
– Retains original conclusions or decisions, or implements new decisions/solutions
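One competency in the framework, identifying outliers and anomalies, can be illustrated in a few lines. The sketch below assumes the common 1.5 × IQR rule, which the framework itself does not prescribe. The function name and sample readings are invented for illustration, and statistics.quantiles requires Python 3.8 or later.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

readings = [10, 12, 11, 13, 12, 11, 95]  # invented sample values
outliers = iqr_outliers(readings)
print(outliers)
```

Flagging a value is only the first step. Deciding whether it is an error or a genuine extreme still takes human judgement.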
There is no substitute for human attention
But sometimes we have too much data and not enough humans!
Image Credits (1)
• http://www.scientificamerican.com/media/inline/blog/Image/wisdom.jpg
• http://rudyloans.com/wp-content/uploads/2013/11/Arrow-Up-4.jpg
• http://www.mrwallpaper.com/cat-and-dog-cuddle-wallpaper/
• http://pottermore.wikia.com/wiki/Category:Gryffindor
• http://pottermore.wikia.com/wiki/File:Slytherin_mark.png
• http://daverobertsfilm.wordpress.com/2011/02/02/media-studies-key-debates/
• http://www.themobilityresource.com/wearable-technology-and-how-it-affects-people-with-disabilities/
• Original source unknown; available
• http://adpaascu.wordpress.com/tag/global-citizens/
• https://www.torproject.org/
• http://www.gnupg.org/
• http://www.identityfinder.com/
Image Credits (2)
• http://www.gartner.com/technology/research/hype-cycles/
• http://blog.udacity.com/2013/07/new-course-design-of-everyday-things.html
• Screenshot from http://pennystocks.la/internet-in-real-time/
• http://www.officeimaging.com/
• http://www.clipartbest.com/graduation-caps-clip-art
• Cost per GB from http://www.mkomo.com/cost-per-gigabyte-update
• Images on slides 47, 57, 58 are © Mike Smit, 2014.
• Slide 35: screenshot of personal laptop & cell phone
• Slide 37: Vancouver Archives, http://searcharchives.vancouver.ca/power-lines-and-supporting-structure-in-lane-west-of-main-street-at-pender-street
• Slide 43: Screenshot of Watson User Modeling. Made from my own copy of their demo application, but also available publicly at http://watson-um-demo.mybluemix.net/
Image Credits (3)
• All graphs were created for the purpose of this presentation
• Logos on slide 38 are from the respective websites
• Images on slide 39:
– Bottom left: Thalmic Labs via TechCrunch http://techcrunch.com/2013/06/05/thalmic-labs-raises-14-5m-to-make-the-myo-armband-the-next-big-thing-in-gesture-control/
– Top left: Apple.com
– Top right: fitbit.com
– Bottom right: https://www.google.ca/glass/start/
• Slide 41: http://www.geekwire.com/2013/ibm-takes-watson-cloud/