Numerical and Graphical Analysis Finding and understanding patterns in data.
![Page 1: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/1.jpg)
Numerical and Graphical Analysis
Finding and understanding patterns in data
![Page 2: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/2.jpg)
The course so far
• Academic use of the web
• Publishing on the web
• Analysing text
• Manipulating textual lists
• Tables
• Numerical Analysis – why it matters
• Graphical Analysis – a better way for humanists
![Page 3: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/3.jpg)
Lists
Ann Simms of Riverhead (Female) left £1560 died at age 89 years
Anne Potts of Ide Hill (Female) left £34 died at 17 years
Charles Forth of Chevening (Male) left £129 died at age 48 years
: : : :
: : : :
: : : :
George Salter of Riverhead (male) left £190, died at age 26 years
![Page 4: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/4.jpg)
Data as a table
| Forename | Surname | Village | Gender | Wealth | Age at death |
|---|---|---|---|---|---|
| Ann | Simms | Riverhead | Female | £1560 | 89 |
| Anne | Potts | Ide Hill | Female | £34 | 17 |
| Charles | Forth | Chevening | Male | £129 | 48 |
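
One way to move from the list form to the table form is to pull the fields out of each line with a pattern. The sketch below is only an illustration in Python: the pattern `LINE` and the helper `parse_line` are hypothetical names, and the pattern assumes every entry follows the wording of the sample lines (with or without "age", and with or without a comma after the sum).

```python
import re

# Pattern for lines such as
# "Ann Simms of Riverhead (Female) left £1560 died at age 89 years".
# It tolerates the small inconsistencies seen in the list: an optional
# comma after the sum, an optional "age", and a lower-case gender.
LINE = re.compile(
    r"(?P<forename>\w+) (?P<surname>\w+) of (?P<village>.+?) "
    r"\((?P<gender>\w+)\) left £(?P<wealth>\d+),? "
    r"died at (?:age )?(?P<age>\d+) years"
)

def parse_line(line):
    """Turn one list entry into a table row (a dict), or None if it does not fit."""
    m = LINE.fullmatch(line.strip())
    if m is None:
        return None
    row = m.groupdict()
    row["gender"] = row["gender"].capitalize()  # "male" and "Male" become one category
    row["wealth"] = int(row["wealth"])          # pounds, as a number
    row["age"] = int(row["age"])                # years, as a number
    return row

entries = [
    "Ann Simms of Riverhead (Female) left £1560 died at age 89 years",
    "Anne Potts of Ide Hill (Female) left £34 died at 17 years",
    "Charles Forth of Chevening (Male) left £129 died at age 48 years",
    "George Salter of Riverhead (male) left £190, died at age 26 years",
]
rows = [parse_line(e) for e in entries]
for row in rows:
    print(row)
```

Lines that do not fit the pattern come back as `None`, so they can be checked by hand rather than silently miscoded.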
![Page 5: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/5.jpg)
Tables in a Spreadsheet
![Page 6: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/6.jpg)
Spreadsheet Software
• Evolved from financial accounting practice
• Tabular data
• Simple lists
• Simple databases
• Establish relationships within and between data sets – simple statistics
• Apply various functions
• Plot Graphs and Charts
![Page 7: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/7.jpg)
Applications in the humanities
• Maintaining and manipulating lists.
• Studying quantifiable information.
• Managing budgets and projects.
• Plotting graphs and charts.
• Compensating for weaknesses in other software applications.
• Building utility programs.
![Page 8: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/8.jpg)
What would your tutor’s comments be?
“During the early 19th century the population of London grew rapidly due to mass migration in from the countryside. The overcrowding caused by the rising population placed a strain on the sanitation systems causing a series of cholera epidemics, each worse than the one before.”
![Page 9: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/9.jpg)
Quantify your statements: evidence?
“During the early 19th century the population of London grew rapidly due to mass migration in from the countryside. The overcrowding caused by the rising population placed a strain on the sanitation systems causing a series of cholera epidemics, each worse than the one before.”
![Page 10: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/10.jpg)
Poetry or Maths?
Reproduced from: Burrows J. (2002) ‘‘Delta’: a Measure of Stylistic Difference and a Guide to Likely Authorship’, Literary and Linguistic Computing, Vol. 17:3 p. 270.
![Page 11: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/11.jpg)
Reproduced from: Burrows J. (2002) ‘‘Delta’: a Measure of Stylistic Difference and a Guide to Likely Authorship’, Literary and Linguistic Computing, Vol. 17:3 p. 280.
![Page 12: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/12.jpg)
Examples
• Social History
- New Poor Law
- Effects of the Industrial Revolution
- Voting patterns in elections
• Textual analysis
- Word frequencies etc.: George Orwell
- Author attribution: Shakespeare or Marlowe
![Page 13: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/13.jpg)
Why use numerical analysis?
• Wide variety of techniques – suitable for different types of data and questions.
• In the humanities it usually means ‘statistics’
• Three Roles
- Summarise and compare data sets
- Test hypotheses
- Determine the significance of findings
![Page 14: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/14.jpg)
Research Process
• What is your question?
• What results would prove/disprove it?
• Write a code book defining
- variable names
- variable data type
- categories, ranges (‘controlled vocabulary’ for numeric data)
• Code data
• Analysis
• Interpretation
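
A code book can be as simple as a table of rules that each coded record is checked against. The sketch below is a hypothetical example in Python for the sample records earlier in the deck: the names `CODE_BOOK` and `check_record`, the list of villages, and the numeric limits are all assumptions made for illustration.

```python
# A hypothetical code book: each variable has a name, a data type, and
# either a set of allowed categories or a permitted numeric range.
CODE_BOOK = {
    "village": {"type": str, "categories": {"Riverhead", "Ide Hill", "Chevening"}},
    "gender": {"type": str, "categories": {"Female", "Male"}},
    "wealth": {"type": int, "range": (0, 100_000)},  # pounds; limits are assumed
    "age": {"type": int, "range": (0, 120)},         # years; limits are assumed
}

def check_record(record):
    """Return a list of problems found when checking one coded record."""
    problems = []
    for name, rule in CODE_BOOK.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
        elif "categories" in rule and value not in rule["categories"]:
            problems.append(f"{name}: {value!r} not in controlled vocabulary")
        elif "range" in rule and not rule["range"][0] <= value <= rule["range"][1]:
            problems.append(f"{name}: {value} outside permitted range")
    return problems

good = {"village": "Riverhead", "gender": "Female", "wealth": 1560, "age": 89}
bad = {"village": "Riverhead", "gender": "male", "wealth": 190, "age": 260}
print(check_record(good))
print(check_record(bad))
```

Writing the rules down before coding the data means that inconsistencies such as "male" against "Male" are caught at entry rather than during analysis.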
![Page 15: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/15.jpg)
Authorship attribution
• Analysis of writing style
• Consistency of style
• Find frequently used words and look at their frequency in different portions of the book.
• End up with tables of frequencies and various indices – need to interpret them
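
The frequency tables mentioned above can be produced with a few lines of code. The following is a minimal sketch in Python, not the method of any particular study: the function names are hypothetical, the two "portions" are made-up sentences standing in for chapters, and frequencies are expressed per 1000 words so that portions of different length can be compared.

```python
from collections import Counter
import re

def words(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

def relative_frequencies(text, vocabulary):
    """Occurrences per 1000 words of each vocabulary word in one portion."""
    tokens = words(text)
    counts = Counter(tokens)
    return {w: 1000 * counts[w] / len(tokens) for w in vocabulary}

def frequency_table(portions, top_n=3):
    """Pick the top_n most common words overall, then tabulate them per portion."""
    overall = Counter(w for p in portions for w in words(p))
    vocabulary = [w for w, _ in overall.most_common(top_n)]
    return vocabulary, [relative_frequencies(p, vocabulary) for p in portions]

# Two made-up portions, standing in for chapters of a book.
portions = [
    "the cat sat on the mat and the dog sat by the door",
    "a cat and a dog and a bird met a fox by a gate",
]
vocabulary, table = frequency_table(portions, top_n=2)
for number, row in enumerate(table, start=1):
    print(number, row)
```

Large differences between portions in the rate of very common words are the kind of pattern that then needs interpreting.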
![Page 16: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/16.jpg)
Simple Statistics
• To summarise a set of data
- mean: average value
- mode: most common value
- median: middle value
- range: minimum, maximum and the difference between them
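
These four summaries can be worked out by hand or in a spreadsheet; as an illustration, the sketch below uses Python's standard `statistics` module. The ages are illustrative only: the first four echo the sample list earlier in the deck, and the repeated 48 is invented so that the data has a clear mode.

```python
import statistics

# Illustrative ages at death. The first four echo the sample list;
# the repeated 48 is invented so that the data has a clear mode.
ages = [89, 17, 48, 26, 48]

mean = statistics.mean(ages)      # average value
mode = statistics.mode(ages)      # most common value
median = statistics.median(ages)  # middle value once the data is sorted
low, high = min(ages), max(ages)  # the two ends of the range
spread = high - low               # the range as a single number

print(mean, mode, median, low, high, spread)
```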
![Page 17: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/17.jpg)
Graphical Analysis
Allows us to:
• Summarise data
• Explore and identify areas for further study.
• Communicate the meaning of large volumes of data
![Page 18: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/18.jpg)
Variance and correlation
• Are two things related?
- ability in one language and ability in another
- poverty and disease
- smoking and cancer
• Most easily done by drawing a graph.
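
Alongside the graph, the strength of a straight-line relationship can be put into a single number. The sketch below computes the Pearson correlation coefficient, which is one common choice and is named here as an assumption since the slides do not specify a measure. The function name `pearson_r` and the scores are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 or -1 for a perfect straight-line
    relationship, near 0 when there is no linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented scores of five students in two languages.
french = [1, 2, 3, 4, 5]
german = [2, 4, 5, 4, 5]
r = pearson_r(french, german)
print(round(r, 3))
```

A value close to +1 or -1 says the points lie near a straight line; it says nothing about why.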
![Page 19: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/19.jpg)
Lung Cancer and Smoking
![Page 20: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/20.jpg)
With regression line fitted
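
A regression line of this kind is usually fitted by ordinary least squares, which is what a spreadsheet trendline does by default. The sketch below assumes that method; the function name `fit_line` is hypothetical and the points are invented, not the smoking and lung cancer figures shown on the slide.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x  # the line passes through the two means
    return slope, intercept

# Invented points, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```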
![Page 21: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/21.jpg)
Variation in data
• How much variation is there in the data values?
• Standard deviation measures how far the data values typically lie from their mean
• A small value means very little spread
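
As a small illustration, the sketch below compares two invented data sets that share the same mean but differ in spread. It uses `statistics.pstdev`, the population standard deviation; `statistics.stdev` would be the choice if the values were treated as a sample from a larger group.

```python
import statistics

# Two invented data sets with the same mean (50) but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

sd_tight = statistics.pstdev(tight)  # small: values sit close to the mean
sd_wide = statistics.pstdev(wide)    # large: values are widely scattered

print(statistics.mean(tight), sd_tight)
print(statistics.mean(wide), sd_wide)
```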
![Page 22: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/22.jpg)
What does this mean?
![Page 23: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/23.jpg)
Arrowhead length against width
[Chart: x-axis Length, 0 to 12; y-axis Width, 0.00 to 6.00]
![Page 24: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/24.jpg)
![Page 25: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/25.jpg)
What does this mean?
Plot of Average Age and House Number
[Chart: x-axis Average Age, 0 to 80; y-axis House Number, 0 to 12]
![Page 26: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/26.jpg)
Warnings
• Think about what you are doing.
• A correlation does not mean there is a link.
• Even if there is a mathematical relationship it may not be a causal one.
• Beware of interpolated and extrapolated values.
![Page 27: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/27.jpg)
Correlation? Tufte (2001) p. 15
![Page 28: Numerical and Graphical Analysis Finding and understanding patterns in data.](https://reader036.fdocuments.us/reader036/viewer/2022062619/55161052550346a2308b5295/html5/thumbnails/28.jpg)
René Magritte: La Trahison des Images (1928-9) (The Treachery of Images)
Los Angeles County Museum of Art