The UK digital technology census 2019
Going beyond SIC Codes to quantify the UK Digital / Technology Industries
Thank you for downloading this report. The Data City provides a 21st century methodology to understand the activities of organisations, sectors and research institutions. It has been used to analyse the technology and digital industries of the UK.
This represents a significant evolution from SIC code-based analysis, which is unable to provide insight into new and emerging sectors of industry.
The Data City is a Data as a Service (DaaS) company that uses open data, the web and artificial intelligence to create new, robust, real-time data relating to the economy and innovation in the UK.
This census builds on work we have been completing with Greater Manchester Combined Authority (GMCA) and MIDAS – Manchester’s Inward Investment Agency, over 2018 and 2019.
In publishing this report we are seeking to show the power of our data in terms of its ability to provide new insight on UK industry.
To showcase the power of our data we have focussed on two main themes:
1. Holistic Census. A holistic view or ‘census’ of the digital technology sector across organisations and events/meetups.
2. Emerging Technology Census. A deeper dive into the organisations, showing how our data can be used to measure new technologies and their impact on our industries and the UK economy.
Introduction
The Data City
Introducing RTIC
RTIC is an abbreviation for ‘Real Time Industrial Classification’, which refers to the industry categorisation given to organisations using the Data City methodology. By monitoring the digital footprint of our industries we are able to update our understanding of what organisations do in real time.
The data
The data used in this report comes from the DataCity UK Organisations Data and Events Data assets.
The Data City will be introducing two new data assets in the next 12 months:
1. DataCity – Global Academic Innovation Data Asset, Q4 2019.
2. DataCity – UK Funding & Investment Data Asset, Q2 2020.
We place this data in context using:
• RTIC codes (Real-Time Industrial Classification).
• Geography – OECD City Classifications.
• UK Travel Time & Demographic Data.
The Data City & the UK Digital Technology Census
Executive summary
This report summarises the key findings of our 2019 UK Digital Technology Census. Access to the raw data used to create this report can be obtained through the purchase of a Data City licence. Please email us at [email protected] and we will send you the details of the evaluation and product licence.
Organisations
We have classified organisations 929,478 times in this data set.
There are a total of 810,813 unique organisations in the data set.
Events
We have classified 496,619 events in this data set.
Key findings
This report provides analysis of the UK Digital Technology sector at two levels, the first being a holistic overview and the second being a deep dive into organisations engaged in emerging technologies.
Holistic Census
This analysis shows the digital technology industries are over 25% larger than previously thought. London clearly leads the UK digital technology rankings for events and organisations. Manchester ranks 2nd and Reading 3rd.
Emerging Technology Census
London also leads the way on emerging sectors that are driving the new global technology economies of the 21st Century, with Manchester 2nd and Birmingham 3rd.
The top 10 cities
This table shows the combined totals for events and organisation classifications for the top ten cities.
| Rank | City | Events count | Events rank | Organisations count | Organisations rank | Combined rank score |
|---|---|---|---|---|---|---|
| 1st | London | 5,145 | 1 | 133,737 | 1 | 2 |
| 2nd | Manchester | 483 | 2 | 11,622 | 3 | 5 |
| 3rd | Reading | 157 | 9 | 11,642 | 2 | 11 |
| 4th | Brighton | 359 | 3 | 4,512 | 8 | 11 |
| 5th | Milton Keynes | 163 | 7 | 7,705 | 6 | 13 |
| 6th | Birmingham | 207 | 6 | 5,547 | 7 | 13 |
| 7th | Leeds | 352 | 4 | 4,111 | 9 | 13 |
| 8th | Bristol | 134 | 11 | 10,515 | 4 | 15 |
| 9th | Edinburgh | 352 | 12 | 3,515 | 10 | 22 |
| 10th | Glasgow | 59 | 18 | 10,282 | 5 | 23 |
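The combined rank score in this table is consistent with adding each city's events rank to its organisations rank. The sketch below reproduces that arithmetic from the published ranks. The tie-break on organisations rank is an assumption that happens to reproduce the published order; the report does not state a tie-break rule.

```python
# (city, events_rank, organisations_rank) copied from the top ten table.
RANKS = [
    ("London", 1, 1),
    ("Manchester", 2, 3),
    ("Reading", 9, 2),
    ("Brighton", 3, 8),
    ("Milton Keynes", 7, 6),
    ("Birmingham", 6, 7),
    ("Leeds", 4, 9),
    ("Bristol", 11, 4),
    ("Edinburgh", 12, 10),
    ("Glasgow", 18, 5),
]


def combined_ranking(ranks):
    """Return (city, score) pairs ordered by combined rank score, lowest first."""
    scored = [(city, events + orgs, orgs) for city, events, orgs in ranks]
    # Assumed tie-break: the city with the better organisations rank comes first.
    scored.sort(key=lambda row: (row[1], row[2]))
    return [(city, score) for city, score, _ in scored]


for position, (city, score) in enumerate(combined_ranking(RANKS), start=1):
    print(position, city, score)
```

A lower score means a stronger combined position, so London's score of 2 (first on both measures) is the best possible.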
The Data City data sources
3.4 million Web pages
400,000 tech events
4.06 million companies
Organisations
The Data City starts with Companies House data on 4.06 million active UK based companies.
Where possible these organisations are matched to one or more websites that describe what the organisation does. Around 1.3 million organisations are currently matched to websites.
The most relevant pages of these websites are classified by our machine learning algorithms into sectors of interest.
Events
The Data City uses open data on events from Eventbrite and Meetup. This data is updated weekly to create a rolling 12-month data asset containing approximately 400,000 events at any given time.
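A rolling 12-month asset can be thought of as a date filter that is re-applied at each weekly update. The sketch below is illustrative only: the event records, the field names and the 365-day approximation of "12 months" are assumptions, not The Data City's implementation.

```python
from datetime import date, timedelta


def rolling_window(events, today, days=365):
    """Keep events dated within the last `days` days, up to and including `today`."""
    cutoff = today - timedelta(days=days)
    return [event for event in events if cutoff < event["date"] <= today]


# Hypothetical event records.
events = [
    {"name": "AI meetup", "date": date(2019, 3, 1)},
    {"name": "Old hackathon", "date": date(2018, 1, 15)},
    {"name": "FinTech breakfast", "date": date(2018, 6, 20)},
]

current = rolling_window(events, today=date(2019, 3, 31))
print([event["name"] for event in current])
```

Re-running the filter with a later `today` drops the oldest events and admits newly collected ones, which keeps the asset at roughly a year's worth of events.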
These events are classified by our machine learning algorithms into sectors of interest.
Methodology
Classification
Building the Census
For the purposes of this report we have focussed on eight digital/technology subsectors to create our digital technology sector. These subsectors are:
1. Artificial Intelligence (AI) & Data
2. ECommerce
3. Cyber
4. Digital (as a sector in its own right)
5. Gaming
6. Internet of Things (IoT)
7. MedTech
8. FinTech
We then looked further into the AI & Data and Digital sectors within our organisations data set to create two additional sectors:
1. AI – separating this from the AI & Data sector
2. Advanced digital – a group of organisations using more advanced applications of digital technologies
From this we produced an analysis of this subset of sectors, separate from the holistic digital technology sector.
[Figure: machine learning, artificial intelligence, insights]
Defining the sectors
Organisations
We first use any relevant SIC codes. For instance, in Gaming the SIC code ‘62011 - Ready-made interactive leisure and entertainment software development’ is clearly relevant to this sector.
Where there are no relevant SIC codes, we choose a training set of companies that are clear examples of the sector and use our machine learning process to analyse the matched company websites in our database against this representative training set. The output is then checked, and a training set of false positives is created and used to remove other false positives. The ML process is then run through repeated cycles to achieve an ever more accurate output. When random samples return an error rate of less than 1%, we consider the sector definition complete.
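The refinement cycle described above can be sketched as a loop: classify, sample the output, record the false positives, and repeat until the sampled error rate falls below 1%. The report does not say which machine learning model is used, so this sketch swaps in a simple token-overlap (Jaccard) score as a stand-in. The site texts, the threshold and the `is_correct` callback (standing in for the manual check) are all hypothetical.

```python
import random


def tokens(text):
    return set(text.lower().split())


def score(text, examples):
    """Highest Jaccard similarity between `text` and any example text."""
    words = tokens(text)
    best = 0.0
    for example in examples:
        other = tokens(example)
        if words | other:
            best = max(best, len(words & other) / len(words | other))
    return best


def classify(sites, positives, false_positives, threshold=0.2):
    """Keep sites closer to the training set than to any known false positive."""
    matched = []
    for name, text in sites.items():
        positive_score = score(text, positives)
        if positive_score >= threshold and positive_score > score(text, false_positives):
            matched.append(name)
    return matched


def refine(sites, positives, is_correct, sample_size=100, max_error=0.01,
           max_rounds=10, seed=0):
    """Repeat classification until a random sample has an error rate below max_error."""
    rng = random.Random(seed)
    false_positives, matched, error = [], [], 0.0
    for _ in range(max_rounds):
        matched = classify(sites, positives, false_positives)
        sample = rng.sample(matched, min(sample_size, len(matched)))
        wrong = [name for name in sample if not is_correct(name)]
        error = len(wrong) / len(sample) if sample else 0.0
        if error < max_error:
            break
        # Feed the checked mistakes back in as the false-positive training set.
        false_positives.extend(sites[name] for name in wrong)
    return matched, error


# Hypothetical website texts for a FinTech sector definition.
SITES = {
    "A": "mobile payments platform for banks",
    "B": "open banking payments api",
    "C": "payments for car parking tickets",
    "D": "organic vegetable delivery",
}
POSITIVES = ["payments platform for banks and lenders", "open banking api for payments"]
TRUTH = {"A", "B"}

matched, error = refine(SITES, POSITIVES, is_correct=lambda name: name in TRUTH)
print(matched, error)
```

In this toy run the first cycle wrongly admits site C. Once C's text joins the false-positive set, the second cycle excludes it and the sampled error rate drops to zero.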
This method has a number of advantages over SIC code-based classification:
✓ We can classify industries that don’t have SIC codes, such as FinTech, MedTech or IoT.
✓ We can correct misclassifications (for instance, for over 10% of UK organisations the SIC code is ‘other’).
✓ We can add organisations to multiple classifications, for instance showing that a manufacturing company has embraced IoT technology.
✓ Organisations can be classified into more than one sector.
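Because an organisation can sit in more than one sector, the number of classifications is larger than the number of unique organisations (929,478 classifications against 810,813 organisations in this data set). A minimal sketch of that bookkeeping, using hypothetical organisations and sector assignments:

```python
from collections import Counter

# Hypothetical organisations mapped to the sectors they were classified into.
classifications = {
    "Acme Sensors Ltd": {"IoT", "Digital"},
    "PayRight Ltd": {"FinTech"},
    "Medibot Ltd": {"MedTech", "AI & Data", "IoT"},
}

unique_organisations = len(classifications)
total_classifications = sum(len(sectors) for sectors in classifications.values())
per_sector = Counter(
    sector for sectors in classifications.values() for sector in sectors
)

print(unique_organisations, total_classifications)
print(per_sector["IoT"])
```

Sector counts therefore sum to the number of classifications, not the number of organisations.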
Events
The Data City events database contains 496,619 events from Eventbrite and Meetup that have taken place since January 2018. We classify these into sectors by testing for the presence in the event name or description of 397 keywords strongly associated with our sectors. If enough of these keywords are present the event is classified as being associated with that sector.
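A minimal sketch of this kind of keyword test is shown below. The keyword lists and the `min_hits` threshold are hypothetical; the report does not publish its 397 keywords or say how many must be present.

```python
import re

# Hypothetical keyword lists; the real list of 397 keywords is not published here.
SECTOR_KEYWORDS = {
    "FinTech": {"fintech", "payments", "blockchain", "open banking"},
    "AI and Data": {"machine learning", "data science", "neural network",
                    "artificial intelligence"},
}


def classify_event(name, description, min_hits=2):
    """Return the sectors whose keywords appear at least `min_hits` times."""
    text = f"{name} {description}".lower()
    sectors = []
    for sector, keywords in SECTOR_KEYWORDS.items():
        hits = sum(
            1 for keyword in keywords
            if re.search(r"\b" + re.escape(keyword) + r"\b", text)
        )
        if hits >= min_hits:
            sectors.append(sector)
    return sectors


print(classify_event("Open Banking meetup", "Payments and fintech startups"))
print(classify_event("Intro to machine learning", ""))
```

Requiring more than one keyword is one way to avoid classifying an event on a single passing mention. An event can still match several sectors.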
We also reverse geocode the latitude and longitude of events to assign them to standard UK and global geographies, such as countries, NUTS regions, 2018 OECD Metropolitan Areas, and local authorities. Of our events, 138,275 took place in the UK. For the geographies chosen for our work with GMCA we found 9,191 events within the eight sectors of interest (Digital, Fintech, AI and Data, Future of Mobility, eCommerce, IoT, Gaming, MedTech).
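Assigning a coordinate to a geography amounts to a point-in-polygon test against each boundary. The sketch below uses a standard ray-casting test. The rectangular boundaries are rough, hypothetical stand-ins for real OECD or local authority boundary files, and the function names are illustrative.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon of (lon, lat) vertices?"""
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical, very rough boundaries as (lon, lat) vertices.
BOUNDARIES = {
    "Leeds": [(-1.8, 53.7), (-1.3, 53.7), (-1.3, 53.95), (-1.8, 53.95)],
    "Manchester": [(-2.5, 53.35), (-2.0, 53.35), (-2.0, 53.6), (-2.5, 53.6)],
}


def assign_city(lat, lon):
    """Return the first city whose boundary contains the point, else None."""
    for city, polygon in BOUNDARIES.items():
        if point_in_polygon(lon, lat, polygon):
            return city
    return None


print(assign_city(53.80, -1.55))
print(assign_city(51.50, -0.12))
```

The same test can be repeated against country, region and local authority boundaries to attach several geographies to one event.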
Geographies
Each organisation location is recorded at postcode level. Every event and organisation in The Data City database has at least one location, and in all cases these locations are stored as exact co-ordinates. For different purposes we group events, papers and organisations into neighbourhoods, cities, regions and countries. To ensure international comparability, our city groupings use the OECD functional urban area definitions and boundaries. This causes a problem in the UK: some of our most economically important cities, such as Oxford, Cambridge and Brighton, have too small a population to be included in the OECD list. For this reason we supplement our list of UK cities with those listed in “Delineating urban agglomerations across the world: a dataset for studying the spatial distribution of academic research at city level” by Maisonobe et al. This gives us 29 cities in the UK to use in comparisons.
The Data City provides an ever-evolving data asset for understanding our industries which, like the SIC codes before it, we accept will never be perfect. If any of our findings appear to contradict other data sources, or if you have any other questions, we welcome discussion on how our data can be improved. Please email us at: [email protected]
Holistic Census
City breakdown by sector
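| City | AI and Data | Cyber | Digital | eCommerce | FinTech | Gaming | IoT | MedTech |
|---|---|---|---|---|---|---|---|---|
| London | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Manchester | 2 | 3 | 2 | 2 | 2 | 2 | 3 | 2 |
| Reading | 3 | 2 | 5 | 6 | 3 | 12 | 2 | 16 |
| Bristol | 4 | 8 | 3 | 5 | 8 | 6 | 6 | 5 |
| Birmingham | 7 | 5 | 6 | 3 | 5 | 5 | 9 | 4 |
| Leeds | 9 | 7 | 7 | 4 | 6 | 3 | 12 | 7 |
| Edinburgh | 5 | 11 | 8 | 11 | 4 | 13 | 5 | 6 |
| Brighton | 6 | 6 | 4 | 8 | 7 | 4 | 11 | 9 |
| Glasgow | 10 | 12 | 11 | 10 | 10 | 10 | 13 | 10 |
| Milton-Keynes | 16 | 4 | 10 | 9 | 9 | 14 | 4 | 13 |
| Nottingham | 12 | 10 | 12 | 7 | 14 | 9 | 8 | 24 |
| Newcastle | 11 | 15 | 9 | 14 | 11 | 8 | 15 | 18 |
| Cambridge | 8 | 23 | 13 | 15 | 12 | 19 | 7 | 3 |
| Oxford-Didcot | 13 | 22 | 14 | 17 | 17 | 17 | 17 | 11 |
| Leicester | 22 | 14 | 15 | 12 | 22 | 11 | 10 | 21 |
| Sheffield | 15 | 13 | 17 | 16 | 15 | 20 | 18 | 8 |
| Liverpool | 17 | 17 | 18 | 13 | 16 | 16 | 16 | 12 |
| Cardiff | 14 | 18 | 19 | 20 | 13 | 22 | 14 | 17 |
| Canterbury | 18 | 9 | 16 | 18 | 19 | 7 | 20 | 19 |
| Norwich | 21 | 20 | 20 | 19 | 21 | 15 | 19 | 26 |
| Exeter | 19 | 19 | 22 | 23 | 20 | 18 | 26 | 15 |
| Portsmouth | 24 | 16 | 21 | 24 | 18 | 25 | 21 | 27 |
| Aberdeen | 20 | 24 | 24 | 27 | 23 | 24 | 24 | 20 |
| York | 25 | 25 | 23 | 25 | 26 | 23 | 22 | 29 |
| Bradford | 23 | 21 | 25 | 21 | 25 | 26 | 25 | 14 |
| Plymouth | 28 | 26 | 28 | 22 | 28 | 27 | 29 | 23 |
| Swansea | 26 | 27 | 26 | 26 | 24 | 29 | 23 | 28 |
| Dundee | 29 | 28 | 27 | 28 | 27 | 21 | 28 | 22 |
| Lancaster | 27 | 29 | 29 | 29 | 29 | 28 | 27 | 25 |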
Top three positions are highlighted for convenience.
This table shows how each city ranks in each of the eight sectors analysed. This is the combined ranking for events and organisations.
LANCASTER, 300
SWANSEA, 580
DUNDEE, 760
ABERDEEN, 831
PLYMOUTH, 861
YORK, 884
NORWICH, 1,546
BRADFORD, 1,633
EXETER, 1,729
OXFORD-DIDCOT, 1,752
CAMBRIDGE, 1,764
CARDIFF, 2,112
LIVERPOOL, 2,419
PORTSMOUTH, 2,581
SHEFFIELD, 2,627
NEWCASTLE, 2,799
CANTERBURY, 2,803
LEICESTER, 3,003
NOTTINGHAM, 3,178
GLASGOW, 3,520
EDINBURGH, 4,121
BRISTOL, 4,518
LEEDS, 5,556
BIRMINGHAM, 7,715
MILTON-KEYNES, 10,292
BRIGHTON, 10,520
MANCHESTER, 11,642
READING, 11,656
LONDON, 133,920
UK organisations in the digital technology sector
Here we show the combined counts of organisation classifications for our ‘holistic census’ for each city.
The sectors are:
1. Artificial Intelligence (AI) & Data
2. ECommerce
3. Cyber
4. Digital (as a sector in its own right)
5. Gaming
6. Internet of Things (IoT)
7. MedTech
8. FinTech
BRADFORD, 7
PORTSMOUTH, 11
PLYMOUTH, 12
DUNDEE, 15
LANCASTER, 18
SWANSEA, 30
YORK, 39
EXETER, 44
CANTERBURY, 45
ABERDEEN, 45
LEICESTER, 48
MILTON-KEYNES, 59
NORWICH, 61
SHEFFIELD, 75
LIVERPOOL, 80
CARDIFF, 103
NOTTINGHAM, 115
GLASGOW, 122
BRIGHTON, 134
NEWCASTLE, 143
READING, 157
OXFORD-DIDCOT, 158
BIRMINGHAM, 163
LEEDS, 207
CAMBRIDGE, 272
EDINBURGH, 352
BRISTOL, 359
MANCHESTER, 483
LONDON, 5,145
UK events in the digital technology sector
Here we show the combined counts of event classifications for our ‘holistic census’ for each city from 1 April 2018 – 31 March 2019.
The sectors are:
1. Artificial Intelligence (AI) & Data
2. ECommerce
3. Cyber
4. Digital (as a sector in its own right)
5. Gaming
6. Internet of Things (IoT)
7. MedTech
8. FinTech
Emerging technology census
Focussing exclusively on the emerging digital technology sectors of UK industry, we can rank our cities to identify hot spots of innovation. We have removed the well-established and increasingly ‘traditional’ digital and data sectors for this analysis. Organisations that the data indicates are not embracing these innovations are therefore not included in these counts.
We have also created a new training set of ‘advanced digital’ businesses that represent the leading edge of digital organisations, and used it to create a new ‘advanced digital’ sector.
The emerging tech
Here we can clearly see the limitations of SIC code classifications with over 58% more organisation classifications found under The Data City method.
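| Sector | All Count | SIC Count | RTIC Count | Difference | Difference (%) |
|---|---|---|---|---|---|
| AI | 2594 | 0 | 2594 | 2594 | 100.0% |
| Advanced Digital | 12896 | 0 | 12896 | 12896 | 100.0% |
| IoT | 1556 | 0 | 1556 | 1556 | 100.0% |
| FinTech | 6205 | 0 | 6205 | 6205 | 100.0% |
| Cyber | 21463 | 0 | 21463 | 21463 | 100.0% |
| eCommerce | 61812 | 49519 | 12596 | 12293 | 19.9% |
| MedTech | 453 | 0 | 453 | 453 | 100.0% |
| Gaming | 30547 | 7205 | 23954 | 23342 | 76.4% |
| Totals | 137526 | 56724 | 81717 | 80802 | 58.8% |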
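The Difference columns in this table are consistent with subtracting the SIC count from the all count and expressing the result as a share of the all count. A short sketch that reproduces the published figures on that reading (the function name is illustrative):

```python
# (sector, all_count, sic_count) copied from the table.
ROWS = [
    ("AI", 2594, 0),
    ("Advanced Digital", 12896, 0),
    ("IoT", 1556, 0),
    ("FinTech", 6205, 0),
    ("Cyber", 21463, 0),
    ("eCommerce", 61812, 49519),
    ("MedTech", 453, 0),
    ("Gaming", 30547, 7205),
]


def difference(all_count, sic_count):
    """Classifications not captured by SIC codes, as a count and a % of all."""
    diff = all_count - sic_count
    return diff, round(100 * diff / all_count, 1)


for sector, all_count, sic_count in ROWS:
    print(sector, *difference(all_count, sic_count))

total_all = sum(row[1] for row in ROWS)
total_sic = sum(row[2] for row in ROWS)
print("Totals", total_all, total_sic, *difference(total_all, total_sic))
```

On this reading, the 58.8% figure is the share of all emerging technology classifications that SIC codes alone would not have captured.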
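Top 10 cities – emerging technology
| Rank | City | Ecommerce | Gaming | Cyber | Advanced Digital | FinTech | AI | IoT | MedTech |
|---|---|---|---|---|---|---|---|---|---|
| 1st | London | 23390 | 10655 | 7646 | 4922 | 2543 | 1162 | 563 | 192 |
| 2nd | Manchester | 2863 | 1232 | 747 | 517 | 202 | 89 | 58 | 20 |
| 3rd | Birmingham | 1790 | 743 | 599 | 354 | 145 | 47 | 46 | 7 |
| 4th | Reading | 1290 | 700 | 830 | 478 | 161 | 103 | 83 | 14 |
| 5th | Milton-Keynes | 1471 | 769 | 664 | 355 | 119 | 54 | 52 | 10 |
| 6th | Brighton | 1338 | 843 | 526 | 364 | 117 | 52 | 28 | 9 |
| 7th | Leeds | 907 | 528 | 372 | 271 | 104 | 39 | 20 | 8 |
| 8th | Bristol | 648 | 373 | 295 | 201 | 55 | 30 | 20 | 6 |
| 9th | Glasgow | 667 | 366 | 197 | 145 | 58 | 20 | 10 | 3 |
| 10th | Edinburgh | 520 | 360 | 198 | 141 | 87 | 38 | 19 | 6 |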
When we focus on the emerging technologies we find the ranking changes for the top ten cities. This table shows the organisation counts for each city in each of the eight emerging technology sectors analysed. This ranking is based on organisations data only.
[Figure: organisation counts by emerging technology sector (Ecommerce, Gaming, Cyber, Advanced Digital, FinTech, AI, IoT, MedTech) for the top ten cities]
The top 10 cities
Summary counts for emerging technology sectors for the top ten cities. Each cell shows SIC count / RTIC count.
| Rank | City | AI | Advanced digital | Internet of things | Fintech | Cyber | Ecommerce | Medtech | Gaming |
|---|---|---|---|---|---|---|---|---|---|
| 1st | London | 0 / 1162 | 0 / 4922 | 0 / 563 | 0 / 2543 | 0 / 7646 | 19598 / 3929 | 0 / 192 | 3032 / 7819 |
| 2nd | Manchester | 0 / 89 | 0 / 517 | 0 / 58 | 0 / 202 | 0 / 747 | 2467 / 417 | 0 / 20 | |
| 3rd | Birmingham | 0 / 47 | 0 / 354 | 0 / 46 | 0 / 145 | 0 / 599 | 1496 / 306 | 0 / 7 | 129 / 624 |
| 4th | Reading | 0 / 103 | 0 / 478 | 0 / 83 | 0 / 161 | 0 / 830 | 975 / 318 | 0 / 14 | 187 / 531 |
| 5th | Milton-Keynes | 0 / 54 | 0 / 355 | 0 / 52 | 0 / 119 | 0 / 664 | 1147 / 330 | 0 / 10 | 158 / 625 |
| 6th | Brighton | 0 / 52 | 0 / 364 | 0 / 28 | 0 / 117 | 0 / 526 | 1028 / 320 | 0 / 9 | 272 / 610 |
| 7th | Leeds | 0 / 39 | 0 / 271 | 0 / 20 | 0 / 104 | 0 / 372 | 735 / 173 | 0 / 8 | 103 / 434 |
| 8th | Bristol | 0 / 30 | 0 / 201 | 0 / 20 | 0 / 55 | 0 / 295 | 459 / 193 | 0 / 6 | 105 / 286 |
| 9th | Glasgow | 0 / 20 | 0 / 145 | 0 / 10 | 0 / 58 | 0 / 197 | 552 / 119 | 0 / 3 | 73 / 301 |
| 10th | Nottingham | 0 / 24 | 0 / 126 | 0 / 23 | 0 / 53 | 0 / 204 | 583 / 146 | 0 / 0 | 74 / 233 |
Summary of data
When we combine all ten sectors and look at the difference between how these would have been quantified using SIC codes versus RTIC codes, we can see that RTIC codes are able to classify 25% more organisations.
Summary of organisations data
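| Sector | All Count | SIC Count | RTIC Count | Difference | Difference (%) |
|---|---|---|---|---|---|
| AI | 2594 | 0 | 2594 | 2594 | 100.0% |
| Advanced Digital | 12896 | 0 | 12896 | 12896 | 100.0% |
| Data | 9421 | 9421 | 0 | 0 | 0.0% |
| IoT | 1556 | 0 | 1556 | 1556 | 100.0% |
| Fintech | 6205 | 0 | 6205 | 6205 | 100.0% |
| Cyber | 21463 | 0 | 21463 | 21463 | 100.0% |
| eCommerce | 61812 | 49519 | 12596 | 12293 | 19.9% |
| MedTech | 453 | 0 | 453 | 453 | 100.0% |
| Digital | 202918 | 193267 | 12896 | 9651 | 4.8% |
| Gaming | 30547 | 7205 | 23954 | 23342 | 76.4% |
| Totals | 349865 | 259412 | 94613 | 90453 | 25.9% |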
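The ten-sector totals tie back to the emerging technology totals reported earlier: removing the Data and Digital rows from the totals leaves the eight emerging sectors. A short consistency check using the published figures (the dictionary layout is illustrative):

```python
# Rows copied from the ten-sector summary table.
ALL_TEN = {"all": 349865, "sic": 259412, "difference": 90453}
DATA = {"all": 9421, "sic": 9421, "difference": 0}
DIGITAL = {"all": 202918, "sic": 193267, "difference": 9651}

# Subtracting the two 'traditional' sectors leaves the eight emerging sectors.
emerging = {key: ALL_TEN[key] - DATA[key] - DIGITAL[key] for key in ALL_TEN}
print(emerging)

# Share of all ten-sector classifications that SIC codes alone would miss.
share = round(100 * ALL_TEN["difference"] / ALL_TEN["all"], 1)
print(share)
```

The result matches the emerging technology totals row (137526, 56724 and 80802), and the share matches the 25.9% shown in the totals row here.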
Thanks for reading
If you have any questions or are interested in a subscription to our data then please contact us.
The Data City
ODI Leeds
3rd Floor, Munroe House
Leeds
LS9 8AG