
A Platform for Exploring Social Media Analytics of Fast Food Restaurants in Australia

Chang Liu and Richard O. Sinnott

University of Melbourne, Melbourne, VIC 3010, Australia

Abstract. Social media is one of the primary communication tools for Internet users. Twitter, one of the most popular social media platforms, has more than one hundred million daily active users. These users post a large number of tweets every day, containing a rich and diverse collection of information. At the same time, obesity is becoming a serious problem all over the world. In this paper, we consider the impact of the geographical location of fast food restaurants on the body mass index (BMI) and levels of obesity of individuals in Melbourne through data analytics around social media.

Keywords: Social media · Twitter · Content analysis · Fast food restaurant

1 Introduction

In 2000, the World Health Organization (WHO) warned that obesity had become one of the most serious health problems of the 21st century. In 2016, the WHO identified that there were about 1.9 billion overweight adults, of whom at least 650 million were medically obese [1]. Obesity can lead to many secondary disorders, such as cardiovascular issues, diabetes and cancer. However, as with many health issues, BMI and obesity are personal matters, and access to such personal BMI information is limited because of individual-level privacy [2]. To tackle such issues, one approach is to aggregate data to spatial areas, e.g. postcode averages.

Human activities are rarely completely isolated and independent, and hence the interconnection of otherwise independent data resources has many advantages, especially in health settings. In the Internet era, people are now dependent on using and exchanging information. From email to blogs to social network services such as Twitter, human activities are increasingly captured, and people are now paying more attention to the subsequent use of data from these services.

As one example, Twitter is an online micro-blogging service and social networking tool. It has more than one hundred million daily active users. In Australia, there are 2.9 million tweeters in a total population of 24.13 million people [3]. These users post a large number of tweets every day. On average, around 6,000 tweets are sent every second, with over 350,000 tweets sent per minute, 500 million tweets sent per day, and over 200 billion tweets sent per year. Twitter offers a low-cost, real-time and extensive resource for understanding a wide range of phenomena, e.g. if a person tweets ‘I need to lose weight’ there is a high probability that this person is overweight. Identifying population-wide patterns in such tweets can reveal societal patterns [4].

© Springer International Publishing AG, part of Springer Nature 2018
O. Gervasi et al. (Eds.): ICCSA 2018, LNCS 10960, pp. 231–244, 2018.
https://doi.org/10.1007/978-3-319-95162-1_16

This paper focuses on such capabilities. It involves large-scale data collection and data pre-processing with keyword-based methods derived from Natural Language Processing approaches, together with supervised classification and long short-term memory models from machine learning. Lydecker and Cotter used ‘fat’ as the keyword [9] and Anwar and Yuan used 11 hashtags as keywords [10] to gather tweets. These keywords are widely used and easily understood by a majority of people, but they may not be enough to cover most tweets which describe ‘fat’ and ‘fast food’. In addition, obesity-related conversations are often not strongly socially and personally rooted [10]. In this paper, we used existing keywords to obtain an initial set of tweets. Word frequency was then used to derive more suitable keywords based on these tweets.

The specific research question explored here is to establish the relationship between the number of overweight people and the number of fast food restaurants in a particular area. Specifically, we focus on collecting data on fast food restaurants in Melbourne, via the Google Maps API and from Twitter. We combine these data sets with other official data sets on health and wellbeing from the Australian Urban Research Infrastructure Network (AURIN – www.aurin.org.au) resource to explore the correlation between people’s health and the distribution of fast food restaurants. Such analysis can be used, for example, to influence policy on the establishment of restaurants.

The remainder of this paper is organized as follows. Section 2 focuses on the background and provides a literature review. Section 3 presents the methods used and the implementation details. Section 4 focuses on the analysis and presentation of results. Finally, Section 5 concludes this paper and discusses areas of future work.

2 Background

2.1 Spatial Aggregation Levels

There are many ways to classify the use of land: postcodes, states, countries. In Australia, Statistical Areas such as Statistical Area Level 2 (SA2) are one way to partition land. An SA2 typically represents a community, such as a named suburb, and aggregates data across a range of social and economic aspects relevant to that SA2. In 2016, there were 2310 SA2 regions in Australia. It is noted that many other spatial aggregation levels exist: local government areas (LGAs), functional economic regions, census districts, and statistical areas (SA4-SA1) amongst numerous others. It is also noted that these spatial levels evolve over time, e.g. as cities grow and change their population profiles.

2.2 Twitter

Twitter is an online micro-blogging service and social networking tool that allows users to post short messages (‘tweets’) and interact with other users. Tweets are messages of up to 140 characters, including spaces and punctuation. Many tweets contain pictures, links to content and emojis. By using Twitter, users can connect with a community of people who share the same interests.


The majority of tweets are broadcast publicly and everyone that uses Twitter can access and read them. Each tweet records the time at which it was posted. The timeline is one primary way people engage with Twitter. Tweets from other accounts that the user follows appear in the timeline chronologically, and the timeline updates in real time. The ‘@’ symbol, known as a mention, is used to reply to tweets. An ‘@’ mention is added at the start of a tweet, followed by the name of the user being replied to. Twitter uses this feature to track the association between tweets as a form of ‘conversation’. The last important feature that Twitter provides is hashtags. In Twitter, hashtags turn words into clickable links that relate to other tweets sharing the same hashtag. This provides a search mechanism for the content included in a set of tweets [5].

The users of Twitter generate a large amount of information every day, and many tweets contain the location information of the users. Such major (global) volumes of data allow for a rich range of subsequent uses for Twitter analytics.

2.3 Deep Learning

Deep learning is a field stemming from origins in artificial neural networks and machine learning, and specifically multi-layer artificial neural networks. A layer of a neural network typically takes a matrix of numbers as input, applies its associated weights followed by a non-linear activation function, and generates another set of values as the output [8].

There are two main models driving forward rapid evolution in the area of deep learning: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). In this paper we focus on RNNs.

An RNN is a form of artificial neural network which can be used to process sequential data from resources as diverse as text, genomes, handwriting and voice. When reading an article, a human will use the understanding of the preceding text to understand the current text, rather than simply reading the current text and discarding any understanding of the older/previous text. Human memory is persistent. However, traditional neural networks cannot infer the classification of the next piece of text based on previously classified text. The RNN solves this problem by including loops which retain the previous information [8].

Long Short-Term Memory (LSTM) is a special form of RNN model that can be used to address the vanishing gradient problem. An LSTM preserves the error that is backpropagated through time and layers. By keeping this error at a more constant level, it allows the recurrent network to continue learning over many time steps, thereby opening opportunities for establishing long-distance causal links. An LSTM stores information in gated units outside the normal flow of the recurrent network. These units can store, write or read information. Each unit determines which information is stored through gates that allow the information to be read, written or cleared. Unlike the digital memory in a computer, these gates are analog: they are implemented as element-wise multiplication by the output of a sigmoid function, which always lies in the range between 0 and 1. Compared to digital storage, analog values have the advantage of being differentiable, and hence they are suitable for backpropagation.


The gates open and close depending on the signals they receive, similar to the nodes of a neural network. They use their own sets of weights to filter the information and decide whether or not to allow it to pass, according to its strength and content. These weights, like the weights that modulate the input and the hidden state, are adjusted by the learning process of the recurrent network. That is, the memory units learn when to allow data to enter, leave or be deleted through the iterative process of making estimates, backpropagating the error and adjusting the weights via gradient descent.
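The gating mechanism described above can be made concrete with a single time step of an LSTM cell. The following is a minimal sketch using the commonly used LSTM formulation with scalar inputs and states; the function names, the weight layout and the example weights are illustrative assumptions and do not come from the model used in this work.

```python
import math


def sigmoid(x):
    """Logistic function; its output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One time step of a scalar LSTM cell (standard textbook formulation).

    weights maps a gate name to a hypothetical (w_x, w_h, bias) triple.
    Returns the new hidden state, the new cell state and the gate values.
    """
    def pre_activation(name):
        w_x, w_h, bias = weights[name]
        return w_x * x + w_h * h_prev + bias

    forget_gate = sigmoid(pre_activation("forget"))   # how much old memory to keep
    input_gate = sigmoid(pre_activation("input"))     # how much new content to write
    output_gate = sigmoid(pre_activation("output"))   # how much memory to expose
    candidate = math.tanh(pre_activation("candidate"))

    # The gates act by multiplication, so every step is differentiable.
    c_new = forget_gate * c_prev + input_gate * candidate
    h_new = output_gate * math.tanh(c_new)
    gates = {"forget": forget_gate, "input": input_gate, "output": output_gate}
    return h_new, c_new, gates


# Illustrative weights only (assumed values, not trained parameters).
example_weights = {name: (0.5, 0.5, 0.0)
                   for name in ("forget", "input", "output", "candidate")}
h, c, gates = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, weights=example_weights)
```

Because the cell state is updated additively and scaled by the forget gate, the error signal can pass through many time steps without being repeatedly squashed, which is the property the text refers to.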

3 Method and Data Processing

3.1 Google Maps API

To obtain the detailed address information of fast food restaurants in Melbourne, a radar search method is used [7]. As the name implies, this method searches for targets that match given keywords around a given centre point.

Melbourne has many fast food restaurants. The area from [−37.5, 144.5] to [−38.5, 145.5] is broken into several 5 km circular areas, as shown in Fig. 1. To ensure that no fast food restaurants are omitted, these circles overlap so that together they cover the whole of the area from [−37.5, 144.5] to [−38.5, 145.5].

This approach means that fast food restaurants located in overlapping areas may be recorded more than once. To remove duplicate fast food restaurants, their geo-information is used directly. In this work, a known set of fast food chains was explored and their associated keywords used to collect the branch location information. Specifically, the fast food chains and keywords included ‘KFC’, ‘McDonalds’, ‘Maccas’, ‘Subway’, ‘Domino’, ‘Pizza Hut’, ‘Nandos’ and ‘fastfood’. The collected results are shown in Table 1 below.

After removing the duplicate information and the restaurants that were out of range, 762 fast food restaurants and their geo-information were obtained from Twitter.
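One way to lay out overlapping search circles over such a bounding box is to place circle centres on a regular grid whose cells fit inside a circle. The sketch below assumes that 5 km is the search radius and uses a simple flat-Earth approximation for converting kilometres to degrees; the function name and the constant are illustrative, and the layout actually used for Fig. 1 may differ.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def circle_centres(lat_north, lat_south, lon_west, lon_east, radius_km=5.0):
    """Return (lat, lon) centres of overlapping circles covering a bounding box.

    Each grid cell is at most radius * sqrt(2) wide and high, so its
    half-diagonal is at most the radius and a circle placed at the cell
    centre approximately covers the whole cell.
    """
    # Use the latitude closest to the equator, where a degree of longitude is
    # longest, so that the cell width is never underestimated.
    widest_lat = min(abs(lat_north), abs(lat_south))
    km_per_degree_lon = KM_PER_DEGREE_LAT * math.cos(math.radians(widest_lat))

    max_cell_km = radius_km * math.sqrt(2)
    n_rows = math.ceil((lat_north - lat_south) * KM_PER_DEGREE_LAT / max_cell_km)
    n_cols = math.ceil((lon_east - lon_west) * km_per_degree_lon / max_cell_km)
    cell_lat = (lat_north - lat_south) / n_rows
    cell_lon = (lon_east - lon_west) / n_cols

    centres = []
    for row in range(n_rows):
        for col in range(n_cols):
            centres.append((lat_south + (row + 0.5) * cell_lat,
                            lon_west + (col + 0.5) * cell_lon))
    return centres


# Bounding box from the text: [-37.5, 144.5] to [-38.5, 145.5].
centres = circle_centres(-37.5, -38.5, 144.5, 145.5)
```

Each centre would then be passed, together with the radius and a keyword, to whichever place-search service is used; that call is deliberately left out here.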
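The de-duplication and range filtering can be sketched as follows. The record layout (dictionaries with `name`, `lat` and `lng` keys), the rounding precision and the function name are assumptions made for illustration; the text only states that the geo-information is used directly to identify duplicates.

```python
# Bounding box from the text as (lat_min, lat_max, lng_min, lng_max).
MELBOURNE_BBOX = (-38.5, -37.5, 144.5, 145.5)


def dedupe_places(places, bbox=MELBOURNE_BBOX, ndigits=5):
    """Drop records outside the bounding box and records with repeated coordinates.

    places is an iterable of dicts with hypothetical keys 'name', 'lat', 'lng'.
    Coordinates are rounded to ndigits decimals before comparison.
    """
    lat_min, lat_max, lng_min, lng_max = bbox
    seen = set()
    unique = []
    for place in places:
        in_range = (lat_min <= place["lat"] <= lat_max
                    and lng_min <= place["lng"] <= lng_max)
        if not in_range:
            continue
        key = (round(place["lat"], ndigits), round(place["lng"], ndigits))
        if key in seen:
            continue  # same location already returned by an overlapping circle
        seen.add(key)
        unique.append(place)
    return unique


# Made-up example records, not data from this work.
sample = [
    {"name": "KFC A", "lat": -37.8100, "lng": 144.9600},
    {"name": "KFC A", "lat": -37.8100, "lng": 144.9600},   # duplicate
    {"name": "Subway B", "lat": -37.9000, "lng": 145.1000},
    {"name": "Nandos C", "lat": -36.0000, "lng": 145.0000},  # out of range
]
unique_places = dedupe_places(sample)
```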

Fig. 1. Partition method of the Google Maps radar search


3.2 Data Pre-processing

In order to perform natural language processing (NLP), tweets need to be pre-processed. The first step of NLP is tokenization, i.e. splitting the text into terms. Tweet text is different from normal prose since it may contain hashtags, mentions, emojis and URLs. In order to tokenize a tweet, a tokenizer is built using regular expressions. In this work, eight regular expressions were used to tokenize the tweet text. The second step is tweet normalization. Initially, all of the characters in each token are converted to lowercase. Then, lemmatization is used to handle word tense and other inflections. The third step is removing meaningless tokens from the tweet token list: punctuation, stopwords, retweet markers (prefixed with RT) and ‘via’ (used to mention the original author of a tweet or a retweet) were removed.
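The three pre-processing steps can be sketched with the standard library alone. The regular expressions and the stopword set below are illustrative assumptions rather than the eight expressions used in this work, and lemmatization is omitted because it would need an external NLP library.

```python
import re
import string

# Illustrative patterns; earlier alternatives take priority at each position.
TOKEN_PATTERNS = [
    r"https?://\S+",                     # URLs
    r"@\w+",                             # @-mentions
    r"#\w+",                             # hashtags
    r"\d+(?:[.,]\d+)*",                  # numbers
    r"[a-zA-Z]+(?:['\-][a-zA-Z]+)*",     # words, allowing inner ' and -
    r"\S",                               # any other non-space character
]
TOKEN_RE = re.compile("|".join("(?:" + p + ")" for p in TOKEN_PATTERNS))

# A small hypothetical stopword set, including the retweet markers.
STOPWORDS = {"the", "a", "an", "i", "to", "is", "at", "and", "of", "rt", "via"}


def preprocess(tweet):
    """Tokenize, lowercase and filter a tweet; returns a list of tokens."""
    tokens = []
    for token in TOKEN_RE.findall(tweet):
        token = token.lower()
        if token in STOPWORDS:
            continue
        if all(ch in string.punctuation for ch in token):
            continue  # token consists of punctuation only
        tokens.append(token)
    return tokens


tokens = preprocess("RT @bob: I need to lose weight at #Maccas! http://t.co/x")
```

Mentions, hashtags and URLs survive as single tokens because their patterns are tried before the generic word pattern.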

3.3 Word Count

The word count is the simplest analysis used to find fast food restaurant-related words. Through this method, the words which are most commonly used in the data set can be found. A bag-of-words is used to get the word counts. A bag-of-words model can represent a sentence or a document as a feature vector. The basic idea is to treat text as a collection of words. By searching for tweets which contain at least one of the aforementioned keywords, 6229 tweets were identified. By identifying bigrams and combining the tf-idf value of all words and phrases with the high-frequency words and phrases, 38 normal words and 9 search words were chosen as the basis for fast food restaurant related words:

['#maccas', '#McDonald', '#kfc', '#dominos', '#subway', '#nandos', '#pizza hut', 'fastfood', 'fast food', 'carry-out', 'eat in', 'drive thru', 'franchise', 'menu', 'combo', 'nutrition', 'beverage', 'soft drink', 'fountain drink', 'slushie', 'smoothie', 'coffee', 'sub', 'bun', 'muffin', 'scone', 'biscuit', 'sides', 'condiments', 'dressing', 'fries', 'fried chicken', 'hash browns', 'onion rings', 'burger', 'chicken', 'sausage', 'hotdog', 'bacon', 'beef']
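The word-count step can be illustrated with a small bag-of-words sketch that counts unigrams and bigrams and derives a tf-idf style score. The exact tf-idf weighting used in this work is not stated; the corpus-level variant below (total term count multiplied by the log inverse document frequency) and the function names are assumptions.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs joined by a space, e.g. 'fried chicken'."""
    return [tokens[i] + " " + tokens[i + 1] for i in range(len(tokens) - 1)]


def term_statistics(documents):
    """Compute term counts and tf-idf scores over a list of token lists.

    Returns (term_counts, tfidf), where term_counts is a Counter over all
    unigrams and bigrams, and tfidf maps each term to
    count * log(number of documents / number of documents containing it).
    """
    term_counts = Counter()
    document_counts = Counter()
    for tokens in documents:
        terms = tokens + bigrams(tokens)
        term_counts.update(terms)
        document_counts.update(set(terms))
    n_docs = len(documents)
    tfidf = {term: count * math.log(n_docs / document_counts[term])
             for term, count in term_counts.items()}
    return term_counts, tfidf


# Made-up token lists standing in for pre-processed tweets.
docs = [["fried", "chicken", "combo"], ["fried", "chicken"], ["coffee"]]
counts, scores = term_statistics(docs)
top_terms = counts.most_common(3)
```

Candidate keywords would then be picked by inspecting the highest-ranked terms under both measures, which remains a manual judgement.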

By using the keywords ‘fat’, ‘overweight’ and ‘obese’ to find all tweets related to being overweight in the data set, 2574 tweets which contained at least one of the three words were found. Combining the tf-idf value of all words and phrases with the high-frequency words and phrases, 68 normal words and phrases were chosen as the overweight related words:

[' fat ', 'overweight', 'obese', 'obesity', 'lose weight', 'low fat', 'slim', 'reduce', 'keep fit', 'fatty', 'adipose', 'fatty tissue', 'chubby', 'plump', 'podgy', 'tubby', 'blubber', 'sebaceous', 'corpulent', 'pudgy', 'greasy', 'avoirdupois', 'portly', 'rotund', 'zaftig', 'fatten', 'blubbery', 'fleshy', 'oily', 'potbellied', 'dumpy', 'juicy', 'paunchy', 'porcine', 'buxom', 'buttery', 'thick', 'jowly', 'thickset', 'rich', 'embonpoint', 'oleaginous', 'stocky', 'gross', 'profitable', 'weighty', 'fatten up', 'heavyset', 'stout', 'fill out', 'heavy', 'fertile', 'compact', 'productive', 'rounded', 'fruitful', 'abdominous', 'double-chinned', 'endomorphic', 'fatten out', 'fattish', 'flesh out', 'loose-jowled', 'plump out', 'pyknic', 'suety', 'superfatted', 'zoftig']

Table 1. Different Melbourne fast food restaurants identified via Twitter

Keyword             Number    Keyword      Number
KFC                 154       Pizza Hut    77
McDonalds (maccas)  208       Nandos       97
Subway              108       Fastfood     526
Domino              245

There were only 83 tweets which contained at least one of the fast food restaurant keywords and at least one overweight keyword. Since this was insufficient for training, the 47 fast food restaurant-related words and phrases were also used as keywords.

3.4 Long-Short Term Memory Network (LSTM)

A unique feature of NLP data is that every word in a sentence depends on the previous words and influences the subsequent words. Because of this dependency, an RNN is used to analyze this kind of sequential data. The LSTM is a special RNN that preserves long-term dependency information in text. To get the training set, the tweet data set needs to be pre-processed. In this work, only three fields of each tweet were used: ‘created_at’, ‘text’ and ‘user_id’. By using the 47 fast food restaurant related words and 68 overweight related words identified above to search these tweets, the specific obesity related tweets were obtained. After data pre-processing and searching, 13,420 tweets were identified which contained obesity related words, and 922 tweets were found which contained both fast food restaurant related words and obesity related words.
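The keyword search over tweet text can be sketched with whole-word regular expression matching. The short keyword lists below are excerpts from the two lists given earlier; the matching rule (case-insensitive, no word character on either side of the keyword) and the function names are assumptions, since the exact matching procedure is not described.

```python
import re

# Excerpts of the two keyword lists; the full lists appear in Sect. 3.3.
FAST_FOOD_KEYWORDS = ["#kfc", "#maccas", "fast food", "fried chicken", "burger"]
OVERWEIGHT_KEYWORDS = [" fat ", "overweight", "obese", "lose weight"]


def keyword_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole term."""
    escaped = sorted((re.escape(k.strip()) for k in keywords), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


FAST_FOOD_RE = keyword_pattern(FAST_FOOD_KEYWORDS)
OVERWEIGHT_RE = keyword_pattern(OVERWEIGHT_KEYWORDS)


def select_tweets(tweets):
    """Split tweet texts into obesity-related ones and those also about fast food."""
    obesity_related = [t for t in tweets if OVERWEIGHT_RE.search(t)]
    both = [t for t in obesity_related if FAST_FOOD_RE.search(t)]
    return obesity_related, both


# Made-up example tweets.
example_tweets = [
    "I need to lose weight, no more #KFC",
    "That burger was great",
    "Feeling fat today",
    "fatigue after work",
]
obesity_related, both = select_tweets(example_tweets)
```

Matching on whole terms avoids counting words such as ‘fatigue’ as a hit for ‘fat’, which is presumably why the original keyword is written with surrounding spaces.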

The input of the LSTM should be a word vector. Before building the LSTM model, the training data should be converted from text to word vectors. Key to this is ensuring that the vocabulary of the model includes a sufficient number of different words, because this vocabulary is used to build the one-to-one correspondence between words and values. Building the vocabulary is an important step in changing a sentence into a word vector [6]. A low-frequency word which may appear just once in the text may not need to be added to the vocabulary. Words which are not in the vocabulary are replaced by the pseudo word ‘UNK’. After processing all the text in the data set, 24,490 different words existed in this data set, with 6,748 words appearing more than twice. As a result, the size of the vocabulary was set to 6,750, comprising the first 6,748 words of the training set sorted by word frequency from large to small, together with the pseudo word UNK.

The length of each sentence should also be fixed, since in the RNN model the word vectors are analyzed as a matrix. The longest sentence contains 32 words. Since a tweet can contain at most 140 characters, the sentence length is set to 30. If a sentence has fewer than 30 terms, the filling value 0 is added to the short sentence until it reaches a length of 30.
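Building such a frequency-sorted vocabulary with a cut-off for rare words can be sketched as follows. The index assignment (0 reserved for the filling value, 1 for UNK, real words from 2 upwards) and the function name are assumptions chosen for illustration.

```python
from collections import Counter

PAD_ID = 0  # reserved for the filling value (assumed index)
UNK_ID = 1  # reserved for the pseudo word UNK (assumed index)


def build_vocabulary(token_lists, min_count=3):
    """Map each sufficiently frequent word to an integer index.

    Words are ordered from most to least frequent; words seen fewer than
    min_count times are left out and will later be encoded as UNK.
    min_count=3 corresponds to 'appearing more than twice'.
    """
    counts = Counter(token for tokens in token_lists for token in tokens)
    kept = [word for word, count in counts.most_common() if count >= min_count]
    return {word: index + 2 for index, word in enumerate(kept)}


# Made-up token lists standing in for pre-processed tweets.
corpus = [["burger", "fries"], ["burger", "fries", "coke"],
          ["burger", "fries"], ["burger"]]
vocabulary = build_vocabulary(corpus)
```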


A lookup table based on the vocabulary is used to convert words into numeric values. All of the text in the data set is converted into word vectors by using the lookup table. After the conversion, if a word vector is longer than 30 terms, the part which is out of range is cut off. If a word vector is shorter than 30, the filling value 0 is added until the length reaches 30. After these processes, the text has been changed into a fixed-length word vector matrix.

Following this, the word vector data set is split, with 80% of the data used as the training set and the rest used as the test set. These data sets are used to train and evaluate the LSTM. The LSTM model from Keras is used with the loss function ‘binary_crossentropy’ and the optimization method ‘adam’. Using the model on the test set, the results are as follows (Fig. 2):
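The lookup, truncation and padding steps amount to a few lines of code. This sketch assumes that index 0 is the filling value and index 1 stands for UNK; the tiny vocabulary and the function name are hypothetical.

```python
def encode(tokens, vocabulary, max_len=30, pad_id=0, unk_id=1):
    """Convert tokens to a fixed-length list of integer indices.

    Unknown words map to unk_id, sequences longer than max_len are cut off,
    and shorter ones are filled with pad_id at the end.
    """
    ids = [vocabulary.get(token, unk_id) for token in tokens][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Hypothetical two-word vocabulary for illustration.
example_vocabulary = {"fried": 2, "chicken": 3}
short_vector = encode(["fried", "chicken", "zinger"], example_vocabulary, max_len=5)
long_vector = encode(["fried"] * 40, example_vocabulary)
```

Stacking the encoded tweets row by row yields the fixed-length matrix that is fed to the network.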
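The 80/20 split and the binary cross-entropy loss named above can be illustrated without depending on Keras. This is a sketch under stated assumptions: the shuffling, the seed and the clipping constant are illustrative choices, and the functions are stand-ins written for this example rather than the Keras implementations.

```python
import math
import random


def split_data(samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into a training and a test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(len(samples) * train_fraction)
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for label, prob in zip(y_true, y_pred):
        prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
        total -= label * math.log(prob) + (1 - label) * math.log(1.0 - prob)
    return total / len(y_true)


train_set, test_set = split_data(list(range(100)))
loss_uninformative = binary_crossentropy([1, 0], [0.5, 0.5])
loss_confident = binary_crossentropy([1, 0], [0.9, 0.1])
```

A classifier that always predicts 0.5 has a loss of ln 2, and the loss decreases as the predicted probabilities move towards the correct labels.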

After training, the LSTM model had an accuracy of 96.1%. Applying the trained LSTM model to the data set of 2.7 million tweets (from all over Australia) gave 37,889 predicted tweets. We then applied this model to the data set of 153,984 tweets obtained from Melbourne to obtain 18,775 predicted tweets.

4 Data Analysis

There were three main analyses undertaken. The first one used the SA2 as the partition criterion to analyze the tweet data. The second analysis focused specifically on the Melbourne CBD (SA2). The last one focused on the McDonald’s fast food restaurants using both the tweet data sets and the SA2 partition criteria.

4.1 Visualization

Visualization of all data gives a general impression of the work as a whole. The visualization of all the collected fast food restaurants is shown in Fig. 3.

As seen, fast food restaurants are distributed evenly across Melbourne with the exception of the Melbourne CBD. There are 53 fast food restaurants located in the Melbourne LGA, and 32 fast food restaurants located in the Melbourne CBD (SA2). The fast food restaurants are clearly more concentrated in the Melbourne CBD. The visualization of all tweets predicted by the LSTM model is shown in Fig. 4.

Fig. 2. Part of the test result


4.2 SA2

SA2 areas are relatively small, so all SA2 areas located in the area [−37.5, 144.5] to [−38.5, 145.5] can be considered as a single area. There are 213 SA2 areas in this region. After processing the 762 fast food restaurants and the 37,889 and 18,775 predicted tweets, there were 160 areas that contained both collected tweets and collected fast food restaurants. The obesity related data from AURIN also uses the SA2 partition criteria. The resulting data are as follows (Table 2):

Fig. 3. Visualization of the collected fast food restaurants

Fig. 4. Visualization of all tweets

Table 2. Data separated into SA2 areas

Sa2_name      Overweight  Obesity  Fastfood  Tweet_old  Tweet_new
Melbourne     4217.22     1478.83  32        893        10027
Epping        6070.65     4651.30  11        20         4
Narre Warren  5499.92     4484.96  11        7          21
Dandenong     5802.12     5034.52  10        27         20
Preston       6959.97     5138.12  10        22         52
……
Ivanhoe       2660.90     1318.37  1         5          4
Newport       3885.22     2476.24  1         9          16


Here tweet_old is based on a large collection of historic tweets from the Melbourne eResearch Group, and tweet_new is based on the tweets collected between April–July 2017. The Melbourne (SA2) area can be regarded as an outlier due to the amount of data and activity more generally. With the data from the Melbourne CBD removed, the scatter plot matrix for the SA2s is shown in Fig. 5.
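Per-area counts such as those in Table 2 require each geotagged restaurant or tweet to be assigned to the area that contains it. The sketch below uses the standard ray-casting test for a point in a polygon; the two square ‘areas’ are made-up placeholders rather than real SA2 boundaries, and the tooling actually used for this step is not described in the text.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; here x is longitude and y latitude.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_by_area(points, areas):
    """Count how many points fall inside each named polygon."""
    counts = {name: 0 for name in areas}
    for x, y in points:
        for name, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break
    return counts


# Placeholder squares standing in for SA2 boundaries (hypothetical).
areas = {
    "area_a": [(144.5, -38.0), (145.0, -38.0), (145.0, -37.5), (144.5, -37.5)],
    "area_b": [(145.0, -38.0), (145.5, -38.0), (145.5, -37.5), (145.0, -37.5)],
}
points = [(144.7, -37.8), (145.2, -37.9), (144.9, -37.6), (146.0, -37.8)]
counts = count_by_area(points, areas)
```

Running the same aggregation once for restaurants and once for each tweet collection would give the Fastfood, Tweet_old and Tweet_new columns.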

After processing, the linear regression of the parameters is shown in Fig. 6:

Fig. 5. Scatter plot matrix of data in SA2 level (without Melbourne (SA2))

Fig. 6. Linear regression of fastfood and tweet_old (without Melbourne)


From Fig. 6, we can see that the slope of the linear regression between the number of fast food restaurants and the number of historic tweets (2011–2014) is positive.

From Fig. 7, we can see that the slope of the linear regression between the number of fast food restaurants and the number of fast food related tweets (April–July 2017) is also positive.

From Fig. 8, we can see that the slopes of the linear regressions between the number of fast food restaurants and the number of individuals that are overweight and/or obese according to AURIN are also both positive. We use the Pearson Correlation Coefficient to calculate the similarity, as shown below (Table 3).

As seen, the correlations of the four parameter pairs are all positive. It is noted that the data of the Melbourne (SA2) has a huge influence on the result, and it is not appropriate to simply remove this data. The following section considers the Melbourne CBD area explicitly.
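The sign of each regression line comes from an ordinary least-squares fit. As a sketch, the fit below is applied to the six non-Melbourne rows that are visible in Table 2; these are only a small excerpt of the 160 areas, so the resulting slope is illustrative and is not the slope plotted in Fig. 6. The function name is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Visible rows of Table 2 without Melbourne: Epping, Narre Warren,
# Dandenong, Preston, Ivanhoe, Newport.
fastfood = [11, 11, 10, 10, 1, 1]
tweet_old = [20, 7, 27, 22, 5, 9]
slope, intercept = fit_line(fastfood, tweet_old)
```

A positive slope means that, within the fitted sample, areas with more fast food restaurants tend to have more matching tweets.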

Fig. 7. Linear regression of fastfood and tweet_new (without Melbourne)

Fig. 8. Linear regression of fastfood and overweight (without Melbourne)


4.3 Melbourne CBD (SA2)

There are 32 fast food restaurants and 3,813 tweets (2011–2014) identified in the Melbourne CBD. This area was divided into four segments as shown in Fig. 9.

Since AURIN does not have overweight data for these partitions, only the number of historic tweets (2011–2014) was taken into consideration. After processing, the data was as follows (Table 4):

We consider a linear regression on these data. From Fig. 10, we can see that the slope of the linear regression between the number of fast food restaurants and the number of tweets (2011–2014) is positive. The Pearson Correlation Coefficient is 0.83554998008. This represents a strong positive correlation between the number of fast food restaurants and the number of tweets.
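The Pearson Correlation Coefficient for the four CBD partitions can be recomputed directly from the values in Table 4. The sketch below is a plain implementation of the standard formula; the function and variable names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Values from Table 4 (Melbourne CBD split into four parts).
num_ff = [16, 7, 5, 4]
num_tweet = [387, 327, 98, 72]
r = pearson(num_ff, num_tweet)  # approximately 0.8355
```

With only four data points the coefficient is sensitive to any single partition, so it is best read as an indication rather than a precise estimate.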

Table 3. Pearson Correlation Coefficient in SA2 level

Parameter pair              All data        All data (without Melbourne)
num_ff and num_tweet_old    0.65388565626   0.092830718422
num_ff and num_tweet_new    0.67563521684   0.061814196975
num_ff and overweight       0.22830948095   0.28745296698
num_ff and obesity          0.21536339828   0.34973838419

Fig. 9. Partitioning the Melbourne SA2

Table 4. Melbourne CBD data separated into four parts

num_ff  num_tweet
16      387
7       327
5       98
4       72


4.4 McDonalds/Maccas Analysis

In this part, only the McDonald’s restaurants and the tweets containing the keyword ‘maccas’ or ‘mcdonalds’ are analyzed. The McDonald’s restaurants and the tweets for all of Melbourne are shown in Fig. 11.

The visualization of the McDonald’s restaurants and the tweets in the Melbourne SA2 is shown in Fig. 12.

From Fig. 12, we can see that about 80% of the tweets about McDonald’s are sent near a McDonald’s restaurant, and about 20% are sent some distance from one. From the tweet points around the McDonald’s restaurants, we can see that people are likely to tweet about fast food while waiting for and/or having their meals.
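The share of tweets sent near a restaurant can be quantified by measuring the great-circle distance from each tweet to its nearest restaurant. In this sketch the 0.5 km threshold for ‘near’, the coordinates and the function names are all assumptions; the text does not state how proximity was judged.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def share_near(tweets, restaurants, threshold_km=0.5):
    """Fraction of tweets whose nearest restaurant is within threshold_km."""
    near = 0
    for t_lat, t_lon in tweets:
        nearest = min(haversine_km(t_lat, t_lon, r_lat, r_lon)
                      for r_lat, r_lon in restaurants)
        if nearest <= threshold_km:
            near += 1
    return near / len(tweets)


# Made-up coordinates for illustration only.
restaurants = [(-37.8136, 144.9631)]
tweets = [(-37.8136, 144.9631), (-37.8140, 144.9635), (-37.9000, 145.2000)]
share = share_near(tweets, restaurants)
```

The brute-force nearest-neighbour search is adequate for a few hundred restaurants; a spatial index would be preferable for much larger data sets.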

Fig. 10. Linear regression of restaurants and tweets in four partitions of Melbourne SA2

Fig. 11. Visualization of the collected McDonald’s restaurants and tweets


5 Conclusions and Future Work

The research question explored in this paper is to establish the relationship between overweight people and the number of fast food restaurants in a particular area. The work focused on collecting data on fast food restaurants via the Google Maps API and the Twitter API. By combining these data with other official data on health and wellbeing from the AURIN project, a strong correlation was observed, i.e. overweight people are more likely to live in an area with an increased number of fast food restaurants. This seemingly obvious result is important nonetheless, since social media data is ubiquitous and can be used to provide broader information that is more real-time in nature and that can/should inform policy on the establishment of fast food outlets and/or improve understanding of population health issues.

There are several parts of this work that could be improved in the future. The performance of the LSTM model can be improved. The parameters of the LSTM can be optimized, and a method that improves on the keyword-based method might also be used to label the input training data. Other machine learning models may also be used to analyze the data, with cross-validation and root mean square error approaches used to find the best model. Furthermore, this work assumes (naively) that the geographic information of a tweet is where the potentially overweight/obese person may live; however, they may simply be tourists/visitors and hence no inference should be made. Future work could factor in logic to determine whether a tweeter is actually a resident or non-resident based on the number of tweets sent over a given time period.

References

1. Hacker, J., Wickramasinghe, N., Durst, C.: Can health 2.0 address critical healthcare challenges? Insights from the case of how online social networks can assist in combatting the obesity epidemic. Australas. J. Inf. Syst. 21, 1–17 (2017)

2. Jensen, P., Jensen, L., Brunak, S.: Mining electronic health records: towards better research applications and clinical care. Nat. Rev. Genet. 13(6), 395–405 (2012)

Fig. 12. Visualization of the collected McDonald’s and tweets in the Melbourne SA2


3. Sinnott, R.O., Cui, S.: Benchmarking sentiment analysis approaches on the cloud. In: 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS) (2016)

4. Hamed, A., Wu, X., Erickson, R., Fandy, T.: Twitter K-H networks in action: advancing biomedical literature for drug search. J. Biomed. Inf. 56, 157–168 (2015)

5. Chae, B.: Insights from hashtag #supplychain and Twitter analytics: considering Twitter and Twitter data for supply chain practice and research. Int. J. Prod. Econ. 165, 247–259 (2015)

6. Corso, A.J., Alsudais, K.: GIS, Big Data, and a tweet corpus operationalized via natural language processing. In: 21st Americas Conference on Information Systems (2015)

7. Kobayashi, S., Fujioka, T., Tanaka, Y., Inoue, M., Niho, Y., Miyoshi, A.: A geographical information system using the Google Map API for guidance to referral hospitals. J. Med. Syst. 34(6), 1157–1160 (2009)

8. Putra, Y.A., Khodra, M.L.: A deep learning approach towards cross-lingual tweet tagging. In: 2016 International Conference on Data and Software Engineering (2016)

9. Lydecker, J.A., Cotter, E.W., Palmberg, A.A., Simpson, C., Kwitowski, M., White, K., Mazzeo, S.E.: Does this Tweet make me look fat? A content analysis of weight stigma on Twitter. Eat. Weight Disord. 21(2), 229–235 (2016)

10. Anwar, M., Yuan, Z.: Linking obesity and tweets. In: Zheng, X., Zeng, D.D., Chen, H., Leischow, S.J. (eds.) ICSH 2015. LNCS, vol. 9545, pp. 254–266. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-29175-8_24

