Local Business Data Analysis using Hadoop
Uploaded by: ruchi-singh
Category: Data & Analytics
![Page 1: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/1.jpg)
LOCAL BUSINESS DATA ANALYSIS
GROUP B
![Page 2: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/2.jpg)
TEAM MEMBERS
HEMAMALINI MADHANGURU
MAHSA TAYER FARAHANI
RUCHI SINGH
YASHASWI ANANTH
![Page 3: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/3.jpg)
TABLE OF CONTENTS
1. Introduction
2. Project Workflow
3. Data Specifications
4. Project Specifications
5. Data
6. Visualization
7. Sentiment Analysis
8. Geospatial Representation
9. Insights
10. Github
11. References
![Page 4: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/4.jpg)
INTRODUCTION
A wide variety of information is available about local businesses
This information helps in understanding the performance of a local business
Insights can be derived from customer reviews of local businesses
The data points to the factors responsible for the popularity of a local business
![Page 5: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/5.jpg)
PROJECT WORKFLOW
![Page 6: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/6.jpg)
DATA SPECIFICATIONS

| DATA SOURCE | FILE SIZE | FILE TYPE | ROWS | COLUMNS |
|---|---|---|---|---|
| https://s3.amazonaws.com/hipicdatasets/yelp_raw_fall_2016.csv | 90MB | CSV | 334,335 | 108 |
| https://docs.google.com/uc?id=0B9kspRX6SWaaMlRvREQ3NmUxOE0&export=download | 85MB | JSON | 117,486 | 10 |

Data Engineering
Removed junk data and duplicate rows
Removed NULL values
Formatted the JSON file and converted it to CSV
Formatted the data in the date-time columns
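The data engineering steps listed on this slide (junk removal, duplicate removal, NULL removal, JSON to CSV conversion, date-time formatting) can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline: it assumes the JSON file holds one object per line, and the field names (`review_id`, `user_id`, `stars`, `date`) and the input date format are hypothetical.

```python
import csv
import io
import json
from datetime import datetime


def clean_records(json_lines, date_field="date"):
    """Yield cleaned, de-duplicated records from JSON-lines input.

    Assumption: one JSON object per line; `date_field` is a hypothetical
    column holding dates as YYYY-MM-DD.
    """
    seen = set()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # junk row: not valid JSON
        if not isinstance(rec, dict):
            continue  # junk row: valid JSON but not an object
        # Drop rows that contain NULL or empty values.
        if any(v is None or v == "" for v in rec.values()):
            continue
        # Normalize the date column to a single date-time format.
        try:
            parsed = datetime.strptime(rec[date_field], "%Y-%m-%d")
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed date
        rec[date_field] = parsed.strftime("%Y-%m-%d %H:%M:%S")
        # Drop exact duplicates.
        key = json.dumps(rec, sort_keys=True)
        if key in seen:
            continue
        seen.add(key)
        yield rec


def to_csv(records, fieldnames):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Made-up sample input, for illustration only.
sample = [
    '{"review_id": "r1", "user_id": "u1", "stars": 5, "date": "2016-10-01"}',
    '{"review_id": "r1", "user_id": "u1", "stars": 5, "date": "2016-10-01"}',
    '{"review_id": "r2", "user_id": null, "stars": 3, "date": "2016-10-02"}',
    "not json at all",
    '{"review_id": "r3", "user_id": "u2", "stars": 4, "date": "2016-10-03"}',
]
FIELDS = ["review_id", "user_id", "stars", "date"]
csv_text = to_csv(clean_records(sample), FIELDS)
print(csv_text)
```

On the sample input, the duplicate of `r1`, the row with a NULL `user_id`, and the non-JSON line are dropped, leaving two data rows under the header.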
![Page 7: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/7.jpg)
PROJECT SPECIFICATIONS
1. Cluster on BigInsights
2. HiveQL and Pig for queries
3. Tableau for visualization
4. Excel 3D Maps for Geospatial representation
5. Azure for backup
![Page 8: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/8.jpg)
DATA
Local Business Table
Reviews Table
![Page 9: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/9.jpg)
VISUALIZATIONS
![Page 10: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/10.jpg)
REVIEW COUNT FOR BUSINESS TYPES
![Page 11: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/11.jpg)
TOP BUSINESS IN THE SIX CATEGORIES
![Page 12: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/12.jpg)
REVIEW COUNT OF POPULAR SUB-CATEGORIES OF BUSINESS
![Page 13: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/13.jpg)
MAXIMUM REVIEWS
Maximum number of reviews made by unique user IDs over 10 years
Further text analysis of the reviews is required to investigate the authenticity of these reviews
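Finding the maximum number of reviews per user ID is a group-and-count over the reviews table. The project ran its queries in Hive and Pig; the Python sketch below only illustrates the same aggregation. The `user_id` and `review_id` field names and the sample rows are assumptions made for the example.

```python
from collections import Counter


def reviews_per_user(reviews):
    """Count how many reviews each user ID has written."""
    return Counter(r["user_id"] for r in reviews)


# Made-up sample rows, for illustration only.
reviews = [
    {"user_id": "u1", "review_id": "r1"},
    {"user_id": "u2", "review_id": "r2"},
    {"user_id": "u1", "review_id": "r3"},
    {"user_id": "u3", "review_id": "r4"},
    {"user_id": "u1", "review_id": "r5"},
]

counts = reviews_per_user(reviews)
# most_common(1) returns the single (user_id, count) pair with the highest count.
top_user, top_count = counts.most_common(1)[0]
print(top_user, top_count)
```

Users at the top of such a ranking are the natural starting point for the authenticity check mentioned above.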
![Page 14: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/14.jpg)
SENTIMENT ANALYSIS
![Page 15: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/15.jpg)
SENTIMENT ANALYSIS OF SERVICES CATEGORY
![Page 16: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/16.jpg)
POPULAR AND UNPOPULAR FOOD BUSINESS ATTRIBUTES
![Page 17: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/17.jpg)
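COMPARISON OF THE KEY ATTRIBUTES

| reservation (top) | reservation (bottom) | ambience (top) | ambience (bottom) | wheelchair (top) | wheelchair (bottom) | has tv (top) | has tv (bottom) | wifi (top) | wifi (bottom) |
|---|---|---|---|---|---|---|---|---|---|
| ✖ | ✖ | ✔ | ✖ | ✔ | ✖ | ✔ | ✖ | free | no |
| ✔ | ✖ | ✖ | ✔ | ✔ | ✖ | ✖ | ✖ | free | no |
| ✔ | ✖ | ✔ | ✖ | ✖ | ✖ | ✖ | ✖ | free | no |
| ✖ | ✖ | ✖ | ✖ | ✔ | ✖ | ✖ | ✖ | no | no |
| ✔ | ✖ | ✔ | ✖ | ✖ | ✖ | ✔ | ✔ | free | no |
| ✔ | ✖ | ✖ | ✔ | ✔ | ✖ | ✔ | ✖ | free | no |
| ✔ | ✔ | ✔ | ✔ | ✔ | ✖ | ✔ | ✖ | free | free |
| ✔ | ✖ | ✔ | ✖ | ✔ | ✔ | ✔ | ✔ | no | free |
| ✔ | ✔ | ✔ | ✔ | ✖ | ✖ | ✔ | ✔ | free | no |
| ✔ | ✖ | ✔ | ✖ | ✔ | ✖ | ✖ | ✖ | free | free |
| 80% | 20% | 70% | 40% | 70% | 10% | 60% | 30% | 80% | 30% |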
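The percentage row of the attribute comparison can be recomputed from the ten rows of marks on this slide. The Python sketch below copies those rows verbatim and counts, per column, how many businesses have the attribute. It assumes that `✔` and `free` both mean "has the attribute" and that `✖` and `no` mean it is absent; the function names are made up for the example.

```python
ATTRIBUTES = ["reservation", "ambience", "wheelchair", "has tv", "wifi"]

# The ten data rows from the slide, in column order:
# each attribute contributes a "top" cell followed by a "bottom" cell.
ROWS = """\
✖ ✖ ✔ ✖ ✔ ✖ ✔ ✖ free no
✔ ✖ ✖ ✔ ✔ ✖ ✖ ✖ free no
✔ ✖ ✔ ✖ ✖ ✖ ✖ ✖ free no
✖ ✖ ✖ ✖ ✔ ✖ ✖ ✖ no no
✔ ✖ ✔ ✖ ✖ ✖ ✔ ✔ free no
✔ ✖ ✖ ✔ ✔ ✖ ✔ ✖ free no
✔ ✔ ✔ ✔ ✔ ✖ ✔ ✖ free free
✔ ✖ ✔ ✖ ✔ ✔ ✔ ✔ no free
✔ ✔ ✔ ✔ ✖ ✖ ✔ ✔ free no
✔ ✖ ✔ ✖ ✔ ✖ ✖ ✖ free free
""".splitlines()


def has_attribute(cell):
    """Assumption: a check mark or 'free' means the attribute is present."""
    return cell in ("✔", "free")


def attribute_shares(rows):
    """Return the percentage of rows that have each attribute, per column."""
    columns = [f"{a} ({g})" for a in ATTRIBUTES for g in ("top", "bottom")]
    counts = dict.fromkeys(columns, 0)
    for row in rows:
        for column, cell in zip(columns, row.split()):
            counts[column] += has_attribute(cell)
    return {c: round(100 * n / len(rows)) for c, n in counts.items()}


shares = attribute_shares(ROWS)
for column, pct in shares.items():
    print(f"{column}: {pct}%")
```

The computed shares match the slide's percentage row: 80/20 for reservation, 70/40 for ambience, 70/10 for wheelchair, 60/30 for has tv, and 80/30 for wifi.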
![Page 18: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/18.jpg)
GEOSPATIAL REPRESENTATION
![Page 19: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/19.jpg)
GITHUB
https://github.com/shamaahsaa/Local_Business_DataAnalysis
![Page 20: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/20.jpg)
INSIGHTS
![Page 21: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/21.jpg)
INSIGHTS
1. Food is the most popular category of Local Business based on review count
2. Las Vegas is the most popular city based on review count for Local Business in every category
3. Reservation, Ambience, and Wifi are some of the main factors responsible for the popularity of food businesses
4. More than 60% of people in a city write positive reviews for Local Business
5. The most reviews written by a single reviewer in a span of 10 years was around 250
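Insights 1 and 2 come down to grouped sums of review counts: by category, and by city within each category. The Python sketch below shows that aggregation. The column names (`category`, `city`, `review_count`) and the sample rows are made up for illustration and are not taken from the project's data.

```python
from collections import defaultdict


def category_totals(businesses):
    """Sum review counts per business category."""
    totals = defaultdict(int)
    for b in businesses:
        totals[b["category"]] += b["review_count"]
    return dict(totals)


def top_city_per_category(businesses):
    """For each category, return the city with the highest total review count."""
    totals = defaultdict(lambda: defaultdict(int))
    for b in businesses:
        totals[b["category"]][b["city"]] += b["review_count"]
    return {cat: max(cities, key=cities.get) for cat, cities in totals.items()}


# Made-up sample rows, for illustration only.
businesses = [
    {"category": "Food", "city": "Las Vegas", "review_count": 120},
    {"category": "Food", "city": "Phoenix", "review_count": 80},
    {"category": "Services", "city": "Las Vegas", "review_count": 40},
    {"category": "Services", "city": "Phoenix", "review_count": 25},
    {"category": "Food", "city": "Las Vegas", "review_count": 60},
]

by_category = category_totals(businesses)
most_popular_category = max(by_category, key=by_category.get)
top_cities = top_city_per_category(businesses)
print(most_popular_category, top_cities)
```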
![Page 22: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/22.jpg)
REFERENCES
1. http://www.tableau.com
2. https://hortonworks.com/tutorials
3. Prof. Woo's Big Data Resource: instructional1.calstatela.edu/jwoo5/classes/2016/fall/cis5200/
![Page 23: Local Business Data Analysis using Hadoop](https://reader033.fdocuments.us/reader033/viewer/2022042515/58ac12b11a28ab33178b5e2d/html5/thumbnails/23.jpg)
THANK YOU
QUESTIONS...