ACIS 1504 - Introduction to Data Analytics & Business Intelligence Text Mining Data Cleaning.
-
Upload
carmella-quinn -
Category
Documents
-
view
218 -
download
0
Transcript of ACIS 1504 - Introduction to Data Analytics & Business Intelligence Text Mining Data Cleaning.
ACIS 1504 - Introduction to Data Analytics & Business Intelligence
Text MiningData Cleaning
Concept MapText Mining
Implementation
Mixed Cell References
Design: Accuracy
Random
Search, Left, Right, Mid,
Len, &
Paste Values
Objectives
• Define Text Mining
• Demonstrate Excel features that support text mining.
Segment A:Text Mining
Text Analytics / Text Mining
• Software that searches vast amounts of textual data (unstructured) identifying patterns.
Nestle• Nestle processes Social Media
http://uk.reuters.com/article/video/idUKBRE89P07Q20121026?videoId=238680321
Segment B:Text Functions
Text Mining
• Search
• Parse
• Concatenate
• SEARCH
• LEFT, MID, RIGHT, LEN
• &
Name Example
Open Grades Textfile.xlsx.
Divide Last Name, First Name into two separate columns.
1. Locate the comma (SEARCH)2. Extract all characters to left of comma (LEFT)3. Locate end of full name (LEN)4. Extract almost all characters between comma
and end of name (RIGHT)
SEARCH Function
LEFT Function
LEN or Length Function
RIGHT Function
MID FunctionExtract the first initial of first name.
Concatenate• Combine First Name, space and Last
Name.
• & is the concatenate symbol
• Quotes are required around constant strings of text
Student ID Example
Extract each student’s PID from their email address.
Create a new student identifier by combining the first three letters of the last name with the last four digits of the student ID number.
Segment C:Data Cleaning & Generation
Data Cleaning• Delete Unnecessary Columns & Rows• Resize Columns• Format Numeric Values• Separate Distinct Values • Shorten Lengthy Values• Data Validation for Future Entries• Generate Values
Favorite Pie Example
Favorite Pie Example
1. Ensure pie flavor data is consistent.
2. Replace confidential clicker ID # with randomly generated 6 digit number.
3. Ensure new ID number is static and unique.
Favorite Pie Example
Original Sorted Consistent
Random Number Functions
• =RAND()
• =RANDBETWEEN(low#, high#)
Paste Special - Values
MAC: Edit Menu, Paste Special
Exam Feedback Example
Open Exam Feedback.xlsx