Location Clustering. Peter Kamm, Marcel Flores.

  • date post

    20-Dec-2015

Page 1: Location Clustering

Peter Kamm

Marcel Flores

Page 2: The Data Set

Sessions: contains a collection of connections over the course of a week

Fields: User ID, start time, stop time, Tower ID

25 million lines!
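The record layout above can be modelled as a small record type. This is a minimal sketch, not the authors' code: the comma-separated layout, the field order, and the use of epoch seconds for timestamps are all assumptions, and `Session`, `parse_session`, and `read_sessions` are hypothetical names.

```python
from typing import Iterable, Iterator, NamedTuple


class Session(NamedTuple):
    """One connection record (field names are illustrative)."""
    user_id: int
    start: int     # assumed: seconds since the epoch
    stop: int      # assumed: seconds since the epoch
    tower_id: int

    @property
    def duration(self) -> int:
        """Connection length in seconds."""
        return self.stop - self.start


def parse_session(line: str) -> Session:
    """Parse an assumed 'user,start,stop,tower' comma-separated line."""
    user, start, stop, tower = (int(field) for field in line.strip().split(","))
    return Session(user, start, stop, tower)


def read_sessions(lines: Iterable[str]) -> Iterator[Session]:
    """Yield sessions one at a time, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_session(line)
```

Reading lazily matters at this scale: with roughly 25 million lines, a generator lets later steps count or aggregate without holding every record in memory.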

Page 3: ...a little more

A tower location mapping: Tower ID, Longitude, Latitude, Zip Code

Allows us to map to a real world location

Data set is not complete: there are many towers we do not have a location for
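Because the mapping is incomplete, any lookup has to handle a missing tower explicitly. The sketch below shows one way to do that and to measure how many towers are covered; the dictionary representation and the names `TowerLocation`, `locate`, and `coverage` are assumptions made for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class TowerLocation(NamedTuple):
    longitude: float
    latitude: float
    zip_code: str


def locate(tower_id: int,
           locations: Dict[int, TowerLocation]) -> Optional[TowerLocation]:
    """Return the tower's location, or None when the mapping has no entry."""
    return locations.get(tower_id)


def coverage(tower_ids: Iterable[int],
             locations: Dict[int, TowerLocation]) -> Tuple[int, int]:
    """Return (towers with a known location, towers without one)."""
    unique = set(tower_ids)
    known = sum(1 for tower in unique if tower in locations)
    return known, len(unique) - known
```

Running `coverage` over the tower IDs seen in the sessions file would show how much of the analysis can be tied to a physical place.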

Page 4: Applications

Load balancing on the cell-phone networks themselves

Social networking: integrate online social networks with the real world

Accounts for mobility and usage patterns

Page 5: Analysis

See which locations are active at what times: Where do people congregate? How strongly do they congregate?

Does the location affect usage, as measured by connection duration?

How does this map out into the physical world?

Page 6: Day and Night Hotspots

Now uses a proper quantitative metric

Looks at the ratio of day to night sessions (or night to day, depending on which is larger)

Rejected locations with <100 day or night sessions

Gives us a number >1 to rank strength of location

Daytime is defined as 4am to 4pm

Day has more “very strong” hotspots
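The metric described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: a session is classified by its start time, the 4pm boundary is treated as exclusive, a location is rejected when either its day or its night count is below 100, and all function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

DAY_START_HOUR = 4    # daytime runs from 4am ...
DAY_END_HOUR = 16     # ... up to (assumed: not including) 4pm
MIN_SESSIONS = 100    # assumed: both day and night counts must reach this


def is_daytime(start: datetime) -> bool:
    """Classify a session by the hour in which it starts (an assumption)."""
    return DAY_START_HOUR <= start.hour < DAY_END_HOUR


def hotspot_strength(day_hits: int, night_hits: int):
    """Return ('day' or 'night', ratio >= 1), or None if the tower is rejected."""
    if day_hits < MIN_SESSIONS or night_hits < MIN_SESSIONS:
        return None
    if day_hits >= night_hits:
        return "day", day_hits / night_hits
    return "night", night_hits / day_hits


def rank_towers(sessions):
    """Rank towers by strength; sessions is an iterable of (tower_id, start)."""
    day, night = Counter(), Counter()
    for tower_id, start in sessions:
        (day if is_daytime(start) else night)[tower_id] += 1
    ranked = []
    for tower_id in day.keys() | night.keys():
        result = hotspot_strength(day[tower_id], night[tower_id])
        if result is not None:
            kind, ratio = result
            ranked.append((ratio, kind, tower_id))
    return sorted(ranked, reverse=True)  # strongest hotspots first
```

Taking the larger count over the smaller keeps every score at or above 1, so day and night hotspots can be compared on the same scale.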

Page 7: Day and Night Ranks

Top Day Hotspots

Tower ID   Day Hits   Night Hits   Ratio (day/night)
157        10384      211          49.213
66         13873      339          40.923
1246       3492       146          23.918
136        10255      1386         7.399

Top Night Hotspots

Tower ID   Day Hits   Night Hits   Ratio (night/day)
6445       167        937          5.611
2414       397        2018         5.083
10316      122        593          4.861
8910       116        563          4.853

Page 8: Strength Distribution

Day - 4,479 total

Night - 10,812 total

Page 9: Day and Night Plots

Page 10: Day/Night Durations

Page 11: Day Avg Durations

Page 12: Durations

Day/night hotspots tend to exhibit similar patterns of usage

Longest connections during morning/evening commute

Urban towers get longer connections in mornings, residential neighborhoods get longer connections in evenings
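Observations like these come from averaging connection duration by time of day. The sketch below shows one way to compute that; the input shape (pairs of start hour and duration in seconds) and the names `mean_duration_by_hour` and `peak_hour` are assumptions for illustration.

```python
from collections import defaultdict


def mean_duration_by_hour(sessions):
    """Return {hour: mean duration} from (start_hour, duration_seconds) pairs.

    Only hours with at least one session appear in the result.
    """
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of durations, count]
    for hour, duration in sessions:
        bucket = totals[hour]
        bucket[0] += duration
        bucket[1] += 1
    return {hour: total / count
            for hour, (total, count) in sorted(totals.items())}


def peak_hour(sessions):
    """Return the hour of day with the longest mean connection."""
    means = mean_duration_by_hour(sessions)
    return max(means, key=means.get)
```

Running this separately for the sessions of each tower (or each group of towers) would allow the morning and evening peaks of different location types to be compared.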

Page 13: Physical Locations

Have to be done by hand, so the sample is smaller

Incomplete: we do not have locations for all towers

For the highest ranked locations: sadly, the top 4 shown previously are not in the location data set!

In fact, none of the high-ratio day or night spots appear (until down to a ratio of <2)!

Page 14: Some Locations…

Tower 79 - Night tower, 1.255 ratio

Located in Englewood

Residential South Chicago

Not very strong ratio

Page 15: Tracing a User

Turns out, the data set was (maybe) rich enough to provide information on a per-user level!

Followed the first 5000 users in the data set, ranked them based on activity

Considered the busiest (by hand)

Compared to day/night ratio of each location
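The per-user tracing described above can be sketched as below: find the towers one user connects to most, and report what share of each tower's total traffic that user represents. This is a hedged illustration; the input shape (a list of user and tower pairs) and the name `busiest_towers` are assumptions, and the original inspection was done by hand.

```python
from collections import Counter


def busiest_towers(sessions, user_id, top=2):
    """Return [(tower_id, user_sessions, share_of_tower_traffic), ...].

    sessions must be a list (it is scanned twice) of (user_id, tower_id)
    pairs; results are ordered from the user's busiest tower downward.
    """
    tower_totals = Counter(tower for _, tower in sessions)
    user_counts = Counter(tower for user, tower in sessions if user == user_id)
    return [(tower, hits, hits / tower_totals[tower])
            for tower, hits in user_counts.most_common(top)]
```

A share close to 1.0 signals that a tower's apparent activity is really a single user's activity, which is worth knowing before reading anything into the tower's day/night ratio.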

Page 16: Tracing a User: Results

User 1: Busiest at tower 24 (20,729)

Night tower with a 2.339 ratio

But the user accounts for over 99% of the tower traffic!

2nd busiest at tower 1197 (3,660)

Night tower with a 1.528 ratio

Again accounts for 99% of traffic!

Page 17: Tracing a User: Results

User 5: Busiest at tower 258 (7,449)

Night tower with a 1.711 ratio (75% of traffic!)

No location data

2nd busiest at tower 309 (5,773)

Night tower with a 1.765 ratio (only 60%…)

Residential, Longview, Washington

Page 18: Tracing a User: Results

Had to go to user 113 to get a more reasonable user

Busiest at tower 100 (1,602)

Night tower at 1.207 ratio

Not an unreasonable amount of traffic

Solon, Iowa

Second busiest at 5045 (602)

Night tower at 2.004

Page 19: Single-user Durations

Page 20: Hotspots

Looking at certain user traces, it seemed that certain users used the busiest towers the most

So are the busiest towers really seeing a lot of users, or a few very busy users?

Analyzed the number of unique users that a tower sees in a day

Page 21: Unique User Data

Count how many users a specific tower sees over the duration

Allows us to give an alternate ranking of the tower traffic

Easily ignore points where a single user accounts for the majority of a tower's traffic

Actual data is forthcoming…
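The counting step described above can be sketched as follows. Since the actual data is still forthcoming, this only illustrates the idea: the input shape (user and tower pairs), the 0.5 default for "majority", and the names `unique_users_per_tower` and `dominated_towers` are assumptions.

```python
from collections import Counter, defaultdict


def unique_users_per_tower(sessions):
    """Return {tower_id: number of distinct users} from (user_id, tower_id) pairs."""
    users = defaultdict(set)
    for user_id, tower_id in sessions:
        users[tower_id].add(user_id)
    return {tower: len(ids) for tower, ids in users.items()}


def dominated_towers(sessions, threshold=0.5):
    """Return {tower_id: user_id} where one user exceeds `threshold` of sessions."""
    per_tower = defaultdict(Counter)
    for user_id, tower_id in sessions:
        per_tower[tower_id][user_id] += 1
    dominated = {}
    for tower, counts in per_tower.items():
        user, hits = counts.most_common(1)[0]
        if hits / sum(counts.values()) > threshold:
            dominated[tower] = user
    return dominated
```

Ranking towers by the first function gives the alternate ranking, and dropping the towers returned by the second removes the points where a single user accounts for most of the traffic.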