IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is...

15
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 1 Online Cruising Mile Reduction in Large-Scale Taxicab Networks Desheng Zhang, Student Member, IEEE, Tian He, Senior Member, IEEE, Shan Lin, Member, IEEE, Sirajum Munir, Student Member, IEEE, and John A. Stankovic, Fellow, IEEE Abstract—In the taxicab industry, a long-standing challenge is how to reduce taxicabs’ miles spent without fares, i.e., cruising miles. The current solutions for this challenge usually depend on passengers to actively provide their locations in advance for pickups. To address this challenge without the burden on passengers, in this paper, we propose a cruising system, pCruise, for taxicab drivers to find efficient routes to pick up passengers to reduce cruising miles. According to the real-time pick-up events from nearby taxicabs, pCruise characterizes a cruising process with a cruising graph, and assigns weights on edges of the cruising graph to indicate the utility of cruising corresponding road segments. Our weighting process considers the number of nearby passengers and taxicabs together in real-time, aiming at two scenarios where taxicabs are explicitly or implicitly coordinated with each other. Based on a weighted cruising graph, when a taxicab becomes vacant, pCruise provides a distributed online scheduling strategy to obtain and update an efficient cruising route with the minimum length and at least one arriving passenger. We evaluate pCruise based on a real-world GPS dataset from a Chinese city Shenzhen with 14, 000 taxicabs. The evaluation results show that pCruise assists taxicab drivers to reduce cruising miles by 42% on average. Index Terms—Taxicab Network, Dispatching, Cruising Mile Reduction. 1 I NTRODUCTION N OWADAYS, among all transportation modes, taxicabs play a particularly prominent role in residents’ daily commutes in many metropolitan ar- eas [1]. Based on a recent survey in New York C- ity [2], over 100 taxicab companies operate more than 13, 000 taxicabs, with a stable ridership of 660, 000 per day, and transport more than 25% of all pas- sengers, accounting for 45% of all paid transit fares. Unfortunately, to fulfill such delivery requests, these taxicabs travel a total of 800 million miles per year [1]. Nearly 40% of this mileage (i.e., 314 million miles per year, consuming 14 million gallons of gas) is spent to cruise for passengers, therefore leading to severely harmful tailpipe emissions and energy consumption. On the other hand, the top comment from passengers about taxicabs is “I cannot get one when needed” [2]. Additionally, in carbon emission trading under the Kyoto Protocol [3], governments will provide econom- ic incentives for achieving reductions in the carbon emissions. Therefore, to achieve less carbon emissions, better satisfactory services and more economic incen- tives, it is imperative to find a practical initiative to reduce cruising miles for taxicabs to quickly find D. Zhang and T. He are with the Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455. E-mail: {zhang, tianhe}@cs.umn.edu S. Lin is with the Department of Electrical and Computer Engi- neering, Stony Brook University, Stony Brook, NY 11794. E-mail: [email protected] S.Munir and J.A. Stankovic are with the Department of Computer Science, University of Virginia, Charlottesville, VA 22903. E-mail: {munir, stankovic}@cs.virginia.edu. passengers. As a promising initiative, in large metropolitan areas, e.g., New York City, Singapore, Beijing, and Shenzhen [4] [5] [6], taxicabs are equipped with G- PS and communication devices, and taxicabs’ status (locations, speeds, with passengers or not, etc) is uploaded to taxicab companies’ dispatching centers periodically. Based on the status, these dispatching centers schedule taxicabs to efficient routes to pick up nearby known passengers, thus reducing cruising miles. However, these solutions usually depend on passengers to actively provide their pickup locations in advance. Typically, existing dispatching centers function under the scenario where a passenger con- tacts a dispatching center first, and then a dispatch- ing center assigns a task about this passenger to a nearby vacant taxicab. But in developing counties or dense urban areas, most passengers hail taxicabs along streets directly, rather than booking taxicabs from dispatching centers in advance. Therefore, we face a challenge that how to dispatch taxicabs to quickly find passengers to reduce their cruising miles, yet without pickup locations provided by passengers. In this paper, we propose a cruising system, pCruise, i.e., cruising with purposes. The key novelty about pCruise is that it utilizes only several key GPS records of nearby taxicabs to model a cruising process about a particular taxicab, and then provides a dis- tributed online scheduling strategy with the shortest cruising route for this taxicab to find a passenger. Specifically, our key contributions are as follows: To the best of our knowledge, we conduct the first comprehensive study of how to reduce cruising

Transcript of IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is...

Page 1: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 1

Online Cruising Mile Reduction in Large-ScaleTaxicab Networks

Desheng Zhang, Student Member, IEEE, Tian He, Senior Member, IEEE, Shan Lin, Member, IEEE,Sirajum Munir, Student Member, IEEE, and John A. Stankovic, Fellow, IEEE

Abstract—In the taxicab industry, a long-standing challenge is how to reduce taxicabs’ miles spent without fares, i.e., cruisingmiles. The current solutions for this challenge usually depend on passengers to actively provide their locations in advance forpickups. To address this challenge without the burden on passengers, in this paper, we propose a cruising system, pCruise,for taxicab drivers to find efficient routes to pick up passengers to reduce cruising miles. According to the real-time pick-upevents from nearby taxicabs, pCruise characterizes a cruising process with a cruising graph, and assigns weights on edgesof the cruising graph to indicate the utility of cruising corresponding road segments. Our weighting process considers thenumber of nearby passengers and taxicabs together in real-time, aiming at two scenarios where taxicabs are explicitly orimplicitly coordinated with each other. Based on a weighted cruising graph, when a taxicab becomes vacant, pCruise providesa distributed online scheduling strategy to obtain and update an efficient cruising route with the minimum length and at leastone arriving passenger. We evaluate pCruise based on a real-world GPS dataset from a Chinese city Shenzhen with 14, 000

taxicabs. The evaluation results show that pCruise assists taxicab drivers to reduce cruising miles by 42% on average.

Index Terms—Taxicab Network, Dispatching, Cruising Mile Reduction.

F

1 INTRODUCTION

NOWADAYS, among all transportation modes,taxicabs play a particularly prominent role in

residents’ daily commutes in many metropolitan ar-eas [1]. Based on a recent survey in New York C-ity [2], over 100 taxicab companies operate more than13, 000 taxicabs, with a stable ridership of 660, 000per day, and transport more than 25% of all pas-sengers, accounting for 45% of all paid transit fares.Unfortunately, to fulfill such delivery requests, thesetaxicabs travel a total of 800 million miles per year [1].Nearly 40% of this mileage (i.e., 314 million miles peryear, consuming 14 million gallons of gas) is spentto cruise for passengers, therefore leading to severelyharmful tailpipe emissions and energy consumption.On the other hand, the top comment from passengersabout taxicabs is “I cannot get one when needed” [2].Additionally, in carbon emission trading under theKyoto Protocol [3], governments will provide econom-ic incentives for achieving reductions in the carbonemissions. Therefore, to achieve less carbon emissions,better satisfactory services and more economic incen-tives, it is imperative to find a practical initiativeto reduce cruising miles for taxicabs to quickly find

• D. Zhang and T. He are with the Department of Computer Scienceand Engineering, University of Minnesota, Minneapolis, MN 55455.E-mail: {zhang, tianhe}@cs.umn.edu

• S. Lin is with the Department of Electrical and Computer Engi-neering, Stony Brook University, Stony Brook, NY 11794. E-mail:[email protected]

• S.Munir and J.A. Stankovic are with the Department of ComputerScience, University of Virginia, Charlottesville, VA 22903. E-mail:{munir, stankovic}@cs.virginia.edu.

passengers.As a promising initiative, in large metropolitan

areas, e.g., New York City, Singapore, Beijing, andShenzhen [4] [5] [6], taxicabs are equipped with G-PS and communication devices, and taxicabs’ status(locations, speeds, with passengers or not, etc) isuploaded to taxicab companies’ dispatching centersperiodically. Based on the status, these dispatchingcenters schedule taxicabs to efficient routes to pickup nearby known passengers, thus reducing cruisingmiles. However, these solutions usually depend onpassengers to actively provide their pickup locationsin advance. Typically, existing dispatching centersfunction under the scenario where a passenger con-tacts a dispatching center first, and then a dispatch-ing center assigns a task about this passenger to anearby vacant taxicab. But in developing countiesor dense urban areas, most passengers hail taxicabsalong streets directly, rather than booking taxicabsfrom dispatching centers in advance. Therefore, weface a challenge that how to dispatch taxicabs toquickly find passengers to reduce their cruising miles,yet without pickup locations provided by passengers.

In this paper, we propose a cruising system,pCruise, i.e., cruising with purposes. The key noveltyabout pCruise is that it utilizes only several key GPSrecords of nearby taxicabs to model a cruising processabout a particular taxicab, and then provides a dis-tributed online scheduling strategy with the shortestcruising route for this taxicab to find a passenger.Specifically, our key contributions are as follows:• To the best of our knowledge, we conduct the first

comprehensive study of how to reduce cruising

Page 2: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 2

6 1 2 1 8 T U E 6 1 2 1 8 W E D 6 1 2 1 8 T H U 6 1 2 1 8 F R I 6 1 2 1 8 S A T 6 1 2 1 8 S U N 6 1 2 1 8 M O N468

1 01 2

M O N 7 d a y s o f a D a y

Cruis

ing M

iles

Fig. 2. Weekly Cruising Miles

miles of taxicabs with a fully distributed onlinemethod. We design a system, called pCruise, tolet taxicab drivers find passengers with a cruisingroute based on pick-up events of nearby taxicabs,yet without locations provided by passengers.

• In pCruise, we create a mathematical model,called cruising graph, to capture key ingredients ofa taxicab’s cruising process. Vertices of the cruis-ing graph represent intersections and its edgesrepresent road segments connecting intersections.

• We characterize a cruising graph by weights onits edges, which are represented by the expectednumber of remaining passengers (obtained by thenumber of potential passengers minus the numberof competing taxicabs) during a taxicab cruisingroad segments. We employ the pickup eventsfrom nearby taxicabs to obtain the number ofpotential passengers; we present two methodsto calculate the number of competing taxicabsunder explicit and implicit coordinations amongtaxicabs. The formulation of weights in a cruisinggraph is based on only nearby taxicabs’ GPSrecords uploaded to dispatching centers, withoutadditional marginal costs.

• According to the weights on the cruising graph,we propose a fully distributed online schedulingfor pCruise. It selects a cruising route for a specif-ic taxicab by considering both the weight (i.e., ex-pected number of remaining passengers) on thisroute and the length of this route together. Fur-thermore, pCruise supports an online schedulingwhere a cruising route for a particular taxicab isupdated at every intersection according to newlyreceived GPS records from nearby taxicabs.

• We evaluate pCruise with a real-world half-yeardataset, collected from taxicabs in Shenzhen. Thedataset consists of 6 months of GPS records from14, 000 taxicabs. The evaluation results show thatwith the assistance of pCruise, a taxicab reducescruising miles by 42% on average.

The rest of the paper is organized as follows. Sec-tion 2 proposes the motivation based on our empiricalresults. Section 3 presents the pCruise design. Sec-tion 4 validates pCruise with a large-scale dataset.Section 5 presents the related work. Section 6 givesthe conclusion.

2 MOTIVATION

In this section, we employ the empirical data to showour motivation. In the taxicab business, the total travelmiles of a taxicab consist of live miles (driven withpaying passengers) and cruising miles (driven with-out paying passengers). The key inefficiency of taxicabnetworks is the long cruising miles for taxicabs to findpassengers, which leads to a high gas consumptionwithout any passengers. Based on a GPS dataset of14, 000 taxicabs in Shenzhen (with the highest peopledensity of 17, 510 per KM2 in China), we present theevidence about the inefficiencies of current taxicabnetworks.

Figure 1 summarizes statistics about taxicab net-works studied. In Figure 1, we observe that the aver-age cruising mile is 2.43 to find a passenger with anaverage live mile of 3.77 per taxicab per trip. Figure 2gives the average cruising mileage of all taxicabs inthe network on an hourly basis. We observe that thecruising miles are high in the morning, and typicallypeak at 7 AM around 11 miles per hour.

Collection Period 6 Months

Collection Date 01/01/12-06/30/12

Numbe of Taxis 14,453

Number of Passengers 98,472,628

Total Live Mile 371,269,642

Total Cruising Mile 238,788,860

Average Live Mile 3.77

Average Cruising Mile 2.43

Taxicab Network Summary

Fig. 1. Statistics

From the above two figures, we argue that suchhigh cruising miles contribute to the key inefficiencyof current taxicab networks, since long cruising milesor a high percentage of cruising indicate inefficientconsumption of gas. On the other hand, passengerswould select another transportation mode if they can-not find taxicabs in time. Thus, the key motivation ofthis paper is to reduce cruising miles, thus addressinga discord where taxicab drivers suffer from long cruis-ing miles, and passengers cannot find a taxicab whenthey need one. As follows, we bridge the above gapby proposing pCruise which guides vacant taxicabs tofind a passenger with the minimum cruising miles.

Page 3: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 3

3 pCruise DESIGN

In this section, we present the detailed design forpCruise. We introduce the current taxicab infrastruc-ture in Section 3.1. Built upon this infrastructure,we present the main idea of pCruise in Section 3.2.We demonstrate how to model a cruising process inSection 3.3. By this modeling, we explain how to char-acterize a cruising process in Section 3.4. Finally, weshow how to utilize a characterized cruising processto select cruising routes for taxicab drivers to reducecruising miles in Section 3.5.

3.1 Current InfrastructureIn current taxicab networks of large cities such asNew York City, Beijing, and Shenzhen, taxicabs areequipped with GPS and communication devices, inaddition to the traditional fare meters. With the G-PS devices, taxicabs log their physical status, suchas current locations, speeds, directions, etc; with thefare meters, taxicabs log their logical status, such aswith passengers or not. Thus, with communicationdevices, both physical and logical status of taxicabs isuploaded periodically to either a government agencyserver that stores the status of all taxicabs, or a taxicabcompany server that stores the status of the taxicabsbelonging to this company. The physical and logicalstatus of a taxicab is uploaded in terms of GPS record-s, which consist of the following parameters: (i) PlateNumber; (ii) Date and Time; (iii) GPS Coordinates;and (iv) Availability Bit: whether or not a passengeris in this taxicab when the record is being uploaded.Based on this infrastructure, we introduce semanticsmined from it, and how to configure the networkingof taxicabs as follows.

3.1.1 Semantics Mined from Current InfrastructureIn the above infrastructure, the only informationtaxicabs periodically uploaded is their GPS records.By observing the consecutive GPS records from thesame taxicab, we data mine passenger distributionsfrom them. Instead of considering all consecutive GPSrecords from a taxicab, we shall focus on the GPSrecords with a change on the availability bit comparedto the previous GPS records for the same taxicab.For example, if an availability bit turns to 1 from0 in two consecutive records of a taxicab i, then itindicates that this taxicab just picks up a passengerat the location indicated by the corresponding GPScoordinates. Therefore, we name this physical locationsi and the corresponding moment ti as a pick-up spotpi = (ti, si). Similarly, we name a physical location asa drop-off spot, if an availability bit turns to 0 from 1.Figure 3 gives an example of these spots within threeblocks. In this paper, taxicabs on roads are seen asprobing mobile sensors to detect passenger distribu-tions, which significantly assists vacant taxicab driversto find better routes to pick up new passengers.

Fig. 3. Pick-up (Red) or Drop-off (Blue) Spots

3.1.2 Networking under Current Infrastructure

Given the current infrastructure and the mined se-mantics, a straightforward dispatching is to establisha centralized dispatching center to dispatch taxicabsbased on mined passenger distributions to reducecruising miles for all taxicabs. But this centralizedsolution is usually infeasible in the real-world, sincetaxicabs in the same city may belong to more than100 taxicab companies [1]. So, sharing GPS recordsor dispatching taxicabs across different companies islogistically difficult to be performed in practice.

In this paper, utilizing the broadcast nature of wire-less uploading, we argue that by overhearing GPSrecords from other taxicabs, a vacant taxicab shallfind better routes to reduce cruising miles. A systembuilt upon this approach has the following potentialadvantages: (i) this approach is based on the currenttaxicab network infrastructure where it is mandatoryfor taxicabs to upload their records, so this approachdoes not involve additional marginal costs; (ii) thisapproach provides distributed coordinations for taxi-cab drivers to break the barriers between differentcompanies and operators.

The rationale between our distributed networking(i.e., GPS broadcasting and overhearing) is that for ataxicab, not all other taxicabs’ GPS records are usefulfor reducing cruising miles. For example, GPS recordsfrom another taxicab several miles away may notbe helpful for a taxicab to find a nearby passenger;similarly, if a taxicab tries to find a passenger now,it may also not be helpful to analyze GPS recordsreceived at several hours ago. Therefore, accordingto the current infrastructure, we envision that GPSrecords uploaded by a taxicab are overheard by othertaxicabs located within a certain range, indicated asa Broadcasting Range. A taxicab stores and analyzesthe GPS records from another nearby taxicab, as longas it is within nearby taxicabs’ broadcasting ranges.In addition to the broadcasting range, a broadcastingspeed, i.e., how often a taxicab uploads its GPS record-s, is also important, since a faster uploading speedenables nearby taxicabs to more quickly receive GPSrecords. We test effects of the GPS broadcasting rangeand speed on the system performance in Section 4.

Page 4: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 4

3.2 Main IdeaBased on the semantics and the network setting inthe last subsection, we first introduce some conceptsin the taxicab business, before presenting the mainidea. Figure 4 gives an example of a taxicab operatingscenario where its total travel miles are divided intocruising routes (driving without paying passengers)and live routes (driving with paying passengers) bypick-up or drop-off locations.

2 431

Cruising Route 1 Live Route 1

Pick-up Drop-off

6 875

Intersection

Cruising Route 2 Live Route 2 Cruising Route 3

Fig. 4. Taxicab Operating Scenario

In Figure 4, a taxicab starts its first cruising route 1at intersection 1 to find a passenger. At intersection 2,by picking up a passenger, the taxicab ends cruisingroute 1 and starts its live route 1. At intersection 4,by dropping off the passenger, the taxicab ends itslive route 1 and starts its cruising route 2, and soon and so forth. The process during which a drivercruises streets to find a passenger is called a cruisingprocess, which starts at the last drop-off location andends at the next pick-up location. The total miles of allcruising routes or live routes are called cruising milesor live miles, respectively. For example, supposingthat Figure 4 gives the entire operating scenario ofthis taxicab in a day, the total cruising miles are thesum of lengths of the three cruising routes in Figure 4.

In a live route, a taxicab is occupied by a passengerwho pays a fare according to the length of a liveroute; whereas in a cruising route, a taxicab is va-cant without paying passengers yet consuming gas.Therefore, a profitable strategy for a cruising processis to minimize the length of every cruising route, thusminimizing the total cruising miles, while finding at leastone expected passenger on every cruising route. Based onthis cruising strategy, the key idea of pCruise is asfollows.

During both cruising and live routes, as introducedin Section 3.1, a taxicab (i) broadcasts its GPS recordsas it did in the existing infrastructure, (ii) overhearsand stores the GPS records broadcasted by the taxi-cabs located within its broadcasting range, and (iii)deletes the GPS records overheard from the taxicabsmoving out of its broadcasting range.

In addition to the above operations, when a taxicabbecomes vacant and tries to find a new passenger(beginning of a cruising route), according to GPSrecords overheard from taxicabs within its broad-casting range, it (i) models its cruising process atroad segment levels, (ii) characterizes road segmentsin the model, (iii) selects a cruising route based oncharacterized road segments, and refines the selectedcruising route based on newly received GPS records.

Three questions rise up with respects to this idea.Based on the overall architecture of pCruise given in

Current Infrastructure

Characterization of Cruising Graph (Section 3.4)

Utilization of Cruising Graph (Section 3.5)

Broadcasting Range

Creation of Cruising Graph (Section 3.3)

GPS RecordsCruising

Graph

π4 π3

π1 Characterized

Cruising

Graphπ2

Road Map

# of Potential

Passenger# of Competing

Taxicab# of Remaining

Passenger (π)

Scheduling Online Re-scheduling

Identifying

Road Segments

Identifying

Intersections

Cruising Route Refined Cruising Route

Fig. 5. pCruise Architecture

Figure 5, we answer those questions as follows.

• How to model a cruising process: when a vacanttaxicab is cruising road segments, a driver onlytakes actions at an intersection. Thus, intersec-tions and road segments are crucial units for ourdesign. To capture fundamental characteristicsabout a cruising process, we propose a mathe-matical concept Cruising Graph. As in the first boxof Figure 5, a cruising graph is created based ona road map, as introduced in Subsection 3.3.

• How to characterize a road segment in a cruisingprocess: given a cruising graph, to show theutility of cruising a road segment, we characterizea cruising graph by assigning weights on itsedges, corresponding to road segments. As in thesecond box of Figure 5, weights are indicated bythe expected numbers of remaining passengers π,which are obtained by the number of potentialpassengers minus the number of competing taxicabs,as introduced in Subsection 3.4.

• How to select and refine a cruising route byutilizing characterized road segments: given acharacterized cruising graph, we propose an ef-ficient online scheduling to obtain the shortestcruising route at which at least one passengerarrives. As in the last box of Figure 5, given acharacterized cruising graph, a cruising route isscheduled at first, and then based on newly re-ceived GPS records, this cruising route is refinedrepeatedly via an online re-scheduling during theprocess, which is introduced in Subsection 3.5.

Note that to design pCruise with a generic philoso-phy, we describe pCruise as a distributed solution,

Page 5: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 5

although it can be easily customized to a centralizedsolution. Thus, pCruise serves as a lightweight yeteffective system which can be implemented in fron-tend (i.e., taxicabs) or in backend (i.e., dispatchingcenters). Further, to be compatible with the currentinfrastructures, pCruise is designed as a byproductwithout marginal costs, so there is no ad hoc directcommunications among taxicabs, and the only com-munications are uploading of GPS records.

Most importantly, given the existence of a plethoraof dispatching and recommendation systems, pCruiseserves as a middleware under any existing dispatch-ing system to enhance its performance. This is becausethe creation, the characterization, and the utilizationabout a cruising graph are fully independent fromeach other, and can be used to supplement othertheoretical models. As a result, pCruise provides aunified solution for a highly diverse, heterogeneousroute selection that may be deployed among mobileunits for various applications.

3.3 Creation of Cruising GraphIn this subsection, we model a taxicab cruising processwith a mathematical concept, i.e., Cruising Graphwhere for every intersection in the real-world, wecreate a corresponding vertex; for every two adjacentintersections, we create two directed edges betweenthem. Thus, the key step to create a cruising graphis to identify intersections and road segments con-necting adjacent intersections. Much work has beenproposed to use GPS traces to infer a road map [7] [8].Therefore, to focus on the system level, we assumethat a road map is given. Thus, the creation of cruis-ing graph is straightforward, since intersections andsegments can be easily identified by a road map.Note that though all segments are bidirectional, oneway situation is considered by assigning weight 0 tocorresponding edges. Figure 6 shows an example ofa cruising graph created according to a road map.

A

21 3 5 9

6 1042

7 11

8 12

Intersections as

Vertices

Road Segments

as Edges

Fig. 6. Cruising Graph3.4 Characterization of Cruising GraphFor a cruising graph in the last subsection, we char-acterize its edges (i.e., real-world road segments) withweights, i.e., numbers of Remaining Passengers. Forthe same edge, its weight varies for different taxicabsand time durations, which is based on the real-timepickup events from nearby taxicabs.

As in Figure 7, given the current time t0, for aspecific taxicab T1, we present how to obtain the

# of Arriving

Passenger

(λ×τ)# of Potential

Passengers

(κ=λ×τ×ρ)

# of Passengers Picked up

Competing Taxicabs

(ω)

# of Remaining Passengers

(π=κ-ω)

# of Passengers Picked up by

Original Taxicabs

(λ×τ×(1-ρ))

ρ: Potential Passenger Ratio λ:Passenger Arrival Rate τ:Travel Time

[Section 3.4.1]

[Section 3.4.2]

[Section 3.4.3]

Fig. 7. Overview of Characterization

number of remaining passengers πtr on a specific

segment r = [sbegin, send] during a future periodt = [tbegin, tend]. In the evaluation and real-wordsetting, tbegin and tend of a travel period on a segmentare decided by both the current location of a targettaxicab and the historical travel time for this segment.

For a better presentation, we omit the temporalsuperscript. First, assuming we are given a PassengerArrival Rate λr and a Travel Time τr for r during t,we have the number of Arriving Passengers as λr×τr.Second, we consider two kinds of vacant taxicabs thatwill be on r during t: (i) the vacant taxicabs (calledoriginal taxicabs) that are on r at t0; (ii) the vacant taxi-cabs (called competing taxicabs) that are not on r att0. Third, for all these arriving passengers on r duringt, we exclude the impacts of the original taxicabs bya Potential Passenger Ratio ρr, and thus we have thenumber of Potential Passengers as κr = λr × τr × ρr;further, we exclude the impacts of the competingtaxicabs by numbers of Competing Taxicabs ωr, andthus we have the number of Remaining Passengersas πr = κr − ωr, which is the key formula for ourweighting. The details of κr, ωr and πr are given inSubsection 3.4.1, 3.4.2, and 3.4.3.

3.4.1 Number of Potential Passengers κrAmong all arriving passengers λr × τr on r duringt, we obtain the number of potential passengers κrwith a potential passenger ratio ρr, which gives theprobability that there is a potential passenger (i.e., notserved by the original taxicabs already on r) in twodimensional temporal-spatial spot px = [tx, sx] wheretx ∈ [tbegin, tend] and sx ∈ [sbegin, send], as follows.

κr = λr × τr × ρr.

The rationale behind this equation is that for r, amongall λr × τr arriving passengers, some of them arepicked up by vacant taxicabs already on r (i.e., originaltaxicabs) at t0, and only the rest (with a number ofρr × λr × τr) is treated as the potential passengers forthe taxicabs (i.e., competing taxicabs) that will arriveon r during t. Since a travel time τr can be easilyestimated by the historical dataset, we present howto obtain ρr and λr as follows.

Page 6: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 6

Space

Timetj

sj

ti

sipi

pj

pk

tk

sk

oi

(sbegin, tbegin) tend

send

Fig. 8. ρ′ about 1 Taxi

Space

Timetj

sj

ti

sipi

pj

pk

tk

sk

oi

Fig. 9. ρ about 1 Taxi

Space

Timetj

sj

ti

sipi

pj

oj

oi

Fig. 10. ρ about 2 Taxis

Space

Timetj

sj

ti

si pi

pj

pksk

tk

oj

oi

ok

Fig. 11. ρ about 3 Taxis

Potential Passenger Ratio ρIn general, a road segment with a high potential

passenger ratio leads to a high possibility of pick-ups, thus minimizing cruising miles for a taxicabdriver. Therefore, a taxicab shall select a segmentwith a higher potential passenger ratio ρ, which isobtained by collected pick-up events in GPS records,via observing how long nearby vacant taxicabs pickedup passengers on this segment before. The sooner thenearby taxicabs pick up passengers on this segment,the higher the potential passenger ratio ρ on thissegment. Note that we utilize the collected pick-upevents in a previous period to infer ρ for a futureperiod. The assumption is that the potential passengerratio does not change dramatically in a short timeperiod during which a taxicab passed a segment.

For simplicity, when computing a weight on aroad segment, we transform a two-dimensional GPScoordinate (x, y) into an one dimensional variables ∈ [sbegin, send] on this road segment. Based on apick-up spot pi of Taxicab i, we map pi in a temporal-spatial Cartesian coordinate system in terms of a pick-up location si on a road segment and a pick-up mo-ment ti, as in Figure 8. The origin of coordinates oi is(tbegin, sbegin) where tbegin is the moment that Taxicabi enters this road segment; sbegin is the beginninglocation of this road segment.

In this two dimensional temporal-spatial system,the triangle-like shape 4′oisipi indicates a temporal-spatial area without any potential passenger. This isbecause if there were a potential passenger within4′oisipi, e.g., a spot pj = (tj , sj) in Figure 8, thenthe pick-up location of Taxicab i should be sj , insteadof si. But there could be a potential passenger outside4′oisipi, e.g., a spot pk = (tk, sk) in Figure 8, since atmoment tk, Taxicab i has already passed location sk,and cannot verify whether or not there is a passengerwaiting at location sk from moment tk.

Therefore, in this 1 taxicab scenario, for a roadsegment r connecting an intersection sbegin and anintersection send, we obtain the potential passengerratio ρ

′(1)r at a time period of [tbegin, tend] as follows.

ρ′(1)r = 1− |4′oisipi||tend − tbegin| × |send − sbegin|

,

where |4′oisipi| is the area of 4′oisipi. The physical

meaning of |tend − tbegin| × |send − sbegin| is the totaltwo dimensions in terms of time and space, while thephysical meaning of |4′oisipi| is the time and spacearea confirmed by Taxicab i that there is no potentialpassengers.

In the above analysis, we utilize the trajectory ofTaxicab i to obtain4′oisipi. However, since pCruise isdesigned to employ only a few pick-up spots insteadof all the trajectories, we approximate the triangle-likeshape 4′oisipi with the triangle 4oisipi in Figure 9.The rationale behind this approximation is that ina road segment we assume that a taxicab is withrelatively uniform speed. We verify the effect of thisapproximation on the evaluation section. After theapproximation, a new potential passenger ratio ρ

(1)r

for the road segment r as in Figure 9 is given by

ρ(1)r = 1− |4oisipi||tend − tbegin| × |send − sbegin|

,

where |4oisipi| is the area of 4oisipi.Figures 10 and 11 consider multiple taxicab sce-

narios where the triangles of taxicabs overlap witheach other. The union area of all triangles indicatesan area without any potential passengers in termsof space and time. For example, in Figure 10, whenTaxicab i enters the road segment from oi, Taxicabj has already been on this road segment at oj . Atmoment tj , Taxicab j first picks up a passenger atlocation sj . After that, at moment ti, Taxicab i picksup a passenger at location si. Since tj < ti and sj < si,4ojsjpj is inside of 4oisipi. Note that even thoughthere is a passenger at point pj that is inside 4oisipi,but this passenger is not a potential passenger forTaxicab i, since she or he is picked up by Taxicab j.

Similarly, for the 3 taxicabs scenario in Figure 11,

ρ(3)r = 1− |4oisipi ∪4ojsjpj ∪4okskpk||tend − tbegin| × |send − sbegin|

.

To generalize the above results, for an edge associ-ating the road segment r,

ρr = 1− | ∪∀i∈I 4oisipi||tend − tbegin| × |send − sbegin|

,

where I is a set of all corresponding pick-up spots. Inthe real-world setting, tbegin and tend are decided byboth the current location of the target taxicab and thehistorical travel time from sbegin to send.

Page 7: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 7

Space

Time

sj

ti

si pi

pj

Fj-1(si)oi

oj

Fig. 12. λ in Scenario 1

Space

Timetj

sj

ti

si

pi

pj

Fi-1(sj)oi

oj

Fig. 13. λ in Scenario 2

Space

Timetj

sj

ti

sipi

pj

oi

oj

Fig. 14. λ in Scenario 3

3

1

T1

2 4

T3

T2

3

1

T1

2 4

T3T2

Fig. 15. Competing Taxis

Passenger Arrival Rate λTo obtain an accurate passenger arrival rate, we

need a time period and the number of arriving pas-sengers during this period. But based on GPS recordsalone, it is impractical to obtain such accurate infor-mation. Thus, we introduce λ, which indicates thelower bound of the passenger arrival rate for segmentr = [sbegin, send] in duration t = [tbegin, tend]. Withλ, we have the lower bound of the total arrivingpassenger number at a time duration. As follows, wefirst introduce the lower bound of passenger arrivalrate λsi at a pick-up location si on r during t, which isobtained by the time difference of two vacant taxicabspassing the same location.

For example, in Figure 12, at moment Oi, Taxicabj is ahead of Taxicab i, but the pick-up location si ofTaxicab i is ahead of the pick-up location sj of Taxicabj. This indicates when Taxicab j arrives in location siat moment F−1j (si), there is no passenger (F−1j is theinverse function of line segment ojpj). This is becauseif there were a passenger, then the pick-up spot pj ofTaxicab j should be (F−1j (si), si), instead of (tj , sj).But since the pick-up spot of Taxicab i is pi = (ti, si),it indicates when Taxicab i arrives at location si, thereis a passenger in location si at moment ti. This impliesthat at least one passenger arrives at location si in timeduration [F−1j (si), ti]. Note that there may be morethan one passenger at location si arriving at duration[F−1j (si), ti], but since a taxicab can pick up only onepassenger at a time, we can only confirm that at leastone passenger arrived. This is the reason why λ is thelower bound of passenger arrival rate.

Therefore, the lower bound of the passenger arrivalrate at a pick up location si is given by

λsi =1

ti − F−1j (si).

The similar situation is in Figure 13 where at leastone passenger arrived at the location sj in time du-ration [F−1i (sj), tj ]. But in Figure 14, we observe asituation where λ cannot be obtained, because nopick-up happens when a later taxicab passes a formertaxicab’s cruising route.

Finally, the lower bound of the passenger arrivalrate at segment r during [tbegin, tend] is given by

λr = Σi∈Iλsi ,

where I is a set of all corresponding pick-up spots.

3.4.2 Number of Competing Taxicabs ωr

For a specific taxicab T1, there are two types of vacanttaxicabs competing with T1 on a road segment r whenT1 enters r: (i) vacant taxicabs (called original taxicab-s) are on r now, and will still be on r when T1 entersr. (ii) vacant taxicabs (called competing taxicabs) arenot on r now, but will be on r when T1 enters r.Figure 15 gives examples about them. Supposing thatat the current time (the top subfigure), taxicab T1,T2 and T3 is on road segment r1→2, r3→2, and r2→4,respectively. At the time when T1 enters r2→4 (thebottom subfigure), T3 will be still on r2→4, and T2has already entered r2→4. Thus, T3 is the originaltaxicab on r2→4 for T1, while T2 is the competingtaxicab on r2→4 for T1. In the previous subsection,we have excluded the effects of the original taxicabswith the potential passenger ratio ρr. Thus, in thissubsection we exclude the effects of the competingtaxicabs, which is shown by the number of competingtaxicabs (indicated as ωr) on a road segment r.

Note that in the current infrastructure introduced inSection 3.1, vacant taxicabs do not upload future cruis-ing routes to base stations (only uploading real-timeGPS locations). Thus, by overhearing, a taxicab cannotdirectly obtain the number of competing taxicabs ωr

on a road segment r. To solve this issue, we proposetwo methods to obtain ωr, according to explicit coor-dinations (cooperative) or implicit coordinations (non-cooperative) among taxicabs as follows.

Under the explicit coordination, vacant taxicabsare considered to be cooperative to each other, andexplicitly share their cruising routes with each otherin addition to GPS records. Thus, taxicabs coordinatewith each other by lowering weights of the edgeswith many vacant taxicabs. This cooperative scenari-o is applicable to the situation where all taxicabsbelong to the same taxicab company. Admittedly, ataxicab network may achieve a better performanceunder this explicit coordination, but it involves extracommunications in addition to the normal GPS recorduploading, and also may not be logistically possi-ble for the situation where taxicabs belong to manydifferent competing taxicab companies. In contrast,under the implicit coordination, vacant taxicabs areconsidered to be non-cooperative to each other, anddo not share their cruising routes with each other.Thus, a taxicab has to infer future cruising routes

Page 8: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 8

about nearby taxicabs based on their GPS records. Weintroduce these two kinds of coordinations as follows.Explicit Coordinations Among Taxicabs

For the explicit coordination, a vacant taxicab sharesits cruising routes to nearby taxicabs by additionalcommunications. For a vacant taxicab, its cruisingroute starts at its current location and includes severalroad segments connecting corresponding intersection-s. Thus, to share its cruising routes, a vacant taxicabsimply broadcasts a sequence of intersections in itscruising route to nearby taxicabs within its broadcast-ing range. To focus on the cruising system design,we configure this cruising route broadcasting togetherwith the GPS record uploading.

With the above explicit cruising route broadcasting,a taxicab T1 decides whether or not another taxicabT2 is on a specific road segment r when T1 enters r,based on the following information:(i) the current locations of T1 and T2;(ii) the cruising route of T2;(iii) the entering intersection and the exiting intersec-tion of road segment r;(iv) the time period [tT2

renter, tT2

rexit] from T2 entering r to

T2 exiting r;(v) the time tT1

renterwhen T1 enters r.

If tT1renter

is between tT2renter

and tT2rexit

, then T2 is onr, when T1 enters r. Similarly, we decide the samesituation for other nearby taxicabs.

Finally, the number of competing taxicabs ωr for ataxicab T1 is given by the following.

ωr =∑n∈N{Cr

n =

{1; tnrenter

< tT1renter

< tnrexit

0; otherwise} (1)

where N is the set of taxicabs that are within thebroadcasting range of taxicab T1.

For example, as in the top subfigure of Figure 16,a vacant taxicab T1 tries to calculate the expectednumber of competing taxicabs ωr for road segmentr2→3. Two other taxicabs T2 and T3 are within itsbroadcasting range, and have already selected theircruising routes shown by two dashed lines. Based onthe explicit coordinations, T2 and T3 broadcast theircruising routes as 6 → 3 → 4 and 2 → 3 → 4 withtheir GPS records, respectively. Thus, T1 has both thecruising routes and GPS records from T2 and T3. Then,T1 decides that T2 will not arrive at r2→3 based on thecruising route of T2. Next, T1 calculates (i) the timeperiod that T3 is on road segment r2→3, and (ii) thetime that T1 arrives at road segment r2→3. As shownby the bottom subfigure of Figure 16, T3 will still be onr2→3 when T1 arrives at road segment r2→3. Finally,T1 has Cr

T2= 0 and Cr

T3= 1, leading to ωr = 1 for

road segment r2→3.Note that during the cruising road segment r2→3,

ωr may change due to different situations, e.g., a newvacant nearby taxicab because of a new drop-off orthe entrance of the broadcasting range.

5

1 6T1

2 3 4

T2

T3

5

1 6T1

2 3 4

T2

T3

Fig. 16. Explicit Coordinations

Implicit Coordinations Among TaxicabsIn the implicit coordination, we envision a different

situation where all taxicabs are from different com-peting companies, and thus additional coordinatingcommunications are not allowed. Thus, a taxicab can-not directly obtain nearby taxicabs’ cruising routes.Therefore, we employ a method to infer cruisingroutes of nearby vacant taxicabs.

A naive method for a vacant taxicab T1 to infer thecruising route of another nearby vacant taxicab T2 isto assume that T2 uniformly selects a direction at allintersections. Obviously, this naive method overlooksthe fact that taxicab T2 also overhears the GPS recordsfrom nearby taxicabs as taxicab T1 does. If we envisionthat all vacant taxicabs in the network are cruisingbased on pCruise, then T1 uses its cruising graph(Section 3.3), weights on the edges (Section 3.4), anda scheduling algorithm (Section 3.5) to calculate thecruising route of T2 based on the current locationof T2. One key issue is that under the implicit co-ordination, T1 cannot obtain the expected number ofcompeting taxicabs ωr for a road segment r, since ωr isinherently depended on cruising routes. So T1 cannotobtain the weights on the edges. To address this issue,we let the number of potential passengers (κr) equalto the number of remaining passengers (πr) for anyroad segment r, assuming there are no competingtaxicabs (ωr = 0). Thus, T1 obtains T2’s cruising route.

After we obtain all cruising routes of nearby vacanttaxicabs, we use the same method as in Equation (1)for the explicit coordination to obtain the number ofcompeting taxicabs ωr for taxicab T1.

3.4.3 Number of Remaining Passengers πrSo far, we have introduced how to obtain the numberof potential passengers κ and the number of com-peting taxicabs ω in Subsection 3.4.1 and 3.4.2. Fora road segment r, considering the arriving competingtaxicabs (with number of ωr), only a subset of the totalpotential passengers (with number of κr) remains fora targeted taxicab. Thus, we define the weight on anedge r (corresponding to a real-world road segment r)as the number of remaining passengers πr as follows.

πr = κr − ωr = (λr × τr × ρr)− ωr.

Page 9: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 9

3.5 Utilization of Cruising GraphThe key objective of this subsection is based ona taxicab’s weighted cruising graph to schedule acruising route, which starts at its current locationand includes several following road segments andcorresponding intersections. When a taxicab becomesvacant, pCruise first performs an initial schedulingto obtain a cruising route. But during the taxicabcruises this route, pCruise updates the weights onits cruising graph based on the newly received GPSrecords. With the updated weights, pCruise performsan online rescheduling to obtain a new cruising routeat every intersection, until a passenger is picked up.The initial and online rescheduling are introduced inSection 3.5.1 and 3.5.2.

3.5.1 Initial SchedulingWhen a taxicab becomes vacant, given its currentcruising graph, we have many different routes withdifferent lengths and expected numbers of remainingpassengers, since no destination is provided. Amongall routes, our key objective is to find a cruising routethat (i) starts at the current location, (ii) has at leastone expected remaining passenger, and (iii) has theshortest length. The rationale behind this objectiveis that in the taxicab business, a profitable strategyfor a taxicab driver is to select a cruising route tofind at least one passenger with the shortest distancefrom the current location, i.e., minimizing the cruisingroute. Thus, we select a cruising route based on boththe total expected number of remaining passengerson this route (i.e., the total weight on edges of thisroute) and the total length of this route. For a selectedcruising route, the expected number of remainingpassengers on this cruising route is obtained by thesum of weights of the cruising route’s edges; the totallength of this cruising route is obtained by the sumof lengths of road segments in this route, which aregiven by the cruising graph.

Theoretically, our scheduling can be represented bythe following objective function based on a cruisinggraph G = (V,A) where xij indicates how many timesan edge (i.e., a road segment) ri→j with the length|ri→j | and weight πi→j is used in the optimal route.

min∑

(i,j)∈A

xij |ri→j |

s.t. xij ∈ {0, 1, 2, ...} ∀i, j ∈ V (2)∑(i,j)∈A

xijπi→j ≥ 1 (3)

∑j∈V

(xij − xji) ∈ {0, 1} if i = s (4)∑j∈V

(xij − xji) ∈ {0,−1} if i 6= s (5)

∑∀i,k∈V

∑j∈V

[(xij − xji) + (xkj − xjk)] ∈ {0,−1} (6)

where (2) ensures that one road segment can be usedmore than once in the optimal route, e.g., circlingaround a block with high expected remaining passen-gers; (3) ensures that the optimal route has at leastone expected remaining passenger; (4) ensures thatthe optimal route starts with the current intersection sand it may end at s (i.e., the outdegree

∑j xij is equal

to indegree∑

j xji) or not at s (i.e., the outdegree∑j xij is large than indegree

∑j xji by one); (5) and

(6) are about the connectivity of the optimal route: (5)ensures that an intermediate intersection has an inde-gree equal to its outdegree or that an end intersectionhas an indegree larger than its outdegree by one; (6)ensure that there is only one end intersection, i.e., notwo intersections have the indegrees larger than theoutdegrees by one.

Since solving the above problem with linear pro-gramming is NP-Hard, the running time increasesexponentially when the number of intersections in-creases. Although integer linear programming is suf-ficient for a small number of intersections, we aim topropose an efficient scheduling algorithm to reducethe running time yet still produce the optimal route.Note that traditional shortest path algorithms, e.g.,Dijkstra, would not be suitable for our scheduling,because they do not produce a solution that visitsthe current intersection s twice, which could be theoptimal solution of the scheduling.

Based on the objective of the scheduling, we utilizea traversal based strategy in terms of lengths of roadsegments. To find the optimal cruising route, i.e.,the shortest route with the total expected number ofremaining passengers larger than 1, a straightforwardsolution with linear programming is to find all theroutes with the total expected number of remainingpassengers larger than 1 by a complete search, andthen to select the shortest one among them. But itwill take a long time for a taxicab to find the optimalroute, although the number of vertices in a route islimited. Thus, to enable a real-time scheduling, weutilizes a hill-climbing based traversal that quicklyfinds a potential solution first and then continues totry other possible solutions to verify that the currentpotential solution is the optimal solution or not.

The key advantage of this scheme is that it canquickly stop to expand on a route to save the runningtime for an agile scheduling if this route will neverbe the optimal solution. This is performed by com-paring the expanding route with the current potentialsolution, and if this route is longer than the currentpotential solution we stop the expanding on this routeeven though the cumulative weight on this route issmaller than 1. This is because this route will neverbe the optimal solution, since the current potentialsolution is shorter than it and also has the total weightlarger than 1. Specifically, during a traversal, the hill-climbing scheme always greedily expands the vertexon the current shortest route, and stops expanding

Page 10: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 10

on a route if the total expected number of remainingpassengers of this route (i.e., the cumulative weight) islarger than 1. It replaces the current potential solutionwith a new route, if this route is shorter and hasthe cumulative weight is larger than 1. After expand-ing on all the possible routes, the current potentialsolution is the optimal solution for our scheduling,because it is the shortest route among all the routeswith cumulative weight larger than 1.

2

1

3

4 1

1 3

42

0.5/

2mi

0.4/1mi

(a) Cruising Graph (b) Initial Scheduling

0.2/1mi

0.2/3mi

0.1/3mi

0.3/

3mi

32

0.5/2mi 0.4/1mi

0.2/1mi

0.4/

1mi

0.2/3mi

0.5/

2mi

Fig. 17. Initial SchedulingIn Figure 17, we give an example about the initial

scheduling. A taxicab T1 becomes vacant at intersec-tion 1, and its weighted cruising graph is in subfigure(a) where both the weights (numbers of remainingpassengers) and the lengths (in terms of miles) of roadsegments are given in associated edges.(i) pCruise starts at intersection 1 where T1 is current-ly located, and the current road segment candidatesfor the traversal are r1→2 and r1→3. Since r1→3 isshorter than r1→2, pCruise first traverses r1→3.(ii) The current segment candidates for the traversalare r1→2 and r3→1 shown by subfigure (b). A traversalon r3→1 leads to a route r1→3 + r3→1 with the samelength of 1 + 1 = 2 miles, comparing to a traversalon r1→2 leading to a route r1→2. So we have to com-pare the expected numbers of remaining passengersbetween them. Because a traversal on r3→1 leads the0.4 + 0.2 = 0.6 expected passenger, more than the 0.5expected passenger obtained by a traversal on r1→2,pCruise secondly traverses r3→1.(iii) Based on the similar strategy, pCruise finds thefirst cruising route (r1→3 + r3→1 + r1→3) with thelength of 3 and the expected number of remainingpassengers larger than or equal to 1 (0.4+0.2+0.4=1) asthe current solution. pCruise continues to find otherpossible solutions, but pCruise will not traverse theroute with the length longer than the current solutionto save running time. This is because a route (e.g.,route r1→2 + r2→4 with length of 5) longer than thecurrent solution would not be the optimal solution.Finally, the route r1→3 + r3→1 + r1→3 is the optimalsolution, since other routes shorter than it do not havethe expected number of remaining passenger largerthan or equal to 1.

3.5.2 Online ReschedulingWhen a taxicab is cruising a selected route, severalpassengers may be picked up and some competing

taxicabs may arrive on the following road segmentsof this cruising route, which are indicated by GPSrecords of nearby taxicabs. These situations changeweights on road segments and thus change the attrac-tiveness of the corresponding cruising route. There-fore, when a taxicab is on a selected cruising route,pCruise employs an online rescheduling to agilelyselect a new cruising route at every intersection. Theobjective to reschedule a cruising route is the sameas in the initial scheduling: to find a route (i) startsat the current location, (ii) has at least one expectedremaining passenger, and (iii) has the shortest length.The algorithm for the online rescheduling is alsosimilar to the algorithm for the initial scheduling.

4 TRACE-DRIVEN EVALUATION

To test pCruise in a real-world scenario, we utilizea real-world dataset about 6 months of GPS tracesof more than 14000 taxicabs belonging to differenttaxicab companies in Shenzhen, a Chinese city with 10million population. The dataset is obtained by lettingevery taxicab upload its GPS records (the format as inSection 2) to report its traces to a base station. Basedon the dataset of GPS records, we obtain location andtime distributions of pick-up events, which are usedto evaluate the performance of pCruise.

The key performance metric is the percentage ofcruising miles in the total traveling miles. We eval-uate this metric on every one hour time windowof a day. In addition, we investigate the sensitivitiesof pCruise’s performance on two key parameters oftaxicab networks, i.e., the broadcasting speed as wellas the broadcasting range.

We evaluate the performance of pCruise underthree scenarios: one driver using pCuirse, the driversfrom one taxicab company using pCuirse (accountedfor 10% of total taxicabs); all drivers using pCuirse.We first report the results of only one taxicab and 10%of the total taxicabs using pCruise, where we manip-ulate the randomly selected taxicabs using pCruisewith routes decided by pick-up events and pCruisescheduling together, and manipulate other taxicabswith the original traces. It is difficult to accuratelymodel the impacts of the routes of the taxicabs withpCruise on the routes of the taxicabs without pCruise.Alternatively, to take the feedback of the system intoconsideration, we investigate the system performancewhere all taxicabs are using pCruise. In this scenario,we essentially create a simulator where every taxi-cab cruises streets based on pCruise (instead of theoriginal traces), and picks up or drops off passengersaccording to pick-up events (in terms of trips’ pick-up location, drop-off location, and beginning time)obtained by the GPS dataset. But the beginning timein a pick-up event is not the time when a passengerstarts to wait for taxicab, so for every passenger, weuse a random number w between 0 and 10 to show

Page 11: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 11

0 2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 401 02 03 04 05 06 07 08 09 0

1 0 0

Perce

ntage

of Cr

uising

Miles

(%)

2 4 H o u r s a D a y

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 18. Miles vs. Time (1 Taxi)

2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 4 2 6 2 8 3 01 01 21 41 61 82 02 22 42 62 83 0

Pe

rcenta

ge of

Cruis

ing M

iles (%

)

B r o a d c a s t i n g S p e e d ( r e c o r d / m i n s )

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 19. Miles vs. Speed (1 Taxi)

0 . 3 0 . 6 0 . 9 1 . 2 1 . 5 1 . 8 2 . 1 2 . 4 2 . 7 3 . 01 01 21 41 61 82 02 22 42 62 83 0

Perce

ntage

of Cr

uising

Miles

(%)

B r o a d c a s t i n g R a n g e ( K M )

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 20. Miles vs. Range (1 Taxi)

the waiting time in terms of minutes, where the 10-minutes waiting time as a upper bound is based ona taxicab survey [2]. In the simulation, if a taxicabarrives at a pick-up location within a w-minute timewindow before the corresponding beginning time,then this taxicab picks up the passenger, and thengoes to the same route as the taxicab that deliversthe same passenger in the original pick-up event.After arriving at the drop-off location, this taxicabcruises the cruising graph based on the scheduling ofpCruise, until arriving at the next pick-up location.

To show the effectiveness of pCruise, we first com-pare the performance of pCruise under the explicitcoordinations with Ground Truth, which is the origi-nal GPS traces from the dataset without any modifica-tions. Second, since pCruise only employs the pick-up spots, instead of the entire trajectories, to assignweights on a cruising graph, we also compare pCruisewith an Oracle scheme, which stores and processesall the collected trajectories of nearby taxicabs. TheOracle is used to verify the effect of the approximationabout the triangle-like shape in pCruise. Therefore,Oracle utilizes triangle-like shapes to calculate a moreaccurate edge weight based on the completed trajec-tories about nearby taxicabs as introduced in Figure 8.These two comparisons are given in Sections 4.1, 4.2and 4.3. Third, we compare two versions of pCruise(called pCruise-Explicit and pCruise-Implicit) underthe explicit and implicit coordinations as introducedin Section 3.4.2 to show the effectiveness of the explicitcoordinations in Section 4.4.

4.1 One Taxicab with pCruise

Figure 18 plots the percentage of cruising miles inevery one hour time window of a day. In the rushhour of a day, e.g., 06 : 00 − 10 : 00, the percentagesof cruising miles for all three schemes are below20%. In contrast, in the non-rush hour of a day, e.g.,00 : 00 − 6 : 00, the percentages of cruising milesfor all three schemes are over 60%. But for everyone hour time window of a day, both pCruise andOracle outperform Ground Truth with an averagegain of 35% and 42%, which verifies the effectivenessof taking nearby taxicab’s activities into consideration.Oracle outperforms pCruise by 14% on average in thenon-rush hour, and by 5% on average in the rush hour,indicating that even though in the rush hour Oracle

considers the entire trajectories, the effect is limited. Itimplies during the rush hour, the pick-up spots aloneused by pCruise effectively reduce the cruising miles,and thus considering the entire trajectories in the rushhour is not as effective as in the non-rush hour.

Figure 19 plots the effect of different broadcastingspeeds on the percentage of cruising miles. The 27%of the cruising miles for Ground Truth are obtained bythe average value of all day. We observe that with theincrease of the speed, the percentages of cruising milesin pCruise and Oracle decrease significantly as muchas 39% when the speed is larger than 6 records permin, but the decreasing slows down when the speed islarger than 12 records per min. This is because whenthe speed first becomes larger, pCruise and Oraclehave more frequently updated GPS records to obtainthe optimal route, but when the speed is larger than12 records per min, the records received by taxicabsare becoming redundant, and cannot be efficientlyused for route selections. We also observe that Oracleoutperforms pCruise slightly when the speed is low,e.g., 6 records per min.

Figure 20 plots the effect of different broadcastingranges on the percentage of cruising miles. We ob-serve that with the increase of ranges from 0.3KM to3.0KM, the percentages of cruising miles in pCruiseand Oracle continuously decrease. When the rangeis larger than 0.6KM, the decrease becomes moresignificant, but when the range is larger than 1.5KM,it becomes more stable. This is because when theranges become larger, pCruise and Oracle select aroute based on more nearby taxicabs’ information. Butwhen the range is larger than 1.5KM, the GPS recordsabout the far taxicabs may not provide any relatedinformation about nearby road segments. Thus, thedecrease of cruising miles becomes less obvious.

4.2 10% of the total taxicabs with pCruise

Figure 21 plots the performance of 10% of the totaltaxicabs in terms of the percentage of cruising miles.We observe that in the rush hour of a day, e.g., 06 : 00−10 : 00 or 16 : 00− 18 : 00, the percentages of cruisingmiles for all three schemes are below 20%. In contrast,in the non-rush hour of a day, e.g., 00 : 00− 6 : 00 orafter 22 : 00, the percentages of cruising miles for allthree schemes are significantly higher than these inthe rush hour. In addition, both pCruise and Oracle

Page 12: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 12

0 2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 401 02 03 04 05 06 07 08 09 0

1 0 0

Perce

ntage

of Cr

uising

Miles

(%)

2 4 H o u r s a D a y

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 21. Miles vs. Time (10% Taxi)

2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 4 2 6 2 8 3 01 01 21 41 61 82 02 22 42 62 83 0

Perce

ntage

of Cr

uising

Miles

(%)

B r o a d c a s t i n g S p e e d ( r e c o r d / m i n s )

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 22. Miles vs. Speed (10% Taxi)

0 . 3 0 . 6 0 . 9 1 . 2 1 . 5 1 . 8 2 . 1 2 . 4 2 . 7 3 . 01 01 21 41 61 82 02 22 42 62 83 0

Perce

ntage

of Cr

uising

Miles

(%)

B r o a d c a s t i n g R a n g e ( K M )

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 23. Miles vs. Range (10% Taxi)

0 2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 401 02 03 04 05 06 07 08 09 0

1 0 0

Perce

ntage

of Cr

uising

Miles

(%)

2 4 H o u r s a D a y

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 24. Miles vs. Time (All Taxi)

2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 4 2 6 2 8 3 081 01 21 41 61 82 02 22 42 62 83 0

Perce

ntage

of Cr

uising

Miles

(%)

B r o a d c a s t i n g S p e e d ( r e c o r d / m i n s )

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 25. Miles vs. Speed (All Taxi)

0 . 3 0 . 6 0 . 9 1 . 2 1 . 5 1 . 8 2 . 1 2 . 4 2 . 7 3 . 081 01 21 41 61 82 02 22 42 62 83 0

Perce

ntage

of Cr

uising

Miles

(%)

B r o a d c a s t i n g R a n g e ( K M )

G r o u n d T r u t h p C r u i s e O r a c l e

Fig. 26. Miles vs. Range (All Taxi)

outperform Ground Truth with an average gain of39% and 45%, which implies that when more taxicabstake other pick-up spots into consideration, the wholesystem has a better performance. Furthermore, Oracleoutperforms pCruise by 8%. This performance gaindecreases compared to the scenario where one taxicabuses pCruise, which implies that pCruise performsbetter when more taxicabs employ pCruise.

Figures 22 and 23 plot the effects of differentbroadcasting speeds and ranges on the percentageof cruising miles. In both figures, we observe thatwith the increases of broadcasting speeds and ranges,the percentage of cruising miles in Ground Truthkeeps the same, while those in pCruise and Oracledecrease significantly when the broadcasting speedand range is larger than 6 record/min and 0.75KM,respectively. Compared to the results in Figures 19and 20, the results in Figures 22 and 23 have thesimilar tendency yet the better performance. But thevalues from which the percentages of cruising milesdecrease significantly are smaller.

4.3 All taxicabs with pCruise

Figures 24, 25 and 26 plot the performance of alltaxicabs with pCruise. Figure 24 shows the similartendency, compared to Figure 21. In the rush hour ofa day, the percentages of cruising miles for all threeschemes are below 20%. In contrast, in the non-rushhour of a day, the percentages of cruising miles forall three schemes are higher. Further, both pCruiseand Oracle outperform Ground Truth with the similaraverage gains, compared to the situation where 10%of taxicabs using pCruise, which implies that whenall taxicabs are using pCruise, the performance is notsignificantly improved. In both Figures 25 and 26, wesee that with the increases of broadcasting speeds and

ranges, the percentage of cruising miles in GroundTruth has the similar tendency, while those in pCruiseand Oracle decrease when the broadcasting speed orrange is increasing from 2 records/min or 0.3KM,respectively.

0 2 4 6 8 1 0 1 2 1 4 1 6 1 8 2 0 2 2 2 401 02 03 04 05 06 07 0

Perce

ntage

of Cr

uising

Miles

(%)

2 4 H o u r s a D a y

p C r u i s e - E x p l i c i t C o o r d i n a t i o n s p C r u i s e - I m p l i c i t C o o r d i n a t i o n s

Fig. 27. Impacts of Coordinations

4.4 Impacts of coordinations on pCruise

Figure 27 gives the impacts of two coordinations onthe performance of pCruise, i.e., the explicit coordi-nations and implicit coordinations as in Section 3.4.2.Figure 27 shows the similar tendency between twocoordinations, but the pCruise under explicit coordi-nations outperforms the pCruise under the implicitcoordinations, especially in the early morning. Thisis because a taxicab under the implicit coordina-tions cannot accurately predict the future route ofother taxicabs due to limited taxicab services in themorning. In contrast, in the rush hour of a day, thepercentages of cruising miles for pCruise under twocoordinations are similar, though the pCruise underexplicit coordinations still outperforms the pCruiseunder implicit coordinations by 10% on average.

5 RELATED WORK

The premise of GPS based taxicab dispatching isnot new, and many studies have focused on taxi-

Page 13: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 13

cab scheduling [9] [10] [11] [12] [13] or nov-el systems taking advantage of taxicab traces[4] [5] [6] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23].Two types of the previous work are directly relatedto our work, namely dispatching systems and recom-mendation systems.

5.1 Dispatching SystemsTaxicab dispatching systems are proposed along withthe development of intelligent transportation systemsand the popularization of GPS [24]. Currently, in themost of dispatching systems, a dispatching centerassigns a pick-up task to taxicab drivers according tothe nearest neighbor principle in terms of distance ortime. Chang et al. [25] propose a model that predictstaxicab demand distribution based on weather condi-tion, time and locations. Gonzalez et al. [26] computethe fastest route by taking into account the drivingpatterns of taxicabs, which are obtained from histor-ical GPS trajectories. Ziebart et al. [27] utilize GPStrajectories obtained from 25 taxicabs, instead of pro-viding the fastest route for drivers, aiming to predictthe destination of drivers. Phithakkitnukoon et al. [9]employ the naive Bayesian classifier with an error-based learning approach, which obtains the numberof vacant taxicabs at a given time and a location toenhance dispatching systems. Yang et al. [10] proposea model for urban taxicab services, which indicatesthe vacant and occupied taxicab movements as wellas the relationship between passengers and taxicabwaiting time. Yamamoto et al. [11] present an adaptiverouting scheme and a clustering scheme to enhancedispatching systems via assigning vacant taxicabs tothe locations with a high expected potential passengernumber adaptively. Liu et al. [12] propose a method todiscover spatio-temporal causal interactions in trafficdata streams. Huang et al. [13] present a system todetect the insufficient taxicab services in certain areasfor dispatching centers.

5.2 Recommendation SystemsRecommendation systems are proposed to provideuseful business intelligence for drivers, passengers, ortaxicab companies to maximize their benefits. Basedon GPS data from a metropolitan area, Zheng etal. [15] [16] [17] present several novel methods tomodel the transportation. Given the traces of taxicabs,Balan et al. [18] propose a system for passengersto inquire the travel time and taxicab fare. Wu etal. [28] present a system to assist mobile users to maketransportation decisions, such as taking a taxicab ornot. Wei et al. [19] employ uncertain trajectories toconstruct popular routes of taxicabs. Ge et al. [20]present a model to recommend a taxicab driver witha sequence of pick-up points to maximize profits viaa centralized solution. Powell et al. [21] propose anapproach to suggest profitable grid-based locations

for taxicab drivers by constructing a profitability mapwhere the nearby regions of drivers are scored servingas a metric for a taxicab driver decision makingprocess. Huang et al. [?] present a model to discoverspatial and temporal causal interactions of taxicab-s and thus to provide timely and efficient taxicabservices in certain disequilibrium areas, according topotential profits calculated by the historical taxicabdata. Yuan et al. [29] present a system to recommendtaxicab drivers to find nearby passengers.

5.3 Novelty of pCruiseCompared with the above systems, the novelty ofpCruise lies in three aspects. (i) pCruise does notrely on the assumption that passengers provide theirlocations, and automatically draws the passenger dis-tribution based on real-time status of nearby taxicabs.(ii) pCruise only employs “on/off” information, i.e.,pick-up or drop-off events, for the model. Therefore,we significantly reduce the size of data needed tobe processed, making pCruise suitable for the im-plementation on low-cost devices, e.g., smart phones.(iii) pCruise utilizes an efficient online schedulingbased on the events of their own nearby taxicabs,instead of the entire GPS dataset of all taxicabs. Sinceevery taxicab has different nearby taxicabs and start-ing times, pCruise introduces unpredictability andrandomness in the system to compensate for moretaxicabs heading to the same area with the samecruising route.

6 CONCLUSION

In this work, based on a half-year dataset, we designand evaluate pCruise, a cruising system to reducecruising miles for taxicab networks. Our work pro-vides a few valuable insights, which are hoped tobe useful for applying pCruise in a real-world taxi-cab network. Specifically, (i) for the different time ofthe day, a cruising system has different performancegains, and it is more efficient during the non-rushhour; (ii) a better networking setting to efficiently ex-change information among nearby taxicabs improvesthe performance of a cruising system; (iii) increasingthe number of taxicabs using the same cruising systemfurther improves the overall system performance; (iv)explicitly coordinating with nearby taxicabs by shar-ing the future routes increases the effectiveness of acruising system.

7 ACKNOWLEDGEMENTS

This research was supported in part by the USNational Science Foundation (NSF) grants CNS-0845994, CNS-0917097, CNS-1239226, CNS-1239108,CNS-1239483 and UMN Thesis Travel Grant. A pre-liminary work has been presented in IEEE RTSS2012 [30].

Page 14: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 14

REFERENCES

[1] Schaller Consulting, “The new york city taxicab fact book,”http://www.schallerconsult.com/taxi/taxifb.pdf.

[2] NYC Taxi Commission, “Taxi of tomorrow survey,” 2011.[3] Wikipedia, “Emissions trading,”

http://en.wikipedia.org/wiki/Emissionstrading.[4] J. Aslam, S. Lim, X. Pan, and D. Rus, “City-scale traffic

estimation from a roving sensor network,” in Proceedings of10th ACM Conference on Embedded Network Sensor Systems, ser.SenSys ’12.

[5] J. Yuan, Y. Zheng, and X. Xie, “Discovering regions of differ-ent functions in a city using human mobility and pois,” inProceedings of the 18th ACM SIGKDD international conference onKnowledge discovery and data mining, ser. KDD ’12, 2012.

[6] Y. Zheng, Y. Liu, J. Yuan, and X. Xie, “Urban computing withtaxicabs,” in Proceedings of the 13th international conference onUbiquitous computing, ser. UbiComp ’11, 2011.

[7] J. Biagioni and J. Eriksson, “Map inference in the face of noiseand disparity,” in Proceedings of the 20th International Conferenceon Advances in Geographic Information Systems, ser. SIGSPATIAL’12.

[8] X. Liu, J. Biagioni, J. Eriksson, Y. Wang, G. Forman, and Y. Zhu,“Mining large-scale, sparse gps traces for map inference:comparison of approaches,” in Proceedings of the 18th ACMSIGKDD international conference on Knowledge discovery and datamining, ser. KDD ’12.

[9] S. Phithakkitnukoon, M. Veloso, C. Bento, A. Biderman, andC. Ratti, “Taxi-aware map: Identifying and predicting vacanttaxis in the city,” in Proc. AMI, 2011.

[10] H. Yang, C. Fung, K. Wong, and S. Wong, “Nonlinear pricingof taxi services,” in Transportation Research Part A: Policy andPractic, 2010.

[11] K. Yamamoto, K. Uesugi, and T. Watanabe, “Adaptive rout-ing of cruising taxis by mutual exchange of pathways,” inKnowledge-Based Intelligent Information and Engineering Systems,2010.

[12] W. Liu, Y. Zheng, S. Chawla, J. Yuan, and X. Xing, “Discoveringspatio-temporal causal interactions in traffic data streams,” inProceedings of the 17th ACM SIGKDD international conference onKnowledge discovery and data mining, ser. KDD ’11, 2011.

[13] Y. Huang and J. W. Powell, “Detecting regions of disequilibri-um in taxi services under uncertainty,” in Proceedings of the 20thInternational Conference on Advances in Geographic InformationSystems, ser. SIGSPATIAL ’12, 2012.

[14] W. Zhang, S. Li, and G. Pan, “Mining the semantics of origin-destination flows using taxi traces,” in Proceedings of the 2012ACM Conference on Ubiquitous Computing, ser. UbiComp ’12,2012.

[15] Y. Zheng, Y. Chen, Q. Li, X. Xie, and W.-Y. Ma, “Understandingtransportation modes based on gps data for web applications,”ACM Trans. Web, vol. 4, no. 1, Jan. 2010.

[16] Y. Zheng, L. Liu, L. Wang, and X. Xie, “Learning transportationmode from raw gps data for geographic applications on theweb,” in Proceedings of the 17th international conference on WorldWide Web, ser. WWW ’08, 2008.

[17] Y. Zheng, Q. Li, Y. Chen, X. Xie, and W.-Y. Ma, “Understandingmobility based on gps data,” in Proceedings of the 10th inter-national conference on Ubiquitous computing, ser. UbiComp ’08,2008.

[18] R. K. Balan, K. X. Nguyen, and L. Jiang, “Real-time tripinformation service for a large taxi fleet,” in Proceedings ofthe international conference on Mobile systems, applications, andservices, ser. MobiSys ’11.

[19] L.-Y. Wei, Y. Zheng, and W.-C. Peng, “Constructing popularroutes from uncertain trajectories,” in Proceedings of the inter-national conference on Knowledge discovery and data mining, ser.KDD ’12.

[20] Y. Ge, H. Xiong, A. Tuzhilin, K. Xiao, M. Gruteser, and M. Paz-zani, “An energy-efficient mobile recommender system,” inProceedings of the 16th ACM SIGKDD international conference onKnowledge discovery and data mining, ser. KDD ’10, 2010.

[21] J. Powell, Y. Huang, F. Bastani, and M. Ji, “Towards reduc-ing taxicab cruising time using spatio-temporal profitabilitymaps,” in Proceedings of the 12th International Symposium onAdvances in Spatial and Temporal Databases, 2011.

[22] X. Liu, J. Biagioni, J. Eriksson, Y. Wang, G. Forman, and Y. Zhu,“Mining large-scale, sparse gps traces for map inference:comparison of approaches,” in Proceedings of the internationalconference on Knowledge discovery and data mining, ser. KDD ’12.

[23] J. Biagioni and J. Eriksson, “Map inference in the face of noiseand disparity,” in Proceedings of the 20th International Conferenceon Advances in Geographic Information Systems, ser. SIGSPATIAL’12.

[24] D. Lee, H. Wang, R. Cheu, and S. Teo, “Taxi dispatch systembased on current demands and real-time traffic conditions,” inJournal of the Transportation Research Board, 2004.

[25] H.-w. Chang, Y. Tai, and J. Y. Hsu, “Context-aware taxi de-mand hotspots prediction,” Int. J. Bus. Intell. Data Min., vol. 5,no. 1, Dec. 2010.

[26] H. Gonzalez, J. Han, X. Li, M. Myslinska, and J. P. Sondag,“Adaptive fastest path computation on a road network: atraffic mining approach,” in Proceedings of the 33rd internationalconference on Very large data bases, ser. VLDB ’07, 2007.

[27] B. D. Ziebart, A. L. Maas, A. K. Dey, and J. A. Bagnell,“Navigate like a cabbie: probabilistic reasoning from observedcontext-aware behavior,” in Proceedings of the 10th internationalconference on Ubiquitous computing, ser. UbiComp ’08, 2008.

[28] W. Wu, W. S. Ng, S. Krishnaswamy, and A. Sinha, “To taxi ornot to taxi? - enabling personalised and real-time transporta-tion decisions for mobile users,” in Proceedings of the 2012 IEEE13th International Conference on Mobile Data Management, ser.MDM ’12.

[29] J. Yuan, Y. Zheng, X. Xie, and G. Sun, “Driving with knowl-edge from the physical world,” in Proceedings of the internation-al conference on Knowledge discovery and data mining, ser. KDD’11.

[30] D. Zhang and T. He, “pcruise: Reducing cruising miles fortaxicab networks,” in IEEE 33rd Real-Time Systems Symposium(RTSS), 2012, pp. 85–94.

Desheng Zhang (M’10) is a Ph.D studentin the Department of Computer Science andEngineering at the University of Minnesota-Twin City. His research includes big data ana-lytics, mobile CPS, wireless sensor networks,intelligent transportation systems. He is amember of the IEEE.

Tian He (M’03-SM’12) is an associate pro-fessor with the Department of Computer Sci-ence and Engineering, University of Min-nesota Twin Cities. He is the coauthor ofmore than 100 papers in premier journalsand conferences with more than 12,000 ci-tations. As a recipient of the US NSF CA-REER Award’ 09, he served a few programchair position in international conferences.His research includes wireless sensor net-works, intelligent transportation systems, and

distributed systems. He is a member of the IEEE.

Page 15: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED …dz220/paper/pCruiseRevision.pdfage cruising mile is 2:43 to find a passenger with an average live mile of 3:77 per taxicab per trip.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 15

Shan Lin (M’03) is an assistant professorwith the Department of Electrical and Com-puter Engineering in Stony Brook University.He received his PhD in computer scienceat the University of Virginia. His research isin the area of networked systems, with anemphasis on feedback control based designin cyber physical systems. He works on wire-less network protocols, medical devices, firstresponder systems, and smart transportationsystems. He is a member of the IEEE.

Sirajum Munir (M’13) is a Ph.D student inthe Computer Science Department at theUniversity of Virginia. His research interestlies in the area of cyber physical systems,ubiquitous computing, and data mining. Hehas been working in several projects involv-ing smart home management, human-in-the-loop systems, and routing protocol design forreal-time applications. He is a member of theIEEE.

John A. Stankovic (F’94) received the PhDdegree from Brown University. He is the BPAmerica professor in the Computer ScienceDepartment at the University of Virginia. Inthe past, he served as chair of the Depart-ment for eight years. He also won the IEEEReal-Time Systems Technical CommitteesAward for Outstanding Technical Contribu-tions and Leadership. He also won the IEEETechnical Committee on Distributed Process-ings Distinguished Achievement Award (in-

augural winner). He has won six Best Paper awards including forACM SenSys 2006. He was the editor-in-chief for IEEE Transactionson Distributed and Parallel Systems. His research interests are incyber physical systems, distributed computing, real-time systems,and wireless sensor networks. He is a fellow of the IEEE.