Complexity of iff. An (curved) edge: Vertices: Only
Pankaj K. Agarwal, Boris Aronov, Sariel Har-Peled, Jeff M. Phillips, Ke Yi, and Wuzhou Zhang
Nearest-Neighbor Searching Under Uncertainty II
Model and Qualification Probability
Motivation
Prior Work Computing
Future Work
ACM SIGMOD–SIGACT–SIGART Symposium on PRINCIPLES OF DATABASE SYSTEMS (PODS 2013)
The PNN problem under the existential model
• The non-zero NN definition does not make sense
• Solutions here cannot be directly adapted
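Uncertain point: represented as a probability density function (pdf).
𝑞: any given query point.
For each uncertain point: the pdf and the cdf of its distance to 𝑞.
The qualification probability: the probability that an uncertain point is the nearest neighbor of 𝑞.
Nonzero NNs: the uncertain points whose qualification probability is nonzero.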
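The qualification probability can be written as an integral over the distance r from 𝑞. The formula below is a sketch in assumed notation, since the transcript does not preserve the original symbols: g_{q,i} and G_{q,i} stand for the pdf and cdf of the distance between 𝑞 and the i-th uncertain point, and π_i(q) for its qualification probability. It also assumes the uncertain points are independent.

```latex
\[
\pi_i(q) \;=\; \int_0^{\infty} g_{q,i}(r) \prod_{j \neq i} \bigl(1 - G_{q,j}(r)\bigr)\, dr
\]
```

Read term by term: the i-th point lies at distance r from 𝑞, and every other point lies farther than r.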
Data location is imprecise…
• Sensor databases
• Face recognition
• Mobile data
What is the “nearest neighbor” of 𝑞 now?
Acknowledgements
P. Agarwal and W. Zhang are supported by NSF under grants CCF-09-40671, CCF-10-12254, and CCF-11-61359, by ARO grants W911NF-07-1-0376 and W911NF-08-1-0452, and by an ERDC contract W9132V-11-C-0003. B. Aronov is supported by NSF grants CCF-08-30691, CCF-11-17336, and CCF-12-18791, and by NSA MSP Grant H98230-10-1-0210. S. Har-Peled is supported by NSF grants CCF-09-15984 and CCF-12-17462.
Nonzero NNs.
• In the case of disks [Evans et al. 2008]
• Voronoi-based heuristics [Zhang et al. 2013]
Computing qualification probabilities.
• Best-effort based [Kriegel et al. 2007][Cheng et al. 2008]
Other variants.
• Expected Nearest Neighbor [Agarwal et al. 2012]
• Superseding Nearest Neighbor [Yuen et al. 2010]
• Top-𝑘 NNs [Ljosa et al. 2007][Beskales et al. 2008]
Complexity of
• if assuming general disks.
• if pairwise disjoint disks of same radii.
• if has locations.
In all the cases, where , and is the output size.
Monte Carlo method
The number of instantiations is .
If each has a discrete pdf of size : , with probability at least
Spiral Search method
Only need to look at a small number of closest points!
Each has 𝑘 equally likely locations.
Estimate using 𝑚 closest points.
Independent of 𝑛!
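The two estimation ideas above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the input layout (each uncertain point as a list of equally likely locations), and the parameters `s` and `m` are assumptions made here. The source chooses the number of instantiations and the number of closest points to meet error bounds that the transcript does not preserve, so both are left as free parameters. Ties in distance are ignored.

```python
import math
import random
from collections import Counter


def monte_carlo_pnn(points, q, s, rng):
    """Estimate qualification probabilities from s random instantiations.

    points: list of uncertain points, each a list of (x, y) locations
            assumed equally likely (hypothetical input layout).
    """
    wins = Counter()
    for _ in range(s):
        # One instantiation: pick a single location for every uncertain point.
        sample = [rng.choice(locs) for locs in points]
        nearest = min(range(len(sample)), key=lambda i: math.dist(sample[i], q))
        wins[nearest] += 1
    return [wins[i] / s for i in range(len(points))]


def closest_m_pnn(points, q, m):
    """Estimate qualification probabilities from the m locations closest to q."""
    # Every location, tagged with its distance, its owner, and its probability mass.
    all_locs = sorted(
        (math.dist(loc, q), i, 1.0 / len(locs))
        for i, locs in enumerate(points)
        for loc in locs
    )
    est = [0.0] * len(points)
    mass_seen = [0.0] * len(points)  # mass of each point at smaller distances
    for _, i, w in all_locs[:m]:
        # This location wins if every other point lies farther from q.
        others_farther = 1.0
        for j in range(len(points)):
            if j != i:
                others_farther *= 1.0 - mass_seen[j]
        est[i] += w * others_farther
        mass_seen[i] += w
    return est


points = [[(1, 0), (4, 0)], [(2, 0), (3, 0)]]
q = (0, 0)
print(monte_carlo_pnn(points, q, 5000, random.Random(0)))
print(closest_m_pnn(points, q, 4))  # all locations: no truncation
print(closest_m_pnn(points, q, 2))  # truncated: can only underestimate
```

With `m` equal to the total number of locations the second function computes the probabilities without truncation; a smaller `m` drops the contribution of far locations, so each estimate can only fall short of the true value.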
Indexing schemes (using less space)
• If each uncertainty region is a disk,
• If each has possible locations,
Two sub-problems:
• Nonzero NNs.
• Computing qualification probabilities.
Nonzero Voronoi Diagram: for any 𝒯⊆𝒫,
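For disk-shaped uncertainty regions, the nonzero NNs of a single query can be found by a linear scan, without building the diagram. The sketch below assumes each uncertain point is a disk given as `((cx, cy), r)` with positive density throughout the disk. A point then has nonzero qualification probability when its smallest possible distance to 𝑞 is strictly below the largest possible distance of every other point. The function name and input layout are hypothetical.

```python
import math


def nonzero_nns(disks, q):
    """Return indices of disks that can be the nearest neighbor of q.

    disks: list of ((cx, cy), r) pairs (assumed representation).
    """
    # Smallest and largest possible distance from q to a point in each disk.
    near = [max(0.0, math.dist(c, q) - r) for c, r in disks]
    far = [math.dist(c, q) + r for c, r in disks]
    result = []
    for i in range(len(disks)):
        # Disk i qualifies if it can be closer than every other disk can be far.
        if all(near[i] < far[j] for j in range(len(disks)) if j != i):
            result.append(i)
    return result


disks = [((0, 0), 1), ((5, 0), 1), ((2.5, 0), 1)]
print(nonzero_nns(disks, (1.0, 0)))
```

This scan takes time quadratic in the number of disks per query as written. Comparing each `near[i]` against the smallest and second-smallest `far` value would make it linear.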
Nearest Neighbor (NN) Searching
Post office problem
𝑆: a set of points
𝑞: any query point
Find the closest one
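As a baseline for the classical problem, a linear scan already answers a query. The function name below is hypothetical, and Euclidean distance in the plane is assumed.

```python
import math


def nearest_neighbor(S, q):
    """Return the point of S closest to q under Euclidean distance."""
    if not S:
        raise ValueError("S must contain at least one point")
    return min(S, key=lambda p: math.dist(p, q))


print(nearest_neighbor([(0, 0), (3, 4), (1, 1)], (2, 2)))
```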
Probabilistic Nearest Neighbor (PNN)