Lessons learned with laser scanning point cloud management ... · laser scanning point cloud...
Lessons learned with laser scanning point cloud management in Hadoop HBase
Prof. Debra Laefer
Center for Urban Science + Progress
New York University
June 2018
Laser scanning data
2015 Dublin point cloud
• Spatial coverage: > 2 km²
• Number of points: > 1.4 billion
• Size on disk: 30 GB in LAS format
• Precision: 3 cm
• Density: 300 points/m² (horizontal)
Open-access: https://geo.nyu.edu/catalog/nyu_2451_38684
HBase – a distributed database
Apache HBase
• Enables random access to data in the Hadoop Distributed File System
• Open-source implementation of Google’s Bigtable
• Is the database behind many Facebook services
• HBase is: distributed, non-relational (aka NoSQL), key-value based, column oriented
HBase’s underlying data structure
(a) Low-level data storage structure in HBase:
SortedMap<RowKey, List<SortedMap<Column, List<Value, Timestamp>>>>
(b) A high-level view of HBase data structure:
(Table, RowKey, Family, Column, Timestamp) → Value
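The two views of HBase's data structure on this slide can be mimicked with nested dictionaries. The sketch below is a toy, in-memory illustration only: `TinyTable` and its methods are hypothetical names, not part of the HBase client API, and the sample row keys and values are made up.

```python
class TinyTable:
    """Toy stand-in for one HBase table (hypothetical, not the HBase API)."""

    def __init__(self):
        # row key -> column family -> column -> {timestamp: value}
        self._rows = {}

    def put(self, row, family, column, value, timestamp):
        """Store one versioned cell."""
        cell = (self._rows.setdefault(row, {})
                          .setdefault(family, {})
                          .setdefault(column, {}))
        cell[timestamp] = value

    def get(self, row, family, column, timestamp=None):
        """Return the value at `timestamp`, or the newest version if omitted."""
        versions = self._rows.get(row, {}).get(family, {}).get(column, {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def row_keys(self):
        """Row keys in sorted order, as in the SortedMap view."""
        return sorted(self._rows)


# Made-up sample data: two rows, one of them with two versions of a cell.
table = TinyTable()
table.put(b"0002", "p", "xyz", b"v1", timestamp=1)
table.put(b"0002", "p", "xyz", b"v2", timestamp=2)
table.put(b"0001", "p", "xyz", b"v0", timestamp=1)
```

Here `table.get(b"0002", "p", "xyz")` resolves the full (RowKey, Family, Column, Timestamp) coordinate to a single value, defaulting to the newest timestamp.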
Data models for point cloud management in HBase
Expectations:
• Scalability (distributed)
• Flexibility (schema-less)
• Performance (due to parallelism)
4 data models:
• 2 row-key arrangements: Dual Hilbert code, and Single Hilbert code
• 2 column structures: Grouped Attributes and Separate Attributes
Figure: 4 data models
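The slide does not spell out how the Hilbert codes or the column structures are built, so the following is only a minimal sketch under stated assumptions. It uses the textbook 2D Hilbert-curve index on an integer grid, packs it into a fixed-width big-endian row key, and shows one plausible reading of "Grouped" versus "Separate" attributes. All function names, the `p:` column names, and the sample point are hypothetical and are not the presenters' schema.

```python
import struct


def hilbert_index(order, x, y):
    """Index of grid cell (x, y) along a Hilbert curve on a 2**order square grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def row_key(order, x, y):
    """Fixed-width big-endian key: byte order then matches Hilbert order."""
    return struct.pack(">Q", hilbert_index(order, x, y))


def grouped_cells(point):
    """Assumed 'Grouped Attributes' layout: all attributes packed into one cell."""
    packed = struct.pack(">dddH", point["x"], point["y"], point["z"],
                         point["intensity"])
    return {"p:all": packed}


def separate_cells(point):
    """Assumed 'Separate Attributes' layout: one cell per attribute."""
    return {
        "p:x": struct.pack(">d", point["x"]),
        "p:y": struct.pack(">d", point["y"]),
        "p:z": struct.pack(">d", point["z"]),
        "p:intensity": struct.pack(">H", point["intensity"]),
    }


# Made-up sample point, already assigned to integer grid cell (5, 2).
sample_point = {"x": 1.0, "y": 2.0, "z": 3.0, "intensity": 100}
sample_key = row_key(3, 5, 2)
```

A fixed-width key matters because HBase orders rows by the bytes of the row key, so points that are close along the curve are stored near each other. The grouped layout writes a single cell per row, while the separate layout writes one cell per attribute.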
Data ingestion
Figure: Data ingestion workflow
Performance evaluation – Point queries
Point queries:
• Model 3 is slowest; the remaining models are comparable.
• More than 5 times faster than pgPointCloud
• All data models are scalable
• Difference between hot and cold queries is obvious
Figures: Hot point query response times; Cold point query response times (Data size: 90M, 365M, 1420M)
Performance evaluation – Range queries
Range queries:
• Model 4 outperforms all other models
• Model 3 is slowest
• Difference with pgPointCloud is less obvious
• All data models are scalable
Figures: Hot range query response times; Cold range query response times (Data size: 90M, 365M, 1420M)
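To make the distinction between the two query types concrete, here is a minimal sketch of point and range lookups over a store sorted by key, which is the property HBase row keys provide. `SortedStore` is a hypothetical in-memory class, not the HBase client; in HBase the analogous operations are a Get on one row key and a Scan between a start and a stop row. The integer keys and `pt…` values are made up.

```python
import bisect


class SortedStore:
    """Toy key-ordered store (hypothetical; stands in for sorted row keys)."""

    def __init__(self, items):
        pairs = sorted(items, key=lambda kv: kv[0])
        self._keys = [k for k, _ in pairs]
        self._values = [v for _, v in pairs]

    def point_query(self, key):
        """Exact-match lookup of a single key via binary search."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def range_query(self, start, stop):
        """All values whose key lies in the half-open interval [start, stop)."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return self._values[lo:hi]


# Made-up keys standing in for space-filling-curve codes.
store = SortedStore((k, f"pt{k}") for k in [5, 1, 9, 3, 7])
hit = store.point_query(3)
window = store.range_query(3, 8)
```

A spatial window generally maps to several such key intervals along a space-filling curve, so a real range query would likely issue more than one scan; that decomposition is outside this sketch.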
Concluding remarks
• 4 data models were investigated for storing, indexing, and querying point clouds in a distributed, non-relational database.
• All HBase data models were scalable, including the flat, one-point-per-row models, which previously hit the scalability wall in relational implementations.
• Separation of point attributes to take advantage of the schemaless feature of HBase introduced some overheads to both data consumption and querying costs.
• Model 4, which resembles Oracle’s SDO_PC and PostgreSQL’s PCPATCH, appears to be the most performant data model. Model 4 does not fully utilize HBase’s advantageous features.