November 2020

tl;dr: VoxelNet + UKF for 3D detection and tracking in crowded urban scenes.

Overall impression

The H3D dataset includes 160 scenes and about 30k frames, annotated at 2 Hz. Each scene lasts roughly 90 seconds.

The scenes are really crowded: H3D has roughly the same number of people and vehicles.

Key ideas

  • Use Lidar SLAM to register multiple lidar scans into a dense point cloud. Static objects then only have to be labeled once rather than frame by frame.
  • Camera is used to assist
    • Class annotation
    • 3D bbox verification, by projecting the 3D bbox back into the camera image
  • The 2 Hz annotation is propagated to 10 Hz with a linear velocity model.
  • 3D detection with VoxelNet and tracking with UKF.
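
The propagation from 2 Hz to 10 Hz can be pictured as interpolating each object between two consecutive keyframe annotations under a constant-velocity assumption. The snippet below is a minimal sketch of that idea, not the paper's implementation: the `Box` type, the `propagate` function and the choice to interpolate only center and yaw (box dimensions assumed fixed) are hypothetical names and simplifications introduced here for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical annotation: timestamp, 3D center, and heading."""
    t: float    # timestamp in seconds
    x: float
    y: float
    z: float
    yaw: float  # heading in radians


def propagate(a: Box, b: Box, rate_hz: float = 10.0) -> List[Box]:
    """Return the boxes strictly between keyframes a and b at rate_hz,
    assuming the object moves with constant linear and angular velocity."""
    dt = b.t - a.t
    steps = round(dt * rate_hz)
    # Take the shortest signed yaw difference so headings do not spin
    # the long way around across the -pi/pi boundary.
    dyaw = (b.yaw - a.yaw + math.pi) % (2 * math.pi) - math.pi
    out = []
    for k in range(1, steps):
        s = k / steps  # fraction of the way from a to b
        out.append(Box(
            t=a.t + s * dt,
            x=a.x + s * (b.x - a.x),
            y=a.y + s * (b.y - a.y),
            z=a.z + s * (b.z - a.z),
            yaw=a.yaw + s * dyaw,
        ))
    return out


# Two keyframes 0.5 s apart (2 Hz) yield four in-between boxes at 10 Hz.
filled = propagate(Box(0.0, 0.0, 0.0, 0.0, 0.0), Box(0.5, 5.0, 0.0, 0.0, 0.0))
```

With 2 Hz keyframes and a 10 Hz target, each keyframe pair contributes four intermediate boxes, which is where the five-fold increase in annotation rate comes from.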

Technical details

  • Calibration between GPS and Lidar is done with a hand-eye calibration method.
  • Motion blur in the lidar scans has to be corrected, using the method from LOAM.
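
To make the motion correction concrete: a spinning lidar collects the points of one sweep at slightly different times, so ego-motion during the sweep distorts the scan. A common remedy, in the spirit of LOAM's constant-velocity assumption, is to interpolate the sensor pose for each point's timestamp and re-express the point in the frame at the start of the sweep. The sketch below is a simplified planar version (translation plus yaw only) and is not the LOAM code; the function name `deskew`, the `(x, y, z, t)` point layout and the `sweep_time` default are assumptions made for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float, float]  # (x, y, z, t), t = offset from sweep start in seconds
Motion = Tuple[float, float, float]        # (dx, dy, dyaw) over one full sweep, in the start frame


def deskew(points: List[Point], motion: Motion,
           sweep_time: float = 0.1) -> List[Tuple[float, float, float]]:
    """Re-express every point in the sensor frame at sweep start,
    assuming constant linear and angular velocity during the sweep."""
    dx, dy, dyaw = motion
    out = []
    for x, y, z, t in points:
        s = t / sweep_time            # fraction of the sweep elapsed
        yaw = s * dyaw                # interpolated sensor heading
        c, sn = math.cos(yaw), math.sin(yaw)
        # Rotate by the interpolated heading, then add the interpolated translation.
        out.append((c * x - sn * y + s * dx,
                    sn * x + c * y + s * dy,
                    z))
    return out


# A point measured at the very end of a sweep in which the sensor moved 1 m forward.
corrected = deskew([(1.0, 0.0, 0.0, 0.1)], (1.0, 0.0, 0.0))
```

Points captured at the start of the sweep are left unchanged, while later points are shifted by a growing share of the ego-motion. A full 3D version would interpolate a 6-DoF pose instead of a planar one.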

Notes

  • Questions and notes on how to improve/revise the current work