Wang, Jun; Li, Xiaolong; Sullivan, Alan; Abbott, A. Lynn; Chen, Siheng
2023-02-28
2023-02-28
2022-06
9781665487399
2160-7508
http://hdl.handle.net/10919/114009
We propose a point-based spatiotemporal pyramid architecture, called PointMotionNet, to learn motion information from a sequence of large-scale 3D LiDAR point clouds. A core component of PointMotionNet is a novel technique for point-based spatiotemporal convolution, which finds the point correspondences across time by leveraging a time-invariant spatial neighboring space and extracts spatiotemporal features. To validate PointMotionNet, we consider two motion-related tasks: point-based motion prediction and multisweep semantic segmentation. For each task, we design an end-to-end system where PointMotionNet is the core module that learns motion information. We conduct extensive experiments and show that i) for point-based motion prediction, PointMotionNet achieves less than 0.5 m mean squared error on the Argoverse dataset, which is a significant improvement over existing methods; and ii) for multisweep semantic segmentation, PointMotionNet with a pretrained segmentation backbone outperforms previous SOTA by over 3.3% mIoU on the SemanticKITTI dataset with 25 classes including 6 moving objects.
Pages 4418-4427
application/pdf
en
In Copyright
PointMotionNet: Point-Wise Motion Learning for Large-Scale LiDAR Point Clouds Sequences
Conference proceeding
2023-02-25
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
https://doi.org/10.1109/CVPRW56347.2022.00488
2022-June
Abbott, Amos [0000-0003-3850-6771]
2160-7516