Abstract: As a fundamental task in applications such as autonomous driving, mobile robotics, and virtual reality, 3D object detection has received extensive attention from researchers in recent years. It aims to localize and classify objects of interest in 3D space and to output the corresponding 3D bounding boxes, which describe each object's position, size, and orientation. These boxes provide the basic information for subsequent 3D scene understanding and perception, as well as for planning and decision-making. Point clouds captured by LiDAR have become the most commonly used input data for 3D object detection because they provide accurate 3D geometry and depth information. This paper reviews deep-learning-based 3D object detection methods for LiDAR point clouds, summarizes the characteristics and processing methods of point clouds, and introduces the corresponding types of detection methods as well as multimodal methods that fuse point clouds with images. It also compares the performance of different methods and discusses the challenges and future development trends of point-cloud-based 3D object detection.