This repository contains my paper reading notes on deep learning and machine learning. It is inspired by Denny Britz and Daniel Takeshi.
New Year's resolution for 2020: read at least three papers a week and one high-quality GitHub repo a month!
The summary of the papers read in 2019 can be found [here on Towards Data Science](https://towardsdatascience.com/the-200-deep-learning-papers-i-read-in-2019-7fto fb7034f05f7?source=friends_link&sk=7628c5be39f876b2c05e43c13d0b48a3).
The sections below record paper reading activity in chronological order. See notes organized according to subfields here (up to 06-2019).
Here is a list of trustworthy sources of papers in case I run out of papers to read.
The list of resources in this link covers various topics in autonomous driving.
- simple-faster-rcnn-pytorch (2.1k stars) [Notes]
- YOLACT/YOLACT++ [2.1k stars]
- YOLOv3 Ultralytics [4.7k stars]
- MonoLoco [131 stars]
- A Baseline for 3D Multi-Object Tracking [548 stars]
- ROLO: recurrent YOLO
- PointRend
- Carla data export
- openpilot
- 3D Lane Dataset
- MicroGrad
- OpenVSLAM (2.3k stars)
- ORB SLAM2 and Docker version
- PySLAM v2
- Monocular 3D Object Detection in Autonomous Driving — A Review
- Self-supervised Keypoint Learning — A Review
- Single Stage Instance Segmentation — A Review
- Self-paced Multitask Learning — A Review
- Convolutional Neural Networks with Heterogeneous Metadata
- Lifting 2D object detection to 3D in autonomous driving
- Multimodal Regression
- PIE: A Large-Scale Dataset and Models for Pedestrian Intention Estimation and Trajectory Prediction [Notes] ICCV 2019
- JAAD: Are They Going to Cross? A Benchmark Dataset and Baseline for Pedestrian Crosswalk Behavior ICCV 2017
- Pedestrian Action Anticipation using Contextual Feature Fusion in Stacked RNNs BMVC 2019
- Is the Pedestrian going to Cross? Answering by 2D Pose Estimation IV 2018
- Attentive Single-Tasking of Multiple Tasks CVPR 2019
- DETR: End-to-End Object Detection with Transformers [Notes] [FAIR]
- Transformer: Attention Is All You Need [Notes] NIPS 2017
- On the uncertainty of self-supervised monocular depth estimation CVPR 2020
- MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships [Notes] CVPR 2020 [Mono3D]
- SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation [Notes] CVPRW 2020 [Mono3D, Zongmu]
- PSDet: Efficient and Universal Parking Slot Detection IV 2020 [Zongmu]
- Towards Good Practice for CNN-Based Monocular Depth Estimation WACV 2020
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection AAAI 2020 oral [mono3D]
- End-to-End Lane Marker Detection via Row-wise Classification [Notes] [Qualcomm Korea, LLD as cls]
- Reliable multilane detection and classification by utilizing CNN as a regression network ECCV 2018 [LLD as reg]
- SUPER: A Novel Lane Detection System [Notes]
- Learning Lightweight Lane Detection CNNs by Self Attention Distillation ICCV 2019
- StixelNet: A Deep Convolutional Network for Obstacle Detection and Road Segmentation BMVC 2015
- StixelNetV2: Real-time category-based and general obstacle detection for autonomous driving [Notes] ICCV 2017 [DS]
- Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network [Notes] CVPR 2016 [channel-to-pixel]
- Car Pose in Context: Accurate Pose Estimation with Ground Plane Constraints [mono3D]
- D4LCN: Learning Depth-Guided Convolutions for Monocular 3D Object Detection [Notes] CVPR 2020
- Self-Mono-SF: Self-Supervised Monocular Scene Flow Estimation [Notes] CVPR 2020 [Stereo input]
- MEBOW: Monocular Estimation of Body Orientation In the Wild [Notes] CVPR 2020
- Online Depth Learning against Forgetting in Monocular Videos CVPR 2020
- Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume CVPR 2020
- GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose CVPR 2018
- Self-supervised Object Motion and Depth Estimation from Video CVPRW 2020
- Visual SLAM for Automated Driving: Exploring the Applications of Deep Learning
- Just Go with the Flow: Self-Supervised Scene Flow Estimation CVPR 2020 oral [Scene flow]
- Self-Supervised Deep Visual Odometry with Online Adaptation CVPR 2020 oral
- Visualization of Convolutional Neural Networks for Monocular Depth Estimation ICCV 2019 [monodepth]
- Egocentric Vision-based Future Vehicle Localization for Intelligent Driving Assistance Systems [Notes] [Honda] ICRA 2019
- PackNet: 3D Packing for Self-Supervised Monocular Depth Estimation [Notes] CVPR 2020 oral [Scale aware depth]
- PackNet-SG: Semantically-Guided Representation Learning for Self-Supervised Monocular Depth [Notes] ICLR 2020 [TRI, infinite-depth problem]
- TrianFlow: Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [Notes] CVPR 2020 [Scale aware]
- Understanding the Limitations of CNN-based Absolute Camera Pose Regression [Notes] CVPR 2019 [Drawbacks of PoseNet, MapNet]
- To Learn or Not to Learn: Visual Localization from Essential Matrices [Notes] ICRA 2020 [SIFT + 5 pt solver >> others for VO]
- DF-VO: Visual Odometry Revisited: What Should Be Learnt? [Notes] ICRA 2020 [Depth and Flow for accurate VO]
- D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry [Notes] CVPR 2020 oral [Daniel Cremers, TUM]
- Network Slimming: Learning Efficient Convolutional Networks through Network Slimming [Notes] ICCV 2017
- BatchNorm Pruning: Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers [Notes] ICLR 2018
- Direct Sparse Odometry PAMI 2018
- PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume CVPR 2018 oral [Optical flow]
- LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation CVPR 2018 [Optical flow]
- FlowNet: Learning Optical Flow With Convolutional Networks ICCV 2015 [Optical flow]
- Train in Germany, Test in The USA: Making 3D Object Detectors Generalize [Notes] CVPR 2020
- PseudoLidarV3: End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [Notes] CVPR 2020
- ATSS: Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection [Notes] CVPR 2020 oral
- Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression AAAI 2020
- Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation [Journal version]
- YOLOv4: Optimal Speed and Accuracy of Object Detection [Notes]
- CBN: Cross-Iteration Batch Normalization [Notes]
- Stitcher: Feedback-driven Data Provider for Object Detection [Notes]
- SKNet: Selective Kernel Networks [Notes] CVPR 2019
- CBAM: Convolutional Block Attention Module [Notes] ECCV 2018
- EfficientDet: Scalable and Efficient Object Detection CVPR 2020
- ResNeSt: Split-Attention Networks [Notes]
- ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst [Notes] RSS 2019 [Waymo]
- IntentNet: Learning to Predict Intention from Raw Sensor Data [Notes] CoRL 2018 [Uber ATG, perception and prediction, Lidar+Map]
- RoR: Rules of the Road: Predicting Driving Behavior with a Convolutional Model of Semantic Interactions [Notes] CVPR 2019 [Zoox]
- MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction [Notes] CoRL 2019 [Waymo, authors from RoR and ChauffeurNet]
- NMP: End-to-end Interpretable Neural Motion Planner [Notes] CVPR 2019 oral [Uber ATG]
- Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks [Notes] ICRA 2019 [Multimodal]
- Jointly Learnable Behavior and Trajectory Planning for Self-Driving Vehicles IROS 2019 Oral [Uber ATG, behavioral planning, motion planning]
- TensorMask: A Foundation for Dense Object Segmentation [Notes] ICCV 2019 [single-stage instance seg]
- BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation [Notes] CVPR 2020
- Mask Encoding for Single Shot Instance Segmentation [Notes] CVPR 2020 oral [single-stage instance seg, Chunhua Shen]
- PolarMask: Single Shot Instance Segmentation with Polar Representation [Notes] CVPR 2020 oral [single-stage instance seg]
- SOLO: Segmenting Objects by Locations [Notes] [single-stage instance seg, Chunhua Shen]
- SOLOv2: Dynamic, Faster and Stronger [Notes] [single-stage instance seg, Chunhua Shen]
- CondInst: Conditional Convolutions for Instance Segmentation [Notes] [single-stage instance seg, Chunhua Shen]
- VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition [Notes] ICCV 2017
- Which Tasks Should Be Learned Together in Multi-task Learning? [Notes] [Stanford]
- Multi-Task Learning as Multi-Objective Optimization NeurIPS 2018
- Taskonomy: Disentangling Task Transfer Learning [Notes] CVPR 2018
- Rethinking ImageNet Pre-training [Notes] ICCV 2019 [Kaiming He]
- UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor [Notes] [superpoint]
- KP2D: Neural Outlier Rejection for Self-Supervised Keypoint Learning [Notes] ICLR 2020 (pointNet)
- KP3D: Self-Supervised 3D Keypoint Learning for Ego-motion Estimation [Notes] [Toyota, superpoint]
- NG-RANSAC: Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses [Notes] ICCV 2019 [pointNet]
- Learning to Find Good Correspondences [Notes] CVPR 2018 Oral (pointNet)
- RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving [Notes] [Huawei, Mono3D]
- DSP: Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation [Notes] AAAI 2020 (SenseTime, Mono3D)
- Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks (LLD, LSTM)
- LaneNet: Towards End-to-End Lane Detection: an Instance Segmentation Approach [Notes] IV 2018 (LaneNet)
- 3D-LaneNet: End-to-End 3D Multiple Lane Detection [Notes] ICCV 2019
- Semi-Local 3D Lane Detection and Uncertainty Estimation [Notes] [GM Israel, 3D LLD]
- Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection [Notes] [Apollo, 3D LLD]
- Long-Term On-Board Prediction of People in Traffic Scenes under Uncertainty CVPR 2018 [Egocentric prediction]
- Associative Embedding: End-to-End Learning for Joint Detection and Grouping [Notes] NIPS 2017
- Pixels to Graphs by Associative Embedding [Notes] NIPS 2017
- Social LSTM: Human Trajectory Prediction in Crowded Spaces [Notes] CVPR 2017
- Online Video Object Detection using Association LSTM [Notes] [single stage, recurrent]
- SuperPoint: Self-Supervised Interest Point Detection and Description [Notes] CVPR 2018 (channel-to-pixel, deep SLAM, Magic Leap)
- PointRend: Image Segmentation as Rendering [Notes] CVPR 2020 Oral [Kaiming He, FAIR]
- Multigrid: A Multigrid Method for Efficiently Training Video Models [Notes] CVPR 2020 Oral [Kaiming He, FAIR]
- GhostNet: More Features from Cheap Operations [Notes] CVPR 2020
- FixRes: Fixing the train-test resolution discrepancy [Notes] NIPS 2019 [FAIR]
- VirtualCam: Single-Stage Monocular 3D Object Detection with Virtual Cameras [Notes] [Mapillary, Mono3D]
- Amodal Completion and Size Constancy in Natural Scenes [Notes] ICCV 2015 (Amodal completion)
- MoCo: Momentum Contrast for Unsupervised Visual Representation Learning [Notes] CVPR 2020 Oral [FAIR, Kaiming He]
- Double Descent: Reconciling modern machine learning practice and the bias-variance trade-off [Notes] PNAS 2019
- Deep Double Descent: Where Bigger Models and More Data Hurt [Notes]
- Visualizing the Loss Landscape of Neural Nets NIPS 2018
- The ApolloScape Open Dataset for Autonomous Driving and its Application CVPR 2018 (dataset)
- ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving [Notes] CVPR 2019
- Part-level Car Parsing and Reconstruction from a Single Street View [Notes] [Baidu]
- 6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images [Notes] CVPR 2019
- RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving [Notes]
- DORN: Deep Ordinal Regression Network for Monocular Depth Estimation [Notes] CVPR 2018
- D&T: Detect to Track and Track to Detect [Notes] ICCV 2017 (from Feichtenhofer)
- CRF-Net: A Deep Learning-based Radar and Camera Sensor Fusion Architecture for Object Detection [Notes] SDF 2019 (radar detection)
- RVNet: Deep Sensor Fusion of Monocular Camera and Radar for Image-based Obstacle Detection in Challenging Environments [Notes] PSIVT 2019
- RRPN: Radar Region Proposal Network for Object Detection in Autonomous Vehicles [Notes] ICIP 2019
- ROLO: Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking [Notes] ISCAS 2016
- Recurrent SSD: Recurrent Multi-frame Single Shot Detector for Video Object Detection [Notes] BMVC 2018 (Mitsubishi)
- Recurrent RetinaNet: A Video Object Detection Model Based on Focal Loss [Notes] ICONIP 2018 (single stage, recurrent)
- Actions as Moving Points [Notes] [not suitable for online]
- The PREVENTION dataset: a novel benchmark for PREdiction of VEhicles iNTentIONs [Notes] ITSC 2019
- Semi-Automatic High-Accuracy Labelling Tool for Multi-Modal Long-Range Sensor Dataset [Notes] IV 2018
- Astyx dataset: Automotive Radar Dataset for Deep Learning Based 3D Object Detection [Notes] EuRAD 2019 (Astyx)
- Astyx camera radar: Deep Learning Based 3D Object Detection for Automotive Radar and Camera [Notes] EuRAD 2019 (Astyx)
- How Do Neural Networks See Depth in Single Images? [Notes] ICCV 2019
- Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera ICRA 2019 (depth completion)
- DC: Depth Coefficients for Depth Completion [Notes] CVPR 2019 [Xiaoming Liu, Multimodal]
- Parse Geometry from a Line: Monocular Depth Estimation with Partial Laser Observation [Notes] ICRA 2017
- PointPainting: Sequential Fusion for 3D Object Detection (nuScenes)
- VO-Monodepth: Enhancing self-supervised monocular depth estimation with traditional visual odometry [Notes] 3DV 2019 (sparse to dense)
- Probabilistic Object Detection: Definition and Evaluation [Notes]
- The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation [Notes] ICCV 2019
- On Calibration of Modern Neural Networks [Notes] ICML 2017 (Weinberger)
- Extreme clicking for efficient object annotation [Notes] ICCV 2017
- Radar and Camera Early Fusion for Vehicle Detection in Advanced Driver Assistance Systems [Notes] NeurIPS 2019 (radar)
- Deep Active Learning for Efficient Training of a LiDAR 3D Object Detector [Notes] IV 2019
- C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion [Notes] ICCV 2019
- YOLACT: Real-time Instance Segmentation [Notes] ICCV 2019 [single-stage instance seg]
- YOLACT++: Better Real-time Instance Segmentation [single-stage instance seg]
- Review of Image and Feature Descriptors
- Vehicle Detection With Automotive Radar Using Deep Learning on Range-Azimuth-Doppler Tensors [Notes] ICCV 2019
- GPP: Ground Plane Polling for 6DoF Pose Estimation of Objects on the Road [Notes] IV 2020 [UCSD, Trivedi, mono 3DOD]
- MVRA: Multi-View Reprojection Architecture for Orientation Estimation [Notes] ICCV 2019
- YOLOv3: An Incremental Improvement
- Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving [Notes] ICCV 2019 (Detection with Uncertainty)
- Bayesian YOLOv3: Uncertainty Estimation in One-Stage Object Detection [Notes] [DriveU]
- Towards Safe Autonomous Driving: Capture Uncertainty in the Deep Neural Network For Lidar 3D Vehicle Detection [Notes] ITSC 2018 (DriveU)
- Leveraging Heteroscedastic Aleatoric Uncertainties for Robust Real-Time LiDAR 3D Object Detection [Notes] IV 2019 (DriveU)
- Can We Trust You? On Calibration of a Probabilistic Object Detector for Autonomous Driving [Notes] IROS 2019 (DriveU)
- LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving [Notes] CVPR 2019 (uncertainty)
- LaserNet KL: Learning an Uncertainty-Aware Object Detector for Autonomous Driving [Notes] [LaserNet with KL divergence]
- IoUNet: Acquisition of Localization Confidence for Accurate Object Detection [Notes] ECCV 2018
- gIoU: Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression [Notes] CVPR 2019
- KL Loss: Bounding Box Regression with Uncertainty for Accurate Object Detection [Notes] CVPR 2019
- CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth [Notes] CVPR 2019
- BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors [Notes]
- TW-SMNet: Deep Multitask Learning of Tele-Wide Stereo Matching [Notes] ICIP 2019
- Accurate Uncertainties for Deep Learning Using Calibrated Regression [Notes] ICML 2018
- Calibrating Uncertainties in Object Localization Task [Notes] NIPS 2018
- SMWA: On the Over-Smoothing Problem of CNN Based Disparity Estimation [Notes] ICCV 2019 [Multimodal, depth estimation]
- Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image [Notes] ICRA 2018 (depth completion)
- Review of monocular object detection
- Review of 2D 3D constraints in Mono 3DOD
- MonoGRNet 2: Monocular 3D Object Detection via Geometric Reasoning on Keypoints [Notes] [estimates depth from keypoints]
- Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image [Notes] CVPR 2017
- SS3D: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss [Notes] [regress distance from images, centernet like]
- GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving [Notes] CVPR 2019
- M3D-RPN: Monocular 3D Region Proposal Network for Object Detection [Notes] ICCV 2019 (Xiaoming Liu)
- TLNet: Triangulation Learning Network: from Monocular to Stereo 3D Object Detection [Notes] CVPR 2019
- A Survey on 3D Object Detection Methods for Autonomous Driving Applications [Notes] TITS 2019 [Review]
- BEV-IPM: Deep Learning based Vehicle Position and Orientation Estimation via Inverse Perspective Mapping Image [Notes] IV 2019
- ForeSeE: Task-Aware Monocular Depth Estimation for 3D Object Detection [Notes] AAAI 2020 oral [successor to pseudo-lidar, mono 3DOD SOTA]
- Obj-dist: Learning Object-specific Distance from a Monocular Image [Notes] ICCV 2019 (xmotors.ai + NYU)
- DisNet: A novel method for distance estimation from monocular camera [Notes] IROS 2018
- BirdGAN: Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles [Notes] IROS 2019
- Shift R-CNN: Deep Monocular 3D Object Detection with Closed-Form Geometric Constraints [Notes] ICIP 2019
- 3D-RCNN: Instance-level 3D Object Reconstruction via Render-and-Compare [Notes] CVPR 2018
- Deep Optics for Monocular Depth Estimation and 3D Object Detection [Notes] ICCV 2019
- MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation [Notes] ICCV 2019
- Joint Monocular 3D Vehicle Detection and Tracking [Notes] ICCV 2019 (Berkeley DeepDrive)
- CasGeo: 3D Bounding Box Estimation for Autonomous Vehicles by Cascaded Geometric Constraints and Depurated 2D Detections Using 3D Results [Notes]
- Slimmable Neural Networks [Notes] ICLR 2019
- Universally Slimmable Networks and Improved Training Techniques [Notes] ICCV 2019
- AutoSlim: Towards One-Shot Architecture Search for Channel Numbers
- Once for All: Train One Network and Specialize it for Efficient Deployment
- DOTA: A Large-scale Dataset for Object Detection in Aerial Images [Notes] CVPR 2018 (rotated bbox)
- RoiTransformer: Learning RoI Transformer for Oriented Object Detection in Aerial Images [Notes] CVPR 2019 (rotated bbox)
- RRPN: Arbitrary-Oriented Scene Text Detection via Rotation Proposals TMM 2018
- R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection (rotated bbox)
- TI white paper: Webinar: mmWave Radar for Automotive and Industrial applications [Notes] [TI, radar]
- Federated Learning: Strategies for Improving Communication Efficiency [Notes] NIPS 2016
- sort: Simple Online and Realtime Tracking [Notes] ICIP 2016
- deep-sort: Simple Online and Realtime Tracking with a Deep Association Metric [Notes]
- MT-CNN: Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks [Notes] SPL 2016 (real time, facial landmark)
- RetinaFace: Single-stage Dense Face Localisation in the Wild [Notes] [joint object and landmark detection]
- SC-SfM-Learner: Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video [Notes] NIPS 2019
- SiamMask: Fast Online Object Tracking and Segmentation: A Unifying Approach CVPR 2019 (tracking, segmentation, label propagation)
- Review of Kálmán Filter (from Tim Babb, Pixar Animation) [Notes]
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [Notes] NIPS 2016
- Guided backprop: Striving for Simplicity: The All Convolutional Net [Notes] ICLR 2015
- Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks [Notes] CVPR 2019
- Boxy Vehicle Detection in Large Images [Notes] ICCV 2019
- FQNet: Deep Fitting Degree Scoring Network for Monocular 3D Object Detection [Notes] CVPR 2019 (Mono 3DOD, Jiwen Lu)
- Mono3D: Monocular 3D Object Detection for Autonomous Driving [Notes] CVPR 2016
- MonoDIS: Disentangling Monocular 3D Object Detection [Notes] ICCV 2019
- Pseudo lidar-e2e: Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud [Notes] ICCV 2019 (pseudo-lidar with 2d and 3d consistency loss, better than PL and worse than PL++, SOTA for pure mono3D)
- MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization [Notes] AAAI 2019 (SOTA of Mono3DOD, MLF < MonoGRNet < Pseudo-lidar)
- MLF: Multi-Level Fusion based 3D Object Detection from Monocular Images [Notes] CVPR 2018 (precursor to pseudo-lidar)
- ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape [Notes] CVPR 2019
- Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving [Notes] ICCV 2019 [similar to pseudo-lidar, color-enhanced]
- Mono3D++: Monocular 3D Vehicle Detection with Two-Scale 3D Hypotheses and Task Priors [Notes] (from Stefano Soatto) AAAI 2019
- Deep Metadata Fusion for Traffic Light to Lane Assignment [Notes] IEEE RA-L 2019 (traffic lights association)
- Automatic Traffic Light to Ego Vehicle Lane Association at Complex Intersections ITSC 2019 (traffic lights association)
- Distant Vehicle Detection Using Radar and Vision [Notes] ICRA 2019 [radar, vision, radar tracklets fusion]
- Distance Estimation of Monocular Based on Vehicle Pose Information [Notes]
- Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics [Notes] CVPR 2018 (Alex Kendall)
- GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks [Notes] ICML 2018 (multitask)
- DTP: Dynamic Task Prioritization for Multitask Learning [Notes] ECCV 2018 [multitask, Stanford]
- Will this car change the lane? - Turn signal recognition in the frequency domain [Notes] IV 2014
- Complex-YOLO: Real-time 3D Object Detection on Point Clouds [Notes] (BEV detection only)
- Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds CVPR 2019 (sensor fusion and tracking)
- An intriguing failing of convolutional neural networks and the CoordConv solution [Notes] NIPS 2018
- Deep Parametric Continuous Convolutional Neural Networks [Notes] CVPR 2018 (@Uber, sensor fusion)
- ContFuse: Deep Continuous Fusion for Multi-Sensor 3D Object Detection [Notes] ECCV 2018 [Uber ATG, sensor fusion, BEV]
- Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net [Notes] CVPR 2018 oral [lidar only, perception and prediction]
- Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras [Notes] ICCV 2019 [monocular depth estimation, intrinsic estimation, SOTA]
- monodepth: Unsupervised Monocular Depth Estimation with Left-Right Consistency [Notes] CVPR 2017 oral (monocular depth estimation, stereo for training)
- Struct2depth: Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos [Notes] AAAI 2019 (monocular depth estimation, estimating movement of dynamic object)
- Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency [Notes] AAAI 2018 (monocular depth estimation, static assumption, surface normal)
- LEGO: Learning Edge with Geometry all at Once by Watching Videos [Notes] CVPR 2018 spotlight (monocular depth estimation, static assumption, surface normal)
- Object Detection and 3D Estimation via an FMCW Radar Using a Fully Convolutional Network [Notes] (radar, RD map, OD, Arxiv 201902)
- A study on Radar Target Detection Based on Deep Neural Networks [Notes] (radar, RD map, OD)
- 2D Car Detection in Radar Data with PointNets [Notes] (from Ulm Univ, radar, point cloud, OD, Arxiv 201904)
- Learning Confidence for Out-of-Distribution Detection in Neural Networks [Notes] (budget to cheat)
- A Deep Learning Approach to Traffic Lights: Detection, Tracking, and Classification [Notes] ICRA 2017 (Bosch, traffic lights)
- How hard can it be? Estimating the difficulty of visual search in an image [Notes] CVPR 2016
- Deep Multi-modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges [Notes] (review from Bosch)
- Review of monocular 3d object detection (blog from Zhihu)
- Deep3dBox: 3D Bounding Box Estimation Using Deep Learning and Geometry [Notes] CVPR 2017 [Zoox]
- MonoPSR: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction [Notes] CVPR 2019
- OFT: Orthographic Feature Transform for Monocular 3D Object Detection [Notes] BMVC 2019 [Convert camera to BEV, Alex Kendall]
- MixMatch: A Holistic Approach to Semi-Supervised Learning [Notes]
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [Notes] ICML 2019
- What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? [Notes] NIPS 2017
- Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding [Notes] BMVC 2017
- TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents [Notes] AAAI 2019 oral
- Deep Depth Completion of a Single RGB-D Image [Notes] CVPR 2018 (indoor)
- DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene from Sparse LiDAR Data and Single Color Image [Notes] CVPR 2019 (outdoor)
- SfMLearner: Unsupervised Learning of Depth and Ego-Motion from Video [Notes] CVPR 2017
- Monodepth2: Digging Into Self-Supervised Monocular Depth Estimation [Notes] ICCV 2019 [Niantic]
- DeepSignals: Predicting Intent of Drivers Through Visual Signals [Notes] ICRA 2019 (@Uber, turn signal detection)
- FCOS: Fully Convolutional One-Stage Object Detection [Notes] ICCV 2019 [Chunhua Shen]
- Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving [Notes] ICLR 2020
- MMF: Multi-Task Multi-Sensor Fusion for 3D Object Detection [Notes] CVPR 2019 (@Uber, sensor fusion)
- CenterNet: Objects as points (from ExtremeNet authors) [Notes]
- CenterNet: Object Detection with Keypoint Triplets [Notes]
- Object Detection based on Region Decomposition and Assembly [Notes] AAAI 2019
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks [Notes] ICLR 2019
- M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network [Notes] AAAI 2019
- Deep Radar Detector [Notes] RadarCon 2019
- Semantic Segmentation on Radar Point Clouds [Notes] (from Daimler AG) FUSION 2018
- Pruning Filters for Efficient ConvNets [Notes] ICLR 2017
- Layer-compensated Pruning for Resource-constrained Convolutional Neural Networks [Notes] NIPS 2018 talk
- LeGR: Filter Pruning via Learned Global Ranking [Notes]
- NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection [Notes] CVPR 2019
- AutoAugment: Learning Augmentation Policies from Data [Notes] CVPR 2019
- Path Aggregation Network for Instance Segmentation [Notes] CVPR 2018
- Channel Pruning for Accelerating Very Deep Neural Networks ICCV 2017 (Face++, Yihui He) [Notes]
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices ECCV 2018 (Song Han, Yihui He)
- MobileNetV3: Searching for MobileNetV3 [Notes]
- MnasNet: Platform-Aware Neural Architecture Search for Mobile [Notes] CVPR 2019
- Rethinking the Value of Network Pruning ICLR 2019
- MobileNetV2: Inverted Residuals and Linear Bottlenecks (MobileNets v2) [Notes] CVPR 2018
- A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms [Notes] ITSC 2013
- MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving [Notes]
- Optimizing the Trade-off between Single-Stage and Two-Stage Object Detectors using Image Difficulty Prediction (Very nice illustration of 1 and 2 stage object detection)
- Light-Head R-CNN: In Defense of Two-Stage Object Detector [Notes] (from Megvii)
- CSP: High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection (center and scale prediction) [Notes] CVPR 2019
- Review of Anchor-free methods (Zhihu blog): Object Detection: The Anchor-Free Era; Anchor-free deep learning object detection methods; My Slides on CSP
- DenseBox: Unifying Landmark Localization with End to End Object Detection
- CornerNet: Detecting Objects as Paired Keypoints [Notes] ECCV 2018
- ExtremeNet: Bottom-up Object Detection by Grouping Extreme and Center Points [Notes] CVPR 2019
- FSAF: Feature Selective Anchor-Free Module for Single-Shot Object Detection [Notes] CVPR 2019
- FoveaBox: Beyond Anchor-based Object Detector (anchor-free) [Notes]
- Bag of Freebies for Training Object Detection Neural Networks [Notes]
- mixup: Beyond Empirical Risk Minimization [Notes] ICLR 2018
- Multi-view Convolutional Neural Networks for 3D Shape Recognition (MVCNN) [Notes] ICCV 2015
- 3D ShapeNets: A Deep Representation for Volumetric Shapes [Notes] CVPR 2015
- Volumetric and Multi-View CNNs for Object Classification on 3D Data [Notes] CVPR 2016
- Group Normalization [Notes] ECCV 2018
- Spatial Transformer Networks [Notes] NIPS 2015
- Frustum PointNets for 3D Object Detection from RGB-D Data (F-PointNet) [Notes] CVPR 2018
- Dynamic Graph CNN for Learning on Point Clouds [Notes]
- PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud (SOTA for 3D object detection) [Notes] CVPR 2019
- Multi-View 3D Object Detection Network for Autonomous Driving (MV3D) [Notes] CVPR 2017 (Baidu, sensor fusion, BV proposal)
- Joint 3D Proposal Generation and Object Detection from View Aggregation (AVOD) [Notes] IROS 2018 (sensor fusion, multiview proposal)
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications [Notes]
- Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving [Notes] CVPR 2019
- VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection CVPR 2018 (Apple, first end-to-end point cloud encoding to grid)
- SECOND: Sparsely Embedded Convolutional Detection Sensors 2018 (builds on VoxelNet)
- PointPillars: Fast Encoders for Object Detection from Point Clouds [Notes] CVPR 2019 (builds on SECOND)
- Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite [Notes] CVPR 2012
- Vision meets Robotics: The KITTI Dataset [Notes] IJRR 2013
- Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset (I3D) [Notes] Video CVPR 2017
- Initialization Strategies of Spatio-Temporal Convolutional Neural Networks [Notes] Video
- Detect-and-Track: Efficient Pose Estimation in Videos [Notes] ICCV 2017 Video
- Deep Learning Based Rib Centerline Extraction and Labeling [Notes] MI MICCAI 2018
- SlowFast Networks for Video Recognition [Notes] ICCV 2019 Oral
- Aggregated Residual Transformations for Deep Neural Networks (ResNeXt) [Notes] CVPR 2017
- Beyond the pixel plane: sensing and learning in 3D (blog, Chinese version)
- VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition (VoxNet) [Notes]
- PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation CVPR 2017 [Notes]
- PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space NIPS 2017 [Notes]
- Review of Geometric Deep Learning (Frontiers of Geometric Deep Learning, from Zhihu) (up to CVPR 2018)
- Human-level control through deep reinforcement learning (Nature DQN paper) [Notes] DRL
- Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection [Notes] MI
- Panoptic Segmentation [Notes] PanSeg
- Panoptic Feature Pyramid Networks [Notes] PanSeg
- Attention-guided Unified Network for Panoptic Segmentation [Notes] PanSeg
- Bag of Tricks for Image Classification with Convolutional Neural Networks [Notes] CLS
- Deep Reinforcement Learning for Vessel Centerline Tracing in Multi-modality 3D Volumes [Notes] DRL MI
- Deep Reinforcement Learning for Flappy Bird [Notes] DRL
- Long-Term Feature Banks for Detailed Video Understanding [Notes] Video
- Non-local Neural Networks [Notes] Video CVPR 2018
- Mask R-CNN
- Cascade R-CNN: Delving into High Quality Object Detection
- Focal Loss for Dense Object Detection (RetinaNet) [Notes]
- Squeeze-and-Excitation Networks (SENet)
- Progressive Growing of GANs for Improved Quality, Stability, and Variation
- Deformable Convolutional Networks (builds on R-FCN)
- Learning Region Features for Object Detection
- Learning notes on Deep Learning
- List of Papers on Machine Learning
- Notes of Literature Review on CNN in CV. These are the notes for all the papers in the recommended list here
- Notes of Literature Review (Others)
- Notes on how to set up DL/ML environment
- Useful setup notes
Here is the list of papers waiting to be read.
- SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
- ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness ICML 2019
- Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet (BagNet) blog ICML 2019
- A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay
- Understanding deep learning requires rethinking generalization
- Mask Scoring R-CNN CVPR 2019
- Training Region-based Object Detectors with Online Hard Example Mining
- Gliding vertex on the horizontal bounding box for multi-oriented object detection
- ONCE: Incremental Few-Shot Object Detection CVPR 2020
- Domain Adaptive Faster R-CNN for Object Detection in the Wild CVPR 2018
- Foggy Cityscapes: Semantic Foggy Scene Understanding with Synthetic Data IJCV 2018
- Foggy Cityscapes ECCV: Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding ECCV 2018
- Dropout Sampling for Robust Object Detection in Open-Set Conditions ICRA 2018 (Niko Sünderhauf)
- Hybrid Task Cascade for Instance Segmentation CVPR 2019 (cascaded mask RCNN)
- Evaluating Merging Strategies for Sampling-based Uncertainty Techniques in Object Detection ICRA 2019 (Niko Sünderhauf)
- A Unified Panoptic Segmentation Network CVPR 2019 PanSeg
- Model Vulnerability to Distributional Shifts over Image Transformation Sets (CVPR workshop) tl;dr
- Automatic adaptation of object detectors to new domains using self-training CVPR 2019 (find corner case and boost)
- Missing Labels in Object Detection CVPR 2019
- DenseBox: Unifying Landmark Localization with End to End Object Detection
- Circular Object Detection in Polar Coordinates for 2D LIDAR Data CCPR 2016
- Learning Spatiotemporal Features with 3D Convolutional Networks (C3D) Video ICCV 2015
- AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
- Spatiotemporal Residual Networks for Video Action Recognition (decouple spatiotemporal) NIPS 2016
- Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks (P3D, decouple spatiotemporal) ICCV 2017
- A Closer Look at Spatiotemporal Convolutions for Action Recognition (decouple spatiotemporal) CVPR 2018
- Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification (decouple spatiotemporal) ECCV 2018
- Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? CVPR 2018
- Efficient Deep Learning Inference based on Model Compression (Model Compression)
- Concurrent Spatial and Channel Squeeze & Excitation in Fully Convolutional Networks
- CBAM: Convolutional Block Attention Module
- Playing Atari with Deep Reinforcement Learning NIPS 2013
- Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scan
- An Artificial Agent for Robust Image Registration
- 3D-CNN: 3D Convolutional Neural Networks for Landing Zone Detection from LiDAR
- Generative and Discriminative Voxel Modeling with Convolutional Neural Networks
- Orientation-boosted Voxel Nets for 3D Object Recognition (ORION) BMVC 2017
- GIFT: A Real-time and Scalable 3D Shape Search Engine CVPR 2016
- 3D Shape Segmentation with Projective Convolutional Networks (ShapePFCN) CVPR 2017
- Learning Local Shape Descriptors from Part Correspondences With Multi-view Convolutional Networks
- Open3D: A Modern Library for 3D Data Processing
- Multimodal Deep Learning for Robust RGB-D Object Recognition IROS 2015
- FlowNet3D: Learning Scene Flow in 3D Point Clouds CVPR 2019
- Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling CVPR 2018 (Neighbors Do Help: Deeply Exploiting Local Structures of Point Clouds)
- PU-Net: Point Cloud Upsampling Network CVPR 2018
- Recurrent Slice Networks for 3D Segmentation of Point Clouds CVPR 2018
- SPLATNet: Sparse Lattice Networks for Point Cloud Processing CVPR 2018
- Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering NIPS 2016
- Semi-Supervised Classification with Graph Convolutional Networks ICLR 2017
- Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks NIPS 2017
- Graph Attention Networks ICLR 2018
- 3D-SSD: Learning Hierarchical Features from RGB-D Images for Amodal 3D Object Detection (3D SSD)
- Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models ICCV 2017
- Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis CVPR 2017
- IPOD: Intensive Point-based Object Detector for Point Cloud
- Amodal Detection of 3D Objects: Inferring 3D Bounding Boxes from 2D Ones in RGB-Depth Images CVPR 2017
- 2D-Driven 3D Object Detection in RGB-D Images
- PSMNet: Pyramid Stereo Matching Network CVPR 2018
- Stereo R-CNN based 3D Object Detection for Autonomous Driving CVPR 2019
- Deep Rigid Instance Scene Flow CVPR 2019
- The DriveU Traffic Light Dataset: Introduction and Comparison with Existing Datasets ICRA 2018
- The Oxford Radar RobotCar Dataset: A Radar Extension to the Oxford RobotCar Dataset
- Vision for Looking at Traffic Lights: Issues, Survey, and Perspectives (traffic light survey, UCSD LISA)
- Review of Graph Spectrum Theory (WIP)
- 3D Deep Learning Tutorial at CVPR 2017 [Notes] - (WIP)
- A Survey on Neural Architecture Search
- Network pruning tutorial (blog)
- GNN tutorial at CVPR 2019
- nuScenes: A multimodal dataset for autonomous driving (dataset, point cloud, radar)
- Sparse and Dense Data with CNNs: Depth Completion and Semantic Segmentation 3DV 2018
- Depth Map Prediction from a Single Image using a Multi-Scale Deep Network NIPS 2014 (Eigen et al)
- Learning Depth from Monocular Videos using Direct Methods CVPR 2018 (monocular depth estimation)
- Virtual-Normal: Enforcing geometric constraints of virtual normal for depth prediction [Notes] ICCV 2019 (better generation of PL)
- Self-supervised Learning with Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera ICCV 2019
- Spatial Correspondence with Generative Adversarial Network: Learning Depth from Monocular Videos ICCV 2019
- Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM ICCV 2019
- Visualization of Convolutional Neural Networks for Monocular Depth Estimation ICCV 2019
- Self-Supervised Monocular Depth Hints ICCV 2019
- PIXOR: Real-time 3D Object Detection from Point Clouds CVPR 2018 (birds eye view)
- PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation (pointnet alternative, backbone)
- Vehicle Detection from 3D Lidar Using Fully Convolutional Network (VeloFCN) RSS 2016
- KPConv: Flexible and Deformable Convolution for Point Clouds (from the authors of PointNet)
- PointCNN: Convolution On X-Transformed Points NIPS 2018
- L3-Net: Towards Learning based LiDAR Localization for Autonomous Driving CVPR 2019
- RoarNet: A Robust 3D Object Detection based on RegiOn Approximation Refinement (sensor fusion, 3D mono proposal, refined in point cloud)
- DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map CVPR 2018
- Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection IROS 2019
- PointRNN: Point Recurrent Neural Network for Moving Point Cloud Processing
- Gated2Depth: Real-time Dense Lidar from Gated Images ICCV 2019 oral
- A Multi-Sensor Fusion System for Moving Object Detection and Tracking in Urban Driving Environments ICRA 2014
- PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation CVPR 2018 (sensor fusion)
- Deep Hough Voting for 3D Object Detection in Point Clouds (from Charles Qi)
- GeoNet: Deep Geodesic Networks for Point Cloud Analysis CVPR 2019 (oral, Megvii)
- StixelNet: A Deep Convolutional Network for Obstacle Detection and Road Segmentation
- Long-Term On-Board Prediction of People in Traffic Scenes under Uncertainty CVPR 2018 [on-board bbox prediction]
- Unsupervised Traffic Accident Detection in First-Person Videos IROS 2019 (Honda)
- NEMO: Future Object Localization Using Noisy Ego Priors (Honda)
- Robust Aleatoric Modeling for Future Vehicle Localization (perspective)
- Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments WACV 2020 (perspective bbox, pedestrian)
- Using panoramic videos for multi-person localization and tracking in a 3D panoramic coordinate
- End-to-end Lane Detection through Differentiable Least-Squares Fitting ICCV 2019
- Detecting Lane and Road Markings at A Distance with Perspective Transformer Layers (3D LLD)
- Ultra Fast Structure-aware Deep Lane Detection [lane detection]
- A Novel Approach for Detecting Road Based on Two-Stream Fusion Fully Convolutional Network (convert camera to BEV)
- FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network
- RetinaTrack: Online Single Stage Joint Detection and Tracking CVPR 2020
- Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art (latest update in Dec 2019)
- Simultaneous Identification and Tracking of Multiple People Using Video and IMUs CVPR 2019
- Detect-and-Track: Efficient Pose Estimation in Videos
- TrackNet: Simultaneous Object Detection and Tracking and Its Application in Traffic Video Analysis
- Video Action Transformer Network CVPR 2019 oral
- Online Real-time Multiple Spatiotemporal Action Localisation and Prediction ICCV 2017
- Multi-object tracking: a collection of recent papers and open-source code
- TSM: Temporal Shift Module for Efficient Video Understanding ICCV 2019 (Song Han)
- AGSS-VOS: Attention Guided Single-Shot Video Object Segmentation ICCV 2019
- One-Shot Video Object Segmentation CVPR 2017
- SpeedNet: Learning the Speediness in Videos CVPR 2020 oral
- Looking Fast and Slow: Memory-Guided Mobile Video Object Detection CVPR 2018
- Towards High Performance Video Object Detection [Notes] CVPR 2018
- Towards High Performance Video Object Detection for Mobiles [Notes]
- Exploiting temporal consistency for real-time video depth estimation ICCV 2019
- PifPaf: Composite Fields for Human Pose Estimation CVPR 2019
- Probabilistic Face Embeddings ICCV 2019
- Data Uncertainty Learning in Face Recognition CVPR 2020
- Revisiting Small Batch Training for Deep Neural Networks
- ICML2019 workshop: Adaptive and Multitask Learning: Algorithms & Systems ICML 2019
- Adaptive Scheduling for Multi-Task Learning NIPS 2018 (NMT)
- Polar Transformer Networks ICLR 2018
- Measuring Calibration in Deep Learning CVPR 2019
- Sampling-free Epistemic Uncertainty Estimation Using Approximated Variance Propagation ICCV 2019 (epistemic uncertainty)
- Making Convolutional Networks Shift-Invariant Again ICML 2019
- ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks ICCV 2019
- Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty NeurIPS 2019
- Understanding deep learning requires rethinking generalization ICLR 2017 [ICLR best paper]
- A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks ICLR 2017 (NLL score as anomaly score)
- Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination CVPR 2018 spotlight (Stella Yu)
- Theoretical insights into the optimization landscape of over-parameterized shallow neural networks TIP 2018
- The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning ICML 2018
- Designing Network Design Spaces CVPR 2020
- Moco2: Improved Baselines with Momentum Contrastive Learning
- SGD on Neural Networks Learns Functions of Increasing Complexity NIPS 2019 (SGD learns a linear classifier first)
- Pay attention to the activations: a modular attention mechanism for fine-grained image recognition
- A Mixed Classification-Regression Framework for 3D Pose Estimation from 2D Images BMVC 2018 (multi-bin, what's new?)
- In-Place Activated BatchNorm for Memory-Optimized Training of DNNs CVPR 2018 (optimized BatchNorm + ReLU)
- FCNN: Fourier Convolutional Neural Networks (FFT as CNN)
- Visualizing the Loss Landscape of Neural Nets NIPS 2018
- Xception: Deep Learning with Depthwise Separable Convolutions (Xception)
- Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics (uncertainty)
- Learning to Drive from Simulation without Real World Labels ICRA 2019 (domain adaptation, sim2real)
- 3DOP: 3D Object Proposals for Accurate Object Class Detection NIPS 2015
- DirectShape: Photometric Alignment of Shape Priors for Visual Vehicle Pose and Shape Estimation
- Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery ECCV 2018 (Monocular 3D object detection and depth estimation)
- Towards Scene Understanding: Unsupervised Monocular Depth Estimation with Semantic-aware Representation CVPR 2019
- DDP: Dense Depth Posterior from Single Image and Sparse Range CVPR 2019
- Augmented Reality Meets Computer Vision : Efficient Data Generation for Urban Driving Scenes IJCV 2018 (data augmentation with AR, Toyota)
- Exploring the Capabilities and Limits of 3D Monocular Object Detection -- A Study on Simulation and Real World Data IITS
- GCNet: End-to-End Learning of Geometry and Context for Deep Stereo Regression ICCV 2017 (disparity estimation)
- PSMNet: Pyramid Stereo Matching Network CVPR 2018 (disparity estimation)
- Towards Scene Understanding with Detailed 3D Object Representations IJCV 2014 (keypoint, 3D bbox annotation)
- Deep Cuboid Detection: Beyond 2D Bounding Boxes (Magic Leap)
- Viewpoints and Keypoints (Malik)
- Lifting Object Detection Datasets into 3D (PASCAL)
- 3D Object Class Detection in the Wild (keypoint based)
- Fast Single Shot Detection and Pose Estimation 3DV 2016 (SSD + pose, Wei Liu)
- Virtual KITTI 2
- Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing CVPR 2017
- Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views ICCV 2015 Oral
- End-to-End Learning of Geometry and Context for Deep Stereo Regression ICCV 2017
- Real-Time Seamless Single Shot 6D Object Pose Prediction CVPR 2018
- Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching NIPS 2018 (disparity estimation)
- Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera ICRA 2019
- Learning Depth with Convolutional Spatial Propagation Network (Baidu, depth from SPN) ECCV 2018
- Classification of Objects in Polarimetric Radar Images Using CNNs at 77 GHz (Radar, polar) <-- todo
- CNNs for Interference Mitigation and Denoising in Automotive Radar Using Real-World Data NeurIPS 2019 (radar)
- Road Scene Understanding by Occupancy Grid Learning from Sparse Radar Clusters using Semantic Segmentation ICCV 2019 (radar)
- PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization [Notes] ICCV 2015
- PoseNet2: Modelling Uncertainty in Deep Learning for Camera Relocalization ICRA 2016
- PoseNet3: Geometric Loss Functions for Camera Pose Regression with Deep Learning CVPR 2017
- EssNet: Convolutional neural network architecture for geometric matching CVPR 2017
- NC-EssNet: Neighbourhood Consensus Networks NeurIPS 2018
- Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task CVPR 2020 oral [Eric Brachmann, ngransac]
- Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints CVPR 2018
- Uncertainty Guided Multi-Scale Residual Learning-using a Cycle Spinning CNN for Single Image De-Raining CVPR 2019
- Learn to Combine Modalities in Multimodal Deep Learning (sensor fusion, general DL)
- Safe Trajectory Generation For Complex Urban Environments Using Spatio-temporal Semantic Corridor LRA 2019 [Motion planning]
- DAgger: Driving Policy Transfer via Modularity and Abstraction CoRL 2018 [DAgger, Imitation Learning]
- Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization IROS 2019 oral [Uber ATG, metadata, localization]
- Efficient Uncertainty-aware Decision-making for Automated Driving Using Guided Branching ICRA 2020 [Motion planning]
- Baidu Apollo EM Motion Planner
- Calibration of Heterogeneous Sensor Systems
- Intro: Sensor Fusion for ADAS (Data Fusion in Autonomous Driving, from Zhihu) (up to CVPR 2018)
- YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving CVPR 2019 (Real Time, Low Power)
- Deep Fusion of Heterogeneous Sensor Modalities for the Advancements of ADAS to Autonomous Vehicles
- Collection of articles on ad recommendation systems
- UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction [Notes] (dimension reduction, better than t-SNE)
- Review Notes of Classical Key Points and Descriptors
- CRF
- Visual SLAM and Visual Odometry
- ORB SLAM
- Bundle Adjustment
- 3D vision
- SLAM/VIO learning summary