Merge pull request #1 from KushalBKusram/waymo_2.0/download_data
2.0 | Download Data
KushalBKusram authored Mar 4, 2024
2 parents a0bf009 + a60e60a commit 6e935cd
Showing 10 changed files with 44 additions and 57 deletions.
32 changes: 11 additions & 21 deletions README.md
@@ -1,33 +1,23 @@
# Waymo Open Dataset Toolkit

## Description
A set of functions to extract and visualize Waymo Open Dataset.

## Features
- Extract images per frame per segment with corresponding labels
- Extract images per camera with corresponding labels
- Extracted images are stored as png
- Extracted labels are in the format: object-class x y width height
- Extract LiDAR data as point clouds with camera projections
- Visualize LiDAR data as point cloud

## Screenshots
## Getting Started

### Camera Data
![Camera Data](images/camera.png)
To get started with Waymo Open Dataset, ensure you have gained access to the dataset using your Google account. Proceed only after you are able to view the dataset on the Google Cloud Console [here](https://console.cloud.google.com/storage/browser/waymo_open_dataset_v_2_0_0).

### Video
![Video](images/camera.gif)
## Install Gcloud
- Follow the instructions on this [page](https://cloud.google.com/sdk/docs/install) to install the gcloud CLI.
- Authenticate with your account via the CLI by following this [link](https://cloud.google.com/docs/authentication/provide-credentials-adc#local-dev). This should ultimately create a credentials file stored on your development machine. The script uses these credentials to download the data.
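
As a quick sanity check before downloading, a sketch like the following can confirm that an Application Default Credentials (ADC) file is present. It assumes the documented ADC lookup order: the `GOOGLE_APPLICATION_CREDENTIALS` environment variable first, then the well-known gcloud file written by `gcloud auth application-default login`. The function name `find_adc_file` is hypothetical and not part of this repository, and a custom `CLOUDSDK_CONFIG` directory is not handled.

```python
import os
from pathlib import Path
from typing import Optional


def find_adc_file() -> Optional[Path]:
    """Return the path of an Application Default Credentials file, or None.

    Hypothetical helper (not part of this repo). It only checks that the file
    exists; it does not validate the credentials themselves.
    """
    # 1. An explicit override via the environment takes precedence.
    override = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if override:
        path = Path(override)
        return path if path.is_file() else None

    # 2. Otherwise fall back to the well-known gcloud location.
    if os.name == "nt":
        base = Path(os.environ.get("APPDATA", "")) / "gcloud"
    else:
        base = Path.home() / ".config" / "gcloud"
    path = base / "application_default_credentials.json"
    return path if path.is_file() else None


if __name__ == "__main__":
    found = find_adc_file()
    if found:
        print(f"Credentials file: {found}")
    else:
        print("No ADC file found; authenticate with the gcloud CLI first.")
```

A missing file does not necessarily mean the download will fail, since `gsutil` may rely on other gcloud credentials, so treat the result as a hint rather than a verdict.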

### Point Cloud Data
![Point Cloud Data](images/lidar.gif)
## Download Data
- Once you have authenticated and the generated credentials are accessible to applications on your development machine, run the following script:
`./scripts/download_data.sh <source-blob> <destination-folder> <-m : for parallelization>`.
- For example, to download just `camera_image`, the command looks like this: `./scripts/download_data.sh waymo_open_dataset_v_2_0_0/training/camera_image /mnt/e/WaymoOpenDatasetV2/training/camera_image -m`
- The entire dataset is roughly `2.29TB`. To check whether its size has changed, query it with `gsutil du -s -ah gs://waymo_open_dataset_v_2_0_0`.
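
The same invocation can also be assembled from Python, which is convenient when looping over several dataset components. The sketch below mirrors the arguments of `./scripts/download_data.sh` and issues the same `gsutil cp -r` call; `build_gsutil_command` is a hypothetical helper, not something this repository provides. The command is built as an argument list so that paths containing spaces are passed through unchanged.

```python
import subprocess
from typing import List


def build_gsutil_command(source_blob: str, destination: str, parallel: bool = False) -> List[str]:
    """Build the gsutil call used by scripts/download_data.sh as an argument list."""
    cmd = ["gsutil"]
    if parallel:
        cmd.append("-m")  # top-level gsutil flag for parallel operations
    # Strip stray slashes so the wildcard is appended exactly once.
    cmd += ["cp", "-r", f"gs://{source_blob.strip('/')}/*", destination]
    return cmd


if __name__ == "__main__":
    cmd = build_gsutil_command(
        "waymo_open_dataset_v_2_0_0/training/camera_image",
        "/mnt/e/WaymoOpenDatasetV2/training/camera_image",
        parallel=True,
    )
    print(" ".join(cmd))
    # Uncomment to start the download (requires gsutil and dataset access):
    # subprocess.run(cmd, check=True)
```

The download call is left commented out so that the sketch can be run safely to inspect the command first.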

## Requirements
Linux, Python, Waymo Open Dataset, OpenCV, Open3D

## Usage
The repo consists of short code in [src/main.py](src/main.py) that demos data extraction, video creation, and object count consolidation, and [src/visualize.py](src/visualize.py), which visualizes camera and LiDAR data.
Make sure you have `gsutil` set up correctly on your machine before trying to retrieve data.
## Analyze Data

## License
Licensed under [GNU AGPL v3](https://github.com/KushalBKusram/WaymoDataToolkit/blob/master/LICENSE).
Binary file removed images/camera.gif
Binary file not shown.
Binary file removed images/camera.png
Binary file not shown.
Binary file removed images/lidar.gif
Binary file not shown.
2 changes: 2 additions & 0 deletions main.py
@@ -0,0 +1,2 @@
if __name__ == "__main__":
    print("Work In Progress")
File renamed without changes.
File renamed without changes.
5 changes: 1 addition & 4 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1 @@
numpy
opencv-python
tensorflow
open3d
google-cloud-storage
30 changes: 30 additions & 0 deletions scripts/download_data.sh
@@ -0,0 +1,30 @@
#!/bin/bash

# Function to display usage information
function usage {
    echo "Usage: $0 bucket_name destination_folder [-m]"
    echo "Options:"
    echo " -m Enable parallel (multi-threaded/multi-processing) operations"
    exit 1
}

# Check if the number of arguments is valid
if [ $# -lt 2 ] || [ $# -gt 3 ]; then
    usage
fi

# Parse command-line arguments
bucket_name="$1"
destination_folder="$2"
parallel_flag="$3"

# Build the gsutil command as an array so that paths containing spaces stay intact
gsutil_cmd=(gsutil)
if [ "$parallel_flag" == "-m" ]; then
    # Enable parallel operations only when the flag is provided
    gsutil_cmd+=(-m)
fi
gsutil_cmd+=(cp -r "gs://$bucket_name/*" "$destination_folder")

# Execute the gsutil command
echo "Downloading contents of bucket $bucket_name to $destination_folder..."
"${gsutil_cmd[@]}"
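
The script assumes that `gsutil` is installed and that the destination folder is usable. A pre-flight check along the following lines could run before the download starts. It is a sketch in Python rather than part of the script, and the name `preflight` and its `tool` parameter are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def preflight(destination: str, tool: str = "gsutil") -> List[str]:
    """Return a list of problems found; an empty list means ready to download."""
    problems = []
    # The download cannot start unless the CLI is on PATH.
    if shutil.which(tool) is None:
        problems.append(f"'{tool}' was not found on PATH")
    dest = Path(destination)
    if dest.exists() and not dest.is_dir():
        problems.append(f"'{destination}' exists but is not a directory")
    else:
        # Create the folder up front so the copy has a directory to write into.
        dest.mkdir(parents=True, exist_ok=True)
    return problems


if __name__ == "__main__":
    demo_dir = Path(tempfile.gettempdir()) / "waymo_preflight_demo"
    for problem in preflight(str(demo_dir)):
        print("Problem:", problem)
```

Returning a list instead of raising on the first failure lets a caller report every issue at once.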
32 changes: 0 additions & 32 deletions src/main.py

This file was deleted.
