
[Open-vocabulary object detection] out of memory error #2

Open
hskAlena opened this issue Feb 24, 2025 · 1 comment
@hskAlena

Hello! Thanks for your great work!

I'm trying to reproduce open-vocabulary object detection on the Lerf_ovs dataset.
I've finished running bash script/lerf_uplift.sh figurines; however, I get an out-of-memory error with bash script/lerf_eval.sh figurines lerf_eval_sam.

I'm using an A100-PCIE-40GB GPU, which has less memory than the 48GB GPU you used.

Is there any way that I can reduce memory requirements while using lerf_eval_sam and graph diffusion?

The out-of-memory error message is as follows:

--------------- Evaluating with graph diffusion ---------------
Number of positive nodes per prompt at graph initialization: [124040, 601, 1338, 11127, 601, 601, 601, 124267, 123983, 601, 123993, 601, 601, 601, 641, 601, 601, 8581, 124580, 601, 601]
Generating edges from point cloud...
600000/600000 | Elapsed: 00:40s | Querying 200 euclidean neighbors for each of the 600000 Gaussians.
Traceback (most recent call last):
  File "/home/hskim/projects/ludvig/ludvig_clip.py", line 434, in <module>
    model.evaluate()
  File "/home/hskim/projects/ludvig/ludvig_clip.py", line 394, in evaluate
    self.run_diffusion()
  File "/home/hskim/projects/ludvig/ludvig_clip.py", line 98, in run_diffusion
    self.relev = self.graph_diffusion().T
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hskim/projects/ludvig/diffusion/clip.py", line 31, in __call__
    self.precompute_similarities(dinov2_features)
  File "/home/hskim/projects/ludvig/diffusion/base.py", line 75, in precompute_similarities
    features[:, None] - features[self.knn_neighbor_indices], dim=-1
    ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 17.97 GiB. GPU 0 has a total capacity of 39.39 GiB of which 16.78 GiB is free. Including non-PyTorch memory, this process has 22.59 GiB memory in use. Of the allocated memory 21.14 GiB is allocated by PyTorch, and 942.68 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Thank you!

@JulietteMarrie
Contributor

Hi,

Thank you for your interest in our work! You can reduce the number of neighbors (num_neighbors: 200) in the configuration file configs/lerf_eval_sam.yaml. Try setting it to, e.g., 180 or 160 (the higher, the better).
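For intuition on why this helps: the failing allocation is the dense (N, K, D) tensor of per-neighbor feature differences, whose size grows linearly with num_neighbors. A back-of-envelope estimate (the feature dimension D below is an assumption for illustration, not a value taken from the logs):

```python
def diff_tensor_gib(n_gaussians, num_neighbors, feat_dim, bytes_per_el=4):
    """Approximate size in GiB of the dense (N, K, D) tensor built by
    features[:, None] - features[knn_neighbor_indices] for float32 features."""
    return n_gaussians * num_neighbors * feat_dim * bytes_per_el / 2**30

# With the numbers from the log above (600,000 Gaussians, 200 neighbors),
# the reported ~18 GiB allocation is consistent with a feature dimension
# around 40 (again, an assumed value):
print(diff_tensor_gib(600_000, 200, 40))  # ~17.9 GiB
print(diff_tensor_gib(600_000, 160, 40))  # ~14.3 GiB with 160 neighbors
```

So dropping num_neighbors from 200 to 160 shrinks this allocation proportionally, at some cost in diffusion quality.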

Feel free to reach out if you encounter any other issues or have any questions!
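If lowering num_neighbors alone is not enough, another possible workaround is to compute the neighbor distances in fixed-size chunks, so the full (N, K, D) difference tensor is never materialized at once. This is a sketch, not code from the repository: the function name and chunk size are hypothetical, and it assumes the quantity computed in diffusion/base.py is a plain L2 norm over the feature dimension.

```python
import torch

def neighbor_distances_chunked(features, neighbor_indices, chunk_size=50_000):
    """Compute L2 distances between each point's features and the features
    of its K nearest neighbors, processing points in chunks to bound
    peak memory. (Hypothetical helper, not part of the LUDVIG codebase.)

    features:         (N, D) tensor
    neighbor_indices: (N, K) long tensor of neighbor indices
    returns:          (N, K) tensor of distances
    """
    out = torch.empty(neighbor_indices.shape, dtype=features.dtype,
                      device=features.device)
    for start in range(0, features.shape[0], chunk_size):
        end = start + chunk_size
        # (chunk, 1, D) - (chunk, K, D): only a chunk-sized slice of the
        # difference tensor is ever alive at one time.
        diff = features[start:end, None] - features[neighbor_indices[start:end]]
        out[start:end] = diff.norm(dim=-1)
    return out
```

Replacing the one-shot subtraction with a loop like this caps the temporary allocation at roughly chunk_size / N of the original ~18 GiB, at the cost of a few extra kernel launches.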
