
About GPU memory usage #45

Open
@zhongsanqiang

Description

Dear author,

I'm trying to run LongLM on a single A10 with 24 GB of memory. I tried 'meta-llama/Llama-2-7b-chat-hf' and it failed with a CUDA out-of-memory error (screenshot attached).
[screenshot: CUDA out-of-memory traceback]

I realize that your example.py runs on 4 RTX 3090s with 24 GB of memory each, so I wonder whether a single A10 is worth a shot or not even close.
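
For reference, here is my rough back-of-the-envelope estimate of why 24 GB looks tight (my own numbers, assuming fp16 weights and the standard Llama-2-7B dimensions, not anything from the repo):

```python
# My rough numbers (not from the repo): fp16 weights for 7B parameters plus the
# KV cache at an extended context length, compared against the 24 GB on an A10.
params = 7e9
weights_gb = params * 2 / 1e9                  # fp16 = 2 bytes per parameter -> ~14 GB

layers, heads, head_dim = 32, 32, 128          # standard Llama-2-7B dimensions
seq_len = 16_384                               # e.g. an extended context target
kv_cache_gb = 2 * layers * heads * head_dim * seq_len * 2 / 1e9   # K and V, fp16
print(f"weights ~{weights_gb:.1f} GB, KV cache at {seq_len} tokens ~{kv_cache_gb:.1f} GB")
```

By this estimate the fp16 weights alone are around 14 GB, and the KV cache at 16k tokens adds roughly another 8-9 GB before activations, which would explain the OOM.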

I also want to ask whether compressed models, for example Unsloth's models, can be used with LongLM.
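
In case it clarifies what I mean, this is roughly the loading path I have in mind for a compressed model on the A10 (a sketch only, using the standard transformers + bitsandbytes 4-bit loading API; whether LongLM's self-extend patching works on top of a model loaded this way is exactly my question):

```python
# A sketch only (not from the repo): loading Llama-2-7b-chat-hf in 4-bit via the
# standard transformers + bitsandbytes API so the weights fit in 24 GB.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Llama-2-7b-chat-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit weights: roughly 4 GB instead of ~14 GB in fp16
    bnb_4bit_compute_dtype=torch.float16,    # keep computation in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",                       # place everything on the single A10
)
# ...LongLM / self-extend patching would go here, if it supports quantized models...
```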
