
Question about CUDA memory requirements to run code #7

Open
frostfox661 opened this issue Oct 30, 2024 · 3 comments


@frostfox661

When I run the file "folio-direct-llm.py" on a single 4090 with llama-7b, I often hit CUDA out-of-memory errors. After adding code to clear the CUDA cache in each case loop and monitoring memory usage, I found that CUDA memory still grows cumulatively at a specific point. What is going on? What hardware environment is required to run this project?
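
Concretely, the per-case cleanup I added looks roughly like this (the loop structure and the `run_case` helper are illustrative, not the actual script code):

```python
import gc
import torch

for case in cases:
    result = run_case(case)  # hypothetical helper that queries the model
    # Drop references and release cached allocator blocks between cases.
    del result
    gc.collect()
    torch.cuda.empty_cache()
    # Monitor memory: tensors currently allocated vs. reserved by the allocator.
    print(f"allocated={torch.cuda.memory_allocated() / 2**30:.2f} GiB, "
          f"reserved={torch.cuda.memory_reserved() / 2**30:.2f} GiB")
```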
[screenshot of GPU memory monitoring attached]

@yifanzhang-pro
Member

yifanzhang-pro commented Oct 30, 2024

The guidance library might have a caching mechanism for multiple queries over the same context; we suggest running it on an A100-80GB.
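
As a rough back-of-the-envelope estimate (approximate numbers only, not a measured profile), the fp16 weights of a 7B model alone take around 13 GiB, so a 24 GB 4090 leaves little headroom once the KV cache and any cached contexts accumulate:

```python
# Back-of-the-envelope estimate (approximate; real usage also includes
# KV cache, activations, and allocator fragmentation).
params = 7e9            # llama-7b parameter count, roughly
bytes_per_param = 2     # fp16
weights_gib = params * bytes_per_param / 2**30
print(f"weights alone: ~{weights_gib:.1f} GiB")  # ~13.0 GiB of a 24 GiB 4090
```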

@frostfox661
Author

Thank you for your reply.
What are the particular advantages of using the guidance library in this project? Can other libraries be used as alternatives?

@yifanzhang-pro
Member

You may try using alternative libraries, though the prompt may need adjustment for compatibility with different models and libraries, and the results may vary.
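
For instance, here is a minimal sketch of querying the model with plain Hugging Face transformers instead of guidance (the model ID, prompt, and generation settings are placeholders, and the project's prompts would need reformatting for plain completion):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model ID; substitute the llama-7b checkpoint you use.
model_id = "huggyllama/llama-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "..."  # a FOLIO prompt, reformatted for plain completion
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```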
