This folder contains the following examples for Llama 2 models:
File | Description | Model Used | GPU Minimum Requirement |
---|---|---|---|
01_load_inference | Environment setup and suggested configurations for running inference with Llama 2 models on Databricks. | Llama-2-7b-chat-hf | 1xA10-24GB |
02_mlflow_logging_inference | Save, register, and load Llama 2 models with MLflow, and create a Databricks model serving endpoint (see the MLflow sketch below). | Llama-2-7b-chat-hf | 1xA10-24GB |
02_[chat]_mlflow_logging_inference | Save, register, and load Llama 2 models with MLflow, and create a Databricks model serving endpoint for chat completion. | Llama-2-7b-chat-hf | 1xA10-24GB |
03_serve_driver_proxy | Serve Llama 2 models on the cluster driver node using Flask. | Llama-2-7b-chat-hf | 1xA10-24GB |
03_[chat]_serve_driver_proxy | Serve Llama 2 models for chat completion on the cluster driver node using Flask. | Llama-2-7b-chat-hf | 1xA10-24GB |
04_langchain | Integrate a serving endpoint or cluster driver proxy app with LangChain and query it (see the LangChain sketch below). | N/A | N/A |
04_[chat]_langchain | Integrate a serving endpoint and set up a LangChain chat model. | N/A | N/A |
05_fine_tune_deepspeed | Fine-tune Llama 2 base models leveraging DeepSpeed. | Llama-2-7b-hf | 4xA10 or 2xA100-80GB |
06_fine_tune_qlora | Fine-tune Llama 2 base models with QLoRA (see the QLoRA sketch below). | Llama-2-7b-hf | 1xA10-24GB |
07_ai_gateway | Manage an MLflow AI Gateway Route that accesses a Databricks model serving endpoint. | N/A | N/A |
08_load_from_marketplace | Load Llama 2 models from Databricks Marketplace. | Llama-2-7b-chat-hf | 1xA10-24GB |
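
For orientation, here is a minimal sketch of the MLflow logging flow that `02_mlflow_logging_inference` walks through, assuming approved access to the gated `meta-llama/Llama-2-7b-chat-hf` weights on Hugging Face. The registered model name is a hypothetical placeholder; the notebook itself adds details such as model signatures, pip requirements, and serving-endpoint creation.

```python
# A sketch of logging a Llama 2 text-generation pipeline with MLflow's
# transformers flavor and registering it in the model registry.
import mlflow
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "meta-llama/Llama-2-7b-chat-hf"  # gated model; requires approved access
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model=pipe,
        artifact_path="model",
        registered_model_name="llama-2-7b-chat",  # hypothetical registry name
    )
```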
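
Similarly, a minimal sketch of querying a Databricks model serving endpoint through LangChain, in the spirit of `04_langchain`. The endpoint name `llama2-7b-chat` is a hypothetical placeholder, and the endpoint is assumed to already exist in the workspace; exact wrapper options vary by LangChain version.

```python
# A sketch of wrapping an existing Databricks serving endpoint as a LangChain LLM.
from langchain.llms import Databricks

# "llama2-7b-chat" is a placeholder; replace it with your endpoint's name.
llm = Databricks(endpoint_name="llama2-7b-chat")

# Send a prompt through the endpoint and print the generated text.
print(llm.invoke("What is MLflow?"))
```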
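
Finally, a minimal sketch of the kind of 4-bit QLoRA setup `06_fine_tune_qlora` builds on, using `bitsandbytes` quantization and `peft` adapters. The hyperparameter values and target modules shown here are illustrative defaults, not the notebook's exact configuration.

```python
# A sketch of preparing Llama-2-7b-hf for QLoRA fine-tuning: load the base model
# in 4-bit precision and attach low-rank adapters with PEFT.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",           # gated model; requires approved access
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # illustrative rank, not the notebook's value
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections commonly targeted for Llama
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the adapter weights are trainable
```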