Automatic migration of TRTLLM runtime configuration #1279

Merged 4 commits on Dec 11, 2024

@joostinyi (Collaborator) commented Dec 11, 2024

🚀 What

  • This PR smooths the rollout of a breaking change for users who previously set runtime options under the trt_llm.build key.

For example, the following config:

trt_llm:
  build:
    base_model: llama
    checkpoint_repository:
      repo: meta-llama/Llama-3.1-8B-Instruct
      source: HF
    max_seq_len: 8192
    enable_chunked_context: False
    kv_cache_free_gpu_mem_fraction: 0.6

will have its runtime fields migrated to the trt_llm.runtime key:

trt_llm:
  build:
    base_model: llama
    checkpoint_repository:
      repo: meta-llama/Llama-3.1-8B-Instruct
      source: HF
    max_seq_len: 8192
  runtime:
    enable_chunked_context: False
    kv_cache_free_gpu_mem_fraction: 0.6

with the following CLI logging:

Found extra fields ['enable_chunked_context', 'kv_cache_free_gpu_mem_fraction'] in build configuration, attempting to migrate valid fields to runtime configuration. This migration of deprecated fields is scheduled for removal, please upgrade to the latest truss version and update configs according to https://docs.baseten.co/performance/engine-builder-config.
Setting runtime.enable_chunked_context: False
Setting runtime.kv_cache_free_gpu_mem_fraction: 0.6

💻 How

  • During config initialization, we look for extra fields in the build configuration and attempt to move those that match runtime fields into the runtime configuration, logging a deprecation notice and a callout to update the truss client version.
  • We will roll out this config change to the backend with 0.9.56rc2 and follow up with a full patch release of the client package.
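
The migration step described above could be sketched roughly as follows. This is an illustration on plain dictionaries, not the truss implementation: the function name `migrate_runtime_fields` and the field sets `BUILD_FIELDS` and `RUNTIME_FIELDS` are hypothetical and list only the fields that appear in the example config. The log wording is borrowed from the CLI output above, and the handling of extra fields that are not runtime fields (left in place here) is an assumption.

```python
import copy
import logging

logger = logging.getLogger("trt_llm_config_migration")

# Assumption: illustrative subsets of the build and runtime schemas,
# limited to the fields shown in this PR's example config.
BUILD_FIELDS = {"base_model", "checkpoint_repository", "max_seq_len"}
RUNTIME_FIELDS = {"enable_chunked_context", "kv_cache_free_gpu_mem_fraction"}


def migrate_runtime_fields(trt_llm: dict) -> dict:
    """Return a copy of trt_llm with misplaced runtime fields moved out of build."""
    migrated = copy.deepcopy(trt_llm)  # leave the caller's config untouched
    build = migrated.get("build", {})

    # Anything in build that the build schema does not know is "extra".
    extra = [name for name in build if name not in BUILD_FIELDS]
    if not extra:
        return migrated

    logger.warning(
        "Found extra fields %s in build configuration, attempting to migrate "
        "valid fields to runtime configuration.",
        extra,
    )

    runtime = migrated.setdefault("runtime", {})
    for name in extra:
        if name in RUNTIME_FIELDS:
            value = build.pop(name)
            logger.warning("Setting runtime.%s: %s", name, value)
            # Assumption: the migrated value overwrites any existing runtime value.
            runtime[name] = value
        # Assumption: extra fields that are not runtime fields stay in build.
    return migrated


# Usage with the example config from this PR.
config = {
    "build": {
        "base_model": "llama",
        "checkpoint_repository": {
            "repo": "meta-llama/Llama-3.1-8B-Instruct",
            "source": "HF",
        },
        "max_seq_len": 8192,
        "enable_chunked_context": False,
        "kv_cache_free_gpu_mem_fraction": 0.6,
    }
}
result = migrate_runtime_fields(config)
```

After the call, `result["runtime"]` holds the two migrated fields and `result["build"]` keeps only the build fields, matching the migrated config shown earlier in this description.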

🔬 Testing

  • Added unit tests

@joostinyi joostinyi merged commit 90169ed into main Dec 11, 2024
4 checks passed
@joostinyi joostinyi deleted the jyi/defensive-trt-llm-config-migration branch December 11, 2024 21:07