stop slandering H100s (#992)
Messed up my SI prefixes -- peta is quadrillion, not trillion
charlesfrye authored Nov 22, 2024
1 parent 81b0b47 commit 3127725
Showing 1 changed file with 1 addition and 1 deletion.
06_gpu_and_ml/llm-serving/trtllm_llama.py (1 addition, 1 deletion)
@@ -151,7 +151,7 @@ def download_model():
 # NVIDIA's Ada Lovelace/Hopper chips, like the 4090, L40S, and H100,
 # are capable of native calculations in 8bit floating point numbers, so we choose that as our quantization format (`qformat`).
 # These GPUs are capable of twice as many floating point operations per second in 8bit as in 16bit --
-# about a trillion per second on an H100.
+# about two quadrillion per second on an H100 SXM.
 
 N_GPUS = 1 # Heads up: this example has not yet been tested with multiple GPUs
 GPU_CONFIG = modal.gpu.H100(count=N_GPUS)
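As a sanity check on the corrected figure, here is a minimal sketch of the SI-prefix arithmetic. The throughput number below is an assumption taken from NVIDIA's published H100 SXM datasheet (roughly 1,979 TFLOPS of dense FP8 compute), not something stated in this diff:

```python
# "peta" = 1e15 = quadrillion; "tera" = 1e12 = trillion.
# Assumed spec: NVIDIA's H100 SXM datasheet lists ~1,979 TFLOPS of dense FP8 compute.
H100_SXM_FP8_FLOP_PER_S = 1_979e12  # 1.979e15 FLOP/s

print(f"{H100_SXM_FP8_FLOP_PER_S / 1e15:.1f} quadrillion FLOP/s")  # -> 2.0, matching the new comment
print(f"{H100_SXM_FP8_FLOP_PER_S / 1e12:.0f} trillion FLOP/s")     # -> 1979, so "a trillion" undersold it
```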
