
Commit

Merge pull request #1225 from bghira/bugfix/flux-multinode-rank-detection

flux: use rank 0 for h100 detection since that is the most realistic setup
bghira authored Dec 18, 2024
2 parents fb58cef + 90f4ba4 commit e9c9ef0
Showing 1 changed file with 2 additions and 5 deletions.
helpers/models/flux/transformer.py: 7 changes (2 additions, 5 deletions)
```diff
@@ -210,11 +210,8 @@ def __init__(self, dim, num_attention_heads, attention_head_dim, mlp_ratio=4.0):
 
         processor = FluxAttnProcessor2_0()
         if torch.cuda.is_available():
-            rank = (
-                torch.distributed.get_rank()
-                if torch.distributed.is_initialized()
-                else 0
-            )
+            # let's assume that the box only ever has H100s.
+            rank = 0
             primary_device = torch.cuda.get_device_properties(rank)
             if primary_device.major == 9 and primary_device.minor == 0:
                 if is_flash_attn_available:
```
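
The removed logic used the global rank from `torch.distributed.get_rank()` to pick which GPU to probe. On a multi-node run the global rank can exceed the number of GPUs visible on the local machine, so passing it to `torch.cuda.get_device_properties()` points at a device that does not exist locally; pinning the probe to device 0 works as long as every GPU in the box is the same model, which is the assumption the new comment states. Below is a minimal standalone sketch of the capability check; `is_h100` is a hypothetical helper name for illustration, not code from this repository:

```python
import torch


def is_h100(local_device_index: int = 0) -> bool:
    """Sketch of the Hopper check from the diff above (hypothetical helper).

    `local_device_index` must be a *local* CUDA device index; the commit pins
    it to 0 on the assumption that all GPUs in the box are identical, so
    probing device 0 is representative. Compute capability 9.0 corresponds to
    H100-class (Hopper) GPUs.
    """
    if not torch.cuda.is_available():
        return False
    props = torch.cuda.get_device_properties(local_device_index)
    return props.major == 9 and props.minor == 0
```

A caller would then branch on `is_h100()` to decide whether to install the flash-attention processor, mirroring the `if primary_device.major == 9 and primary_device.minor == 0:` branch in the diff.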
