Update SANA.md
bghira authored Dec 4, 2024
1 parent 1ea2858 commit 783a3a9
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion documentation/quickstart/SANA.md
@@ -18,6 +18,10 @@ Sana is a strange architecture relative to other models that are trainable by Si
- SageAttention does not work with Sana due to the shapes inside the model
- The loss value when training Sana is very high, and it might need a much lower learning rate than other models (e.g. `1e-5` or thereabouts)

Gradient checkpointing frees VRAM at the cost of slower training. The chart below shows test results from an RTX 4090 paired with a Ryzen 7 5800X3D:

![image](https://github.com/user-attachments/assets/310bf099-a077-4378-acf4-f60b4b82fdc4)

### Prerequisites

Make sure that you have Python installed; SimpleTuner does well with 3.10 or 3.11. **Python 3.12 should not be used**.
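
The version requirement above can be checked before installing anything. The following is a minimal sketch, not part of SimpleTuner itself; the helper name `is_supported_python` and the `SUPPORTED_MINORS` set are hypothetical, chosen only to mirror the versions named in the text.

```python
import sys

# Versions the quickstart reports as working well (assumption: taken
# directly from the text above; 3.12 is explicitly discouraged there).
SUPPORTED_MINORS = {(3, 10), (3, 11)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the major.minor version is 3.10 or 3.11."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


# Report on the interpreter that is running this snippet.
if not is_supported_python():
    print(
        f"Python {sys.version_info[0]}.{sys.version_info[1]} is not one of "
        "the versions recommended for SimpleTuner (3.10 or 3.11)."
    )
```

Running this with the interpreter you intend to use for training shows a warning only when the version falls outside the recommended range.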
@@ -362,4 +366,4 @@ If any image quality issues arise, please open an issue on GitHub.

### Aspect bucketing

This model has an unknown response to aspect bucketed data. Experimentation will be helpful.
This model has an unknown response to aspect bucketed data. Experimentation will be helpful.
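
For readers unfamiliar with the term, aspect bucketing groups images of similar aspect ratio so each batch shares one shape. The sketch below only illustrates the general idea; it is not SimpleTuner's implementation, and the bucket list `BUCKET_RATIOS` and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical width/height ratios used as bucket centres; a real
# trainer chooses its own set.
BUCKET_RATIOS = [0.5, 0.75, 1.0, 1.33, 2.0]


def nearest_bucket(width, height, ratios=BUCKET_RATIOS):
    """Return the bucket ratio closest to the image's width/height ratio."""
    aspect = width / height
    return min(ratios, key=lambda r: abs(r - aspect))


def bucket_images(sizes, ratios=BUCKET_RATIOS):
    """Group image names by their nearest aspect-ratio bucket.

    `sizes` maps an image name to a (width, height) tuple.
    """
    buckets = defaultdict(list)
    for name, (width, height) in sizes.items():
        buckets[nearest_bucket(width, height, ratios)].append(name)
    return dict(buckets)
```

For example, `bucket_images({"a.png": (1024, 1024), "b.png": (512, 1024)})` places the square image in the `1.0` bucket and the portrait image in the `0.5` bucket. Comparing a run on bucketed data against a run on square crops is one way to do the experimentation suggested above.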
