# A Basic GFlowNet Setup

`gflownets` implements a GFlowNet trained with the trajectory balance loss on the Smiley environment.
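The trajectory balance objective pushes log Z plus a trajectory's forward log-probabilities to match the terminal log-reward plus its backward log-probabilities. A minimal sketch in plain Python (the function name and signature are illustrative, not this repo's API):

```python
import math

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    """Squared trajectory-balance residual:
    (log Z + sum log P_F(s'|s) - log R(x) - sum log P_B(s|s'))^2
    Illustrative sketch, not this repo's actual implementation."""
    delta = log_Z + sum(log_pf) - log_reward - sum(log_pb)
    return delta ** 2
```

In real training the residual is computed on tensors so gradients flow into log Z and the policy parameters; plain floats just keep the sketch self-contained.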
It includes some basic goodies that were recommended at the GFlowNet Workshop:

- Off-policy training via dithering (tempering + epsilon-greedy)
- Gradient clipping
- Monitoring:
  - Pf/Pb entropy
  - ||gradient||
  - Rewards (avg/max)
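The dithering above can be sketched as two simple tweaks to action selection: divide the forward policy's logits by a temperature > 1 to flatten the distribution (tempering), and with probability epsilon pick a uniformly random action instead (epsilon-greedy). Function and parameter names here are hypothetical, not the repo's API:

```python
import math
import random

def dithered_action(logits, temperature=2.0, epsilon=0.05):
    """Sample an action off-policy for exploration.
    Hypothetical sketch: tempering + epsilon-greedy dithering."""
    if random.random() < epsilon:
        return random.randrange(len(logits))       # uniform random action
    scaled = [l / temperature for l in logits]     # tempering flattens P_F
    m = max(scaled)                                # stabilize the softmax
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    # sample from the tempered categorical
    r, acc = random.random() * total, 0.0
    for a, p in enumerate(probs):
        acc += p
        if r <= acc:
            return a
    return len(logits) - 1
```

Because the sampled trajectories no longer follow P_F exactly, this is what makes the training off-policy; trajectory balance remains valid under such exploration.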
See torchgfn and gflownet for mature libraries.
Clone the repository:

```bash
git clone git@github.com:cgreer/gflownets.git
cd gflownets
```

Create a virtual environment:

```bash
python3 -m venv gfn
```

Activate the virtual environment:

```bash
source gfn/bin/activate
```

Install the requirements:

```bash
pip install -r requirements.txt
```

Run training:

```bash
python train_smiley.py
```
After training completes, the script runs the evaluation analysis and shows the training dashboard. If training ran correctly, smiley faces should be sampled proportional to their reward (~66% of samples are smileys) and the estimate of Z should be ~12.
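"Sampled proportional to reward" means each terminal state x is drawn with probability R(x)/Z, so with Z ≈ 12 a 66% smiley fraction corresponds to roughly 0.66 × 12 ≈ 8 units of reward mass on smiley states. A quick way to sanity-check this on a batch of samples (a hypothetical helper, not part of the repo):

```python
from collections import Counter

def empirical_vs_target(samples, reward_fn, Z):
    """Map each sampled state to (empirical frequency, R(x) / Z).
    If training converged, the two numbers should roughly match.
    Hypothetical helper for illustration only."""
    counts = Counter(samples)
    n = len(samples)
    return {x: (c / n, reward_fn(x) / Z) for x, c in counts.items()}
```

For example, a state with reward 8 under Z = 12 should show up in about 8/12 ≈ 66% of samples.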