EasyDeL Version 0.0.55
- JAX DPOTrainer bugs fixed
- StableLM models are supported with FlashAttention and RingAttention
- RingAttention is supported for training and inference at up to 512K or 1M tokens
- Chunked MLP is supported for training and inference at up to 512K or 1M tokens
- All models now support sharded key and value caching for high-context-length inference, enabled via use_sharded_kv_caching=True in the model config (see examples).
- EasyDeL successfully passed inference at a context length of 1,256,000 tokens on TPUs (Llama model tested)
- Vision Trainer is added; you might expect some bugs from it.
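
For readers unfamiliar with the chunk MLP item in the list above, here is a minimal JAX sketch of the general idea as the name suggests it: the feed-forward block is applied to the sequence one chunk at a time, so the wide intermediate activation only exists for a single chunk. This is an illustration under stated assumptions, not EasyDeL's implementation; the function names (mlp, chunked_mlp), the GELU activation, and all shapes are hypothetical choices made for the example.

```python
import jax
import jax.numpy as jnp


def mlp(x, w_up, w_down):
    # Plain two-layer feed-forward block: (tokens, d_model) -> (tokens, d_model).
    return jax.nn.gelu(x @ w_up) @ w_down


def chunked_mlp(x, w_up, w_down, chunk_size):
    # Hypothetical helper: x has shape (seq_len, d_model), and this sketch
    # requires seq_len to be divisible by chunk_size.
    seq_len, d_model = x.shape
    assert seq_len % chunk_size == 0
    chunks = x.reshape(seq_len // chunk_size, chunk_size, d_model)
    # lax.map applies the MLP to the chunks sequentially, so the
    # (chunk_size, d_ff) intermediate is built for one chunk at a time
    # instead of for the whole sequence.
    out = jax.lax.map(lambda c: mlp(c, w_up, w_down), chunks)
    return out.reshape(seq_len, d_model)


# Small illustrative shapes; real long-context runs would be far larger.
key_x, key_up, key_down = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(key_x, (1024, 64))
w_up = jax.random.normal(key_up, (64, 256)) * 0.02
w_down = jax.random.normal(key_down, (256, 64)) * 0.02

full = mlp(x, w_up, w_down)
chunked = chunked_mlp(x, w_up, w_down, chunk_size=128)
```

Because the MLP acts on each token independently, the chunked result should match the unchunked one up to floating-point error; the trade-off is lower peak memory in exchange for sequential execution over chunks.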
Full Changelog: 0.0.50...0.0.55