Releases: facebookresearch/fairseq2
v0.2.0
See changelog.
What's Changed
- Bump to v0.2.0+devel by @cbalioglu in #23
- Fix CI bugs by @cbalioglu in #24
- Nit markdown improvements by @cbalioglu in #25
- LLaMA implementation by @light1726 in #19
- Refactor SwiGLU FFN to GLU FFN by @cbalioglu in #26
- Refactor upgrade_fairseq_checkpoint by @cbalioglu in #27
- Add standalone option to install instructions by @cbalioglu in #30
- Add LLaMA-2 implementation by @chaoweihuang in #28
- Nit improvements to LLaMA and GQA by @cbalioglu in #32
- Update change log and README by @cbalioglu in #34
- Revise error messages and reprs by @cbalioglu in #33
- Enforce kw-only parameters by @cbalioglu in #35
- Revise keyword-only parameters by @cbalioglu in #37
- Update S2T Conformer arch name by @cbalioglu in #38
- Improve layer output hook API by @cbalioglu in #39
- Accept asset card in LLaMA tokenizer loader by @cbalioglu in #40
- Introduce restrict_checkpoints parameter in ModelLoader by @cbalioglu in #41
- Fix module names in w2vbert loader by @chaoweihuang in #44
- Fix state_bag parameter in MHA pos encoder by @cbalioglu in #46
- Introduce to_bool_padding_mask by @cbalioglu in #45
- Introduce pad_sequence by @cbalioglu in #47
- Rename upgrade_checkpoint to convert_checkpoint by @cbalioglu in #49
- Fix apply_padding_mask() for the corner case where seqs.ndim == 2. by @kauterry in #52
- Update error messaging for "multiple of"s by @cbalioglu in #53
- Expose DynamicLossScaler by @cbalioglu in #54
- Complete the implementation of LLaMA tokenizer by @cbalioglu in #57
- Fix causal mask flag by @cbalioglu in #59
- Allow multiple batch dimensions in SDPA by @cbalioglu in #58
- Cache rotary encoder in LLaMA builder by @cbalioglu in #61
- Allow multiple batch dimensions in position encoders by @cbalioglu in #60
- Improve performance of MHA by @cbalioglu in #62
- Introduce fast repeat_interleave by @cbalioglu in #63
- Compute positional encoding always in fp32 by @cbalioglu in #65
- Revise Rotary encoder implementation by @cbalioglu in #66
- Dataloader Doc by @gwenzek in #64
- Improve lazy model init by @cbalioglu in #68
- Improve RelativePositionalEncoding by @cbalioglu in #69
- Export LayerNorm by @cbalioglu in #70
- Apply Bessel correction in fbank conversion by @cbalioglu in #71
- Fix view op in relative attention by @cbalioglu in #72
- Improve prefix handling in sequence generator by @cbalioglu in #73
- Handle key padding mask reordering in beam search by @cbalioglu in #74
- Add sample method to data_pipeline by @najielhachem in #20
- Skip fairseq2n tests in devel mode by @cbalioglu in #77
- Add a new "out" parameter to ModelLoader by @cbalioglu in #78
- Introduce set_default_sdpa by @cbalioglu in #79
- Require max_num_steps in IncrementalStateBag by @cbalioglu in #81
- Refactor Embedding by @cbalioglu in #82
- Introduce stop_at_shortest in sample and round_robin by @najielhachem in #76
- Refactor MultiheadAttentionState by @cbalioglu in #89
- Add normalization support for layer_norm within ConformerConvolution. by @kauterry in #92
- Clean incremental decoding implementation in MHA by @cbalioglu in #93
- Adding self_attn_mask to the Transformer Encoder Layer API. by @kauterry in #94
- Introduce causal_depthwise_conv in the Conformer convolution module. by @kauterry in #95
- Introduce ShawRelativePositionSDPA. by @kauterry in #90
- Add .collate for .map(Collater) by @gwenzek in #67
- Introduce build_conformer_conv() and build_sdpa() in wav2vec2 builder. by @kauterry in #97
- Use init_fn instead of skip_init by @cbalioglu in #98
- Improve extra_repr by @cbalioglu in #99
- Nit updates to Conformer by @cbalioglu in #100
- Nit updates to ShawRelativePositionSDPA by @cbalioglu in #101
- Rename layer_norm_fn to layer_norm_factory by @cbalioglu in #102
- Improve padding and attention mask handling by @cbalioglu in #104
- Fixing bug in create_default_sdpa() in wav2vec2 builder. by @kauterry in #105
- Make padding_mask Optional in SequenceBatch. by @kauterry in #106
- Introduce LocalAttentionState by @cbalioglu in #108
- Concatenate method for DataPipeline class by @am831 in #84
- Improvements to attention mask handling by @cbalioglu in #111
- Introduce Mistral 7B by @cbalioglu in #112
- Update AWS ARN by @cbalioglu in #113
- Remove check_model_dim by @cbalioglu in #114
- Introduce LoRA layers and wrappers by @chaoweihuang in #88
- Rename pad_idx to pad_value in pad_seqs and Collater by @cbalioglu in #116
- Remove norm eps from LLaMA by @cbalioglu in #117
- Accept only non-batched tensor in TextTokenDecoder by @cbalioglu in #118
- Introduce create_raw_encoder by @cbalioglu in #119
- Convert layer_norm_hook to a PyTorch hook by @cbalioglu in #121
- In Collater, handle empty bucket gracefully by @cbalioglu in #122
- Versioning by @cbalioglu in #124
- Minor reword of first note in Install From Source by @cbalioglu in #126
- Rename bytes to bytes in CString by @cbalioglu in #125
- Revise Dockerfiles by @cbalioglu in #127
- Introduce support for PyTorch 2.1 by @cbalioglu in #130
- Fix GitHub publish workflow by @cbalioglu in #131
- Fixes PyTorch 2.1 compatibility issues by @cbalioglu in #132
- Fix PyTorch 2.0.1 wheel build by @cbalioglu in #133
- Fix CacheControl in S3 publish by @cbalioglu in #134
- Revise LogitsProcessor by @cbalioglu in #135
- Few nit docstr and cosmetic updates by @cbalioglu in #136
- Move test_incremental_decode to generation by @cbalioglu in #137
- Use future.annotations by @cbalioglu in https://github.com/face...
v0.1.1
See changelog.
Commit History
- Make tbb optional by @cbalioglu in #4
- Call non-typed Storage.data instead of unsafe_data by @cbalioglu in #5
- Remove tbb link when not used by @vanclevstik in #8
- Improve CI pipeline by @cbalioglu in #11
- Support for integer sample_rate by @cbalioglu in #12
- Support for Homebrew on ARM64 by @cbalioglu in #13
- Add legal links to the doc by @cbalioglu in #14
- Nit CI improvements by @cbalioglu in #17
- Improve doc build by @cbalioglu in #18
- Update markdown docs by @cbalioglu in #21
- Bump to v0.1.1 by @cbalioglu in #22