
v2.2.0 - T5 Encoder & Private models

@nreimers released this 10 Feb 13:12

T5

You can now use the encoder from T5 to learn text embeddings. You can use it like any other transformer model:

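from sentence_transformers import SentenceTransformer, models
# Token-level embeddings from the T5 encoder, with inputs truncated to 256 tokens
word_embedding_model = models.Transformer('t5-base', max_seq_length=256)
# Pool the token embeddings into one fixed-size sentence embedding (mean pooling by default)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])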

See the T5-Benchmark results: the T5 encoder is not the best model for learning text embeddings. It requires a large amount of training data and many training steps. Other models perform much better, at least in the given experiment with 560k training triplets.
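
To illustrate what the pooling module in the T5 snippet above does with its default settings, here is a minimal pure-Python sketch of masked mean pooling. It is not the library's implementation; the function name `mean_pool` and the toy vectors are made up for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [total / count for total in totals]


# Toy example (hypothetical values): two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0]
```

The padding vector does not influence the result, which is why sentences of different lengths can share a batch.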

New Models

The models from the papers "Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models" and "Large Dual Encoders Are Generalizable Retrievers" have been added.

For benchmark results, see https://seb.sbert.net

Private Models

Thanks to #1406 you can now load private models from the hub:

model = SentenceTransformer("your-username/your-model", use_auth_token=True)
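
On machines without a stored login (for example a CI job), one option is to pass the token explicitly. The sketch below assumes that `use_auth_token` also accepts a token string, which is how the parameter appears to be typed, so treat that as an assumption. The helper names `resolve_auth_token` and `load_private_model` and the environment variable `HF_TOKEN` are illustrative choices, not part of the library.

```python
import os
from typing import Mapping, Union


def resolve_auth_token(env: Mapping[str, str] = os.environ,
                       var_name: str = "HF_TOKEN") -> Union[bool, str]:
    """Return the token from the environment, or True to use the stored login."""
    token = env.get(var_name, "").strip()
    return token if token else True


def load_private_model(model_id: str):
    """Load a private model; the import is deferred so the helper above has no dependencies."""
    from sentence_transformers import SentenceTransformer
    # Assumption: use_auth_token accepts either True or a token string.
    return SentenceTransformer(model_id, use_auth_token=resolve_auth_token())


# Usage (needs network access and a valid token or stored login):
# model = load_private_model("your-username/your-model")
```

Falling back to `True` keeps the behaviour of the one-line example above whenever the variable is not set.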