v0.2.6 - Transformers Update - AutoModel - WKPooling

@nreimers nreimers released this 16 Apr 14:12
· 1461 commits to master since this release

This release updates huggingface/transformers to v2.8.0.

New Features

  • models.Transformer: The Transformer model can now load any huggingface transformers model, such as BERT, RoBERTa, XLNet, XLM-R, or ELECTRA. It is based on the AutoModel class from HuggingFace. You no longer need the architecture-specific models (like models.BERT, models.RoBERTa). It also works with the community models.
  • Multilingual Training: Code is released for making monolingual sentence embedding models multilingual. See training_multilingual.py for an example. More documentation and details will follow soon.
  • WKPooling: Adds a PyTorch implementation of SBERT-WK. Note that, because QR decomposition is implemented inefficiently in PyTorch, WKPooling can only be run on the CPU, which makes it about 40 times slower than mean pooling. WKPooling improves performance for some models but not for others.
  • WeightedLayerPooling: A new pooling layer that uses representations from all transformer layers and learns a weighted sum of them. So far, it shows no improvement over averaging only the last layer.
  • New pre-trained models released. Every available model is documented in a Google Spreadsheet for an easier overview.
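
To make the WeightedLayerPooling idea above concrete, here is a minimal pure-Python sketch of a weighted sum over layers followed by mean pooling over tokens. It is an illustration only, not the library's implementation: the function names (`weighted_layer_sum`, `mean_pool`) are hypothetical, the weights are fixed rather than learned, and normalising the weights by their sum is an assumption.

```python
from typing import List

Vector = List[float]


def weighted_layer_sum(layer_vectors: List[Vector], layer_weights: List[float]) -> Vector:
    """Combine one token's vectors from several layers into a single vector.

    Assumption: weights are normalised by their sum, so equal weights
    reduce to a plain average across layers.
    """
    if len(layer_vectors) != len(layer_weights):
        raise ValueError("need exactly one weight per layer")
    total = sum(layer_weights)
    if total == 0:
        raise ValueError("layer weights must not sum to zero")
    dim = len(layer_vectors[0])
    combined = [0.0] * dim
    for vec, weight in zip(layer_vectors, layer_weights):
        for i in range(dim):
            combined[i] += (weight / total) * vec[i]
    return combined


def mean_pool(token_vectors: List[Vector]) -> Vector:
    """Average the token vectors into one sentence vector (mean pooling)."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]


# Toy input indexed as layers[layer][token][dimension]: 2 layers, 2 tokens, 2 dims.
layers = [
    [[1.0, 0.0], [3.0, 2.0]],  # layer 0
    [[3.0, 4.0], [5.0, 6.0]],  # layer 1
]
weights = [1.0, 3.0]  # fixed here; in a trained model these would be learned

# Mix the layers per token, then mean-pool the tokens into a sentence vector.
num_tokens = len(layers[0])
per_token = [
    weighted_layer_sum([layer[t] for layer in layers], weights)
    for t in range(num_tokens)
]
sentence_vector = mean_pool(per_token)
print(sentence_vector)
```

With equal weights the layer mix reduces to a plain average over layers, which is a useful sanity check when comparing against simpler pooling baselines.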

Minor changes

  • Clean-up of the examples folder.
  • Model and tokenizer arguments can now be passed to the corresponding transformers models.
  • Previous versions had issues with RoBERTa and XLM-RoBERTa: the wrong special characters were added. This is fixed now, and the code relies on huggingface transformers to add the correct special characters to the input sentences.

Breaking changes

  • STSDataReader: The default parameter values have been changed, so that it expects the sentences in the first two columns and the score in the third column. If you want to load the STS benchmark dataset, you can use the STSBenchmarkDataReader.
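
The column layout that STSDataReader now expects by default (two sentences, then the score) can be sketched with the standard library alone. This is not the library's reader: `read_pairs`, `SentencePair`, and the parameter names `s1_col`, `s2_col`, and `score_col` are hypothetical, and the tab delimiter is an assumption.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class SentencePair(NamedTuple):
    sentence_a: str
    sentence_b: str
    score: float


def read_pairs(
    handle: TextIO,
    s1_col: int = 0,      # first sentence in the first column
    s2_col: int = 1,      # second sentence in the second column
    score_col: int = 2,   # similarity score in the third column
    delimiter: str = "\t",
) -> Iterator[SentencePair]:
    """Yield sentence pairs from a delimited file using the given column layout."""
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:  # skip blank lines
            continue
        yield SentencePair(row[s1_col], row[s2_col], float(row[score_col]))


# In-memory sample standing in for a tab-separated file.
sample = io.StringIO(
    "A man is playing a guitar.\tA person plays an instrument.\t4.2\n"
    "A dog runs.\tThe stock market fell.\t0.0\n"
)
pairs = list(read_pairs(sample))
for pair in pairs:
    print(pair.score, pair.sentence_a, "|", pair.sentence_b)
```

A file with a different layout would need the column indices passed explicitly, which is the kind of adjustment the changed defaults may require in existing code.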