
distilCamemBERT

This code distills CamemBERT, a French model based on RoBERTa, into a smaller and faster model with, hopefully, equivalent performance. It uses the same approach DistilBERT used with BERT: the distilled model has half as many layers as the teacher model.

Using only the distillation function

If you want to use a distilled model as a normal Hugging Face model, you can do so with the .distill classmethod of DistilledCamembertForSequenceClassification, applied to a CamemBERT model of any size.
This is recommended if you want to train your student model like a regular model, aiming for good task metrics without regard to imitating the teacher. However, it is not recommended if you want to run the full distillation process as intended.
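A minimal sketch of what that might look like. The exact signature of .distill and the import path for DistilledCamembertForSequenceClassification are assumptions here; check the repository's source for the real ones.

```python
from transformers import CamembertForSequenceClassification, CamembertTokenizer

# Hypothetical import path; adapt to where the class lives in this repository.
from distill_camembert import DistilledCamembertForSequenceClassification

# Load a teacher of any size (a classification head is randomly initialized here,
# which is fine for illustration).
teacher = CamembertForSequenceClassification.from_pretrained("camembert-base")
tokenizer = CamembertTokenizer.from_pretrained("camembert-base")

# Build a student with half the teacher's layers (assumed .distill signature).
student = DistilledCamembertForSequenceClassification.distill(teacher)

# From here on, the student behaves like a regular Hugging Face model.
inputs = tokenizer("Un exemple en français.", return_tensors="pt")
outputs = student(**inputs)
print(outputs.logits)
```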

For the full distillation process

This is the recommended way to use this repository. To train the student as in DistilBERT, with a loss based both on the classification task itself and on imitating the teacher, follow the template in demo.ipynb.
We recommend reading that notebook to get a good understanding of how the pipeline works.
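For intuition, here is a sketch of the DistilBERT-style combined loss the paragraph above describes: a hard-label cross-entropy term plus a KL-divergence term toward the temperature-softened teacher logits. The repository's actual loss lives in the training code and notebook and may weight or combine the terms differently; the function name and defaults below are illustrative.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Hard-label term: standard cross-entropy against the task labels.
    ce_loss = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence to the temperature-softened teacher
    # distribution, rescaled by T^2 as in the distillation literature.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce_loss + (1 - alpha) * kd_loss
```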
