The Dark Secrets of BERT | Text Machine Blog #1
Comments
Fascinating, useful stuff! I have a question about the last study: were the attention heads ablated before fine-tuning, so that only the LM properties were affected, or after fine-tuning (but before testing)?
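For context, one common way to ablate (disable) individual attention heads in a BERT model is the `head_mask` argument exposed by the huggingface `transformers` models. The sketch below is a minimal illustration of that mechanism, not the authors' actual ablation code; the checkpoint name and the choice of head are placeholders.

```python
# Minimal sketch of disabling a single attention head via head_mask.
# Illustrative only: "bert-base-uncased" stands in for whichever
# fine-tuned checkpoint is being analyzed.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

model_name = "bert-base-uncased"  # placeholder; swap in a fine-tuned model
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

num_layers = model.config.num_hidden_layers   # 12 for bert-base
num_heads = model.config.num_attention_heads  # 12 for bert-base

# head_mask has shape (num_layers, num_heads): 1.0 keeps a head, 0.0 zeroes out its output.
head_mask = torch.ones(num_layers, num_heads)
head_mask[3, 7] = 0.0  # example: ablate head 7 in layer 3

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, head_mask=head_mask)
print(outputs.logits)

# An alternative is model.prune_heads({3: [7]}), which removes the head's
# parameters entirely instead of masking its output at inference time.
```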
I'm curious what the experimental results would look like on the SuperGLUE benchmark, since it is supposed to be somewhat free of the biases and artifacts that randomly initialized BERT is probably exploiting on the standard GLUE datasets. https://super.gluebenchmark.com/
@ruanchaves I agree, that would indeed be an interesting experiment to run; at the time of our submission we only had GLUE available.
Great summary here.
Are your pretrained and fine-tuned BERT models available for independent analysis?
@charlesmartin14 we haven't released the models themselves, but we consistently used the scripts provided by huggingface to train all of the models.
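For anyone wanting to reproduce that setup, the sketch below shows an equivalent fine-tuning flow with the huggingface `transformers` Trainer API on a GLUE task. The task (MRPC), hyperparameters, and output path are illustrative placeholders, not the values used in the paper.

```python
# Minimal sketch of fine-tuning BERT on a GLUE task with huggingface libraries.
# Hyperparameters below are illustrative, not the paper's settings.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

raw = load_dataset("glue", "mrpc")  # MRPC chosen only as an example task

def tokenize(batch):
    # MRPC is a sentence-pair task; other GLUE tasks use different column names.
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, padding="max_length", max_length=128)

encoded = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-mrpc",          # placeholder output path
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
print(trainer.evaluate())
```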
Wow, the greatest article of the year!
Thanks a lot for the summary of the paper.
The Dark Secrets of BERT | Text Machine Blog
BERT and its Transformer-based cousins are still ahead on all NLP leaderboards. But how much do they actually understand about language?
https://text-machine-lab.github.io/blog/2020/bert-secrets/