BERT MULTI GPU

Multi-GPU training on one machine for BERT from scratch, based on BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

REQUIREMENTS

Python 3

TensorFlow 1.12.0

TRAINING

0. Edit the input and output file names in create_pretraining_data.py and run_pretraining_gpu_v2.py (see the sketch after these steps for the kind of settings involved).

1. Run create_pretraining_data.py to convert the raw text into pre-training examples.

2. Run run_pretraining_gpu_v2.py to pre-train BERT on the generated examples.
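
As a rough illustration of step 0, the snippet below shows the kind of file-name settings involved. The flag names mirror the upstream google-research/bert scripts; whether this repo's copies expose them the same way is an assumption, so check the two files directly.

```python
# Illustrative sketch only: the input/output settings step 0 asks you to edit.
# Flag names follow the upstream google-research/bert layout and are an
# assumption about this repo's scripts; the paths are placeholders.
import tensorflow as tf

flags = tf.flags

# create_pretraining_data.py: raw text in, TFRecord examples out.
flags.DEFINE_string("input_file", "./sample_text.txt",
                    "Raw text: one sentence per line, blank line between paragraphs.")
flags.DEFINE_string("output_file", "./tf_examples.tfrecord",
                    "Serialized pre-training examples read by run_pretraining_gpu_v2.py.")

# run_pretraining_gpu_v2.py would then point its own input setting at the
# TFRecord file produced above and write checkpoints to an output directory.
```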

PARAMETERS

Set n_gpus in run_pretraining_gpu_v2.py to the number of GPUs available on the machine.
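
How n_gpus is consumed is internal to run_pretraining_gpu_v2.py; the sketch below only shows one common way single-machine multi-GPU training is wired up with the TensorFlow 1.12 Estimator API (MirroredStrategy). It is an assumption-laden illustration, not code lifted from the repo.

```python
# Minimal sketch of single-machine multi-GPU training with TF 1.12's Estimator
# API. This shows the general mechanism only; the actual wiring inside
# run_pretraining_gpu_v2.py may differ. n_gpus mirrors the parameter to edit.
import tensorflow as tf

n_gpus = 4  # set to the number of GPUs on the machine

# Replicate the model on each GPU and average gradients across replicas.
strategy = tf.contrib.distribute.MirroredStrategy(num_gpus=n_gpus)

run_config = tf.estimator.RunConfig(
    model_dir="./pretraining_output",   # placeholder checkpoint directory
    train_distribute=strategy,
    save_checkpoints_steps=1000)

# model_fn and input_fn would come from the BERT pre-training code:
# estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)
# estimator.train(input_fn=input_fn, max_steps=num_train_steps)
```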

DATA

In sample_text.txt, each sentence ends with \n (one sentence per line), and paragraphs are separated by an empty line.
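
For illustration, here is a minimal sketch of parsing a file in that layout into paragraphs of sentences; create_pretraining_data.py does its own parsing, so this is only to make the format concrete.

```python
# Minimal sketch: read a corpus in the sample_text.txt layout, where each line
# holds one sentence and an empty line marks a paragraph (document) boundary.
def read_documents(path):
    documents, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:                # empty line -> paragraph boundary
                if current:
                    documents.append(current)
                    current = []
            else:                       # non-empty line -> one sentence
                current.append(line)
    if current:
        documents.append(current)
    return documents

# Example: docs = read_documents("sample_text.txt")
# docs[0] is the list of sentences in the first paragraph.
```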

EXPERIMENT RESULTS

On the Quora Question Pairs English dataset:

Official BERT: ACC 91.2, AUC 96.9

This BERT, pre-trained to loss 2.05: ACC 90.1, AUC 96.3

WHY TRAIN FROM SCRATCH

The model is trained from scratch to support research on inference speed.
