This crate provides a subword tokenizer. A subword tokenizer splits a token into one or more smaller units, called word pieces. Word pieces were popularized by the BERT natural language encoder, which uses them to represent its input.
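To make the idea concrete, here is a minimal sketch of greedy longest-match-first splitting, the strategy commonly used with BERT-style vocabularies. It is not this crate's API: the function name `split_word_pieces`, the `HashSet` vocabulary, and the toy vocabulary entries are all assumptions made for illustration. Only the `##` prefix for continuation pieces is taken from BERT's convention.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of this crate): split `token` into word
/// pieces by repeatedly taking the longest vocabulary entry that matches
/// at the current position. Pieces after the first are looked up with a
/// "##" prefix, following the BERT convention for continuation pieces.
/// Returns `None` when some part of the token is not covered by `vocab`.
fn split_word_pieces(token: &str, vocab: &HashSet<&str>) -> Option<Vec<String>> {
    // Work on chars so that slicing never cuts a multi-byte character.
    let chars: Vec<char> = token.chars().collect();
    let mut pieces = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let mut end = chars.len();
        let mut found = None;

        // Shrink the candidate from the right until it is in the vocabulary.
        while end > start {
            let sub: String = chars[start..end].iter().collect();
            let candidate = if start == 0 { sub } else { format!("##{}", sub) };
            if vocab.contains(candidate.as_str()) {
                found = Some(candidate);
                break;
            }
            end -= 1;
        }

        // No piece matched at this position: the token cannot be split.
        pieces.push(found?);
        start = end;
    }

    Some(pieces)
}
```

With a toy vocabulary containing `voor`, `##kom`, and `##en`, calling `split_word_pieces("voorkomen", &vocab)` yields the three pieces `voor`, `##kom`, `##en`, while a token with no matching pieces yields `None`. A real tokenizer would typically map such tokens to an unknown-token marker and use a more efficient lookup structure than repeated hash lookups.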
Split tokens into word pieces
License: Apache-2.0 (LICENSE-APACHE), MIT (LICENSE-MIT)
danieldk/wordpieces