NEWS
wordpiece 2.1.3 (2022-03-03)
wordpiece 2.0.0
- Refactored
wordpiece_tokenize
to accept a character vector with length > 1. This makes the package more usable within a workflow, but will break scripts that used the previous version (the output is now a list of character vectors, instead of a single character vector). (@jonthegeek)
- Added a pair of default vocabularies via the {wordpiece.data} package. (@jonthegeek)
wordpiece 1.0.0
wordpiece 0.0.6
wordpiece 0.0.5
wordpiece 0.0.4
- Added basic usage vignette.
wordpiece 0.0.3
- Enabled cache option to speed up vocabulary loading.
wordpiece 0.0.1
- Added a
NEWS.md
file to track changes to the package.