Package: wordpiece
Type: Package
Title: R Implementation of Wordpiece Tokenization
Version: 2.1.3
Authors@R: c(
    person(given = "Jonathan",
           family = "Bratt",
           role = c("aut", "cre"),
           email = "jonathan.bratt@macmillan.com",
           comment = c(ORCID = "0000-0003-2859-0076")),
    person(given = "Jon",
           family = "Harmon",
           role = c("aut"),
           email = "jonthegeek@gmail.com",
           comment = c(ORCID = "0000-0003-4781-4346")),
    person(given = "Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning",
           role = c("cph"))
    )
Description: Apply 'Wordpiece' () tokenization to input text, given an
    appropriate vocabulary. The 'BERT' () tokenization conventions are
    used by default.
Encoding: UTF-8
URL: https://github.com/macmillancontentscience/wordpiece
BugReports: https://github.com/macmillancontentscience/wordpiece/issues
Depends:
    R (>= 3.3.0)
License: Apache License (>= 2)
RoxygenNote: 7.1.2
Roxygen: list(markdown = TRUE)
Imports:
    dlr (>= 1.0.0),
    fastmatch (>= 1.1),
    memoise (>= 2.0.0),
    piecemaker (>= 1.0.0),
    rlang,
    stringi (>= 1.0),
    wordpiece.data (>= 1.0.2)
Suggests:
    covr,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
Config/pak/sysreqs: cmake make libicu-dev libuv1-dev
Repository: https://jonthegeek.r-universe.dev
Date/Publication: 2022-03-03 14:09:42 UTC
RemoteUrl: https://github.com/macmillancontentscience/wordpiece
RemoteRef: HEAD
RemoteSha: 3eb92c759556e89d235202c45decb2dc859e661d
NeedsCompilation: no
Packaged: 2026-06-01 11:40:14 UTC; root
Author: Jonathan Bratt [aut, cre] (ORCID: ),
    Jon Harmon [aut] (ORCID: ),
    Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jonathan Bratt