Package: morphemepiece
Type: Package
Title: Morpheme Tokenization
Version: 1.2.3
Authors@R: c(
    person(given = "Jonathan",
           family = "Bratt",
           role = c("aut", "cre"),
           email = "jonathan.bratt@macmillan.com",
           comment = c(ORCID = "0000-0003-2859-0076")),
    person(given = "Jon",
           family = "Harmon",
           role = c("aut"),
           email = "jonthegeek@gmail.com",
           comment = c(ORCID = "0000-0003-4781-4346")),
    person(given = "Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning",
           role = c("cph"))
    )
Description: Tokenize text into morphemes. The morphemepiece algorithm uses a
    lookup table to determine the morpheme breakdown of words, and falls back
    on a modified wordpiece tokenization algorithm for words not found in the
    lookup table.
URL: https://github.com/macmillancontentscience/morphemepiece
BugReports: https://github.com/macmillancontentscience/morphemepiece/issues
License: Apache License (>= 2)
Encoding: UTF-8
RoxygenNote: 7.1.2
Roxygen: list(markdown = TRUE)
Imports:
    dlr (>= 1.0.0),
    fastmatch,
    magrittr,
    memoise (>= 2.0.0),
    morphemepiece.data,
    piecemaker (>= 1.0.0),
    purrr (>= 0.3.4),
    readr,
    rlang,
    stringr (>= 1.4.0)
Suggests:
    dplyr,
    fs,
    ggplot2,
    here,
    knitr,
    remotes,
    rmarkdown,
    testthat (>= 3.0.0),
    utils
VignetteBuilder: knitr
Config/testthat/edition: 3
Config/pak/sysreqs: cmake make libicu-dev libuv1-dev libx11-dev
Repository: https://jonthegeek.r-universe.dev
Date/Publication: 2022-04-15 21:02:22 UTC
RemoteUrl: https://github.com/macmillancontentscience/morphemepiece
RemoteRef: HEAD
RemoteSha: bc071b1a03226b2441c431d263982f862e4dc7fd
NeedsCompilation: no
Packaged: 2026-07-04 01:41:01 UTC; root
Author: Jonathan Bratt [aut, cre] (ORCID:
    <https://orcid.org/0000-0003-2859-0076>),
  Jon Harmon [aut] (ORCID: <https://orcid.org/0000-0003-4781-4346>),
  Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jonathan Bratt <jonathan.bratt@macmillan.com>