Package: piecemaker 1.0.2.9000

piecemaker: Tools for Preparing Text for Tokenizers

Tokenizers break text into pieces that are more usable by machine learning models. Many tokenizers share some preparation steps. This package provides those shared steps, along with a simple tokenizer.

Authors: Jon Harmon [aut, cre], Jonathan Bratt [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]

piecemaker_1.0.2.9000.tar.gz
piecemaker_1.0.2.9000.zip (r-4.7)
piecemaker_1.0.2.9000.zip (r-4.6)
piecemaker_1.0.2.9000.zip (r-4.5)
piecemaker_1.0.2.9000.tgz (r-4.6-any)
piecemaker_1.0.2.9000.tgz (r-4.5-any)
piecemaker_1.0.2.9000.tar.gz (r-4.7-any)
piecemaker_1.0.2.9000.tar.gz (r-4.6-any)
piecemaker_1.0.2.9000.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
piecemaker/json (API)
NEWS

# Install 'piecemaker' in R:
install.packages('piecemaker', repos = c('https://jonthegeek.r-universe.dev', 'https://cloud.r-project.org'))
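
Once installed, the shared preparation steps and the simple tokenizer described above might be used roughly as follows. This is a minimal sketch, not taken from the package documentation: the function names come from the export list on this page, but calling each one with a single character vector and relying on default settings is an assumption, and the sample text is made up for illustration.

```r
library(piecemaker)

# Illustrative input (hypothetical sample text).
text <- c(
  "Tokenizers  break text\tinto pieces.",
  "Some text has punctuation, like this!"
)

# Apply the shared preparation steps only.
# Assumption: defaults are acceptable and the first argument is the text.
prepared <- prepare_text(text)

# Split already-prepared text into pieces with the simple tokenizer.
tokens <- tokenize_space(prepared)

# Or prepare and tokenize in a single call.
tokens_one_step <- prepare_and_tokenize(text)
```

Check the package manual for the exact arguments each function accepts before relying on this pattern.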

Bug tracker: https://github.com/macmillancontentscience/piecemaker/issues

Pkgdown/docs site: https://macmillancontentscience.github.io

3.48 score, 2 packages, 7 scripts, 278 downloads, 10 exports, 8 dependencies

Last updated from: b02c1a7492. Checks: 9 OK. Indexed: no.

Target                Result  Time
linux-devel-x86_64    OK      124
source / vignettes    OK      173
linux-release-x86_64  OK      110
macos-release-arm64   OK      96
macos-oldrel-arm64    OK      94
windows-devel         OK      84
windows-release       OK      73
windows-oldrel        OK      55
wasm-release          OK      114

Exports: prepare_and_tokenize, prepare_text, remove_control_characters, remove_diacritics, remove_replacement_characters, space_cjk, space_punctuation, squish_whitespace, tokenize_space, validate_utf8

Dependencies: cli, glue, lifecycle, magrittr, rlang, stringi, stringr, vctrs