Package 'morphemepiece.data'

Title: Data for Morpheme Tokenization
Description: Provides data about morphemes, the smallest units of meaning in a language.
Authors: Jonathan Bratt [aut], Jon Harmon [aut, cre], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jon Harmon <[email protected]>
License: Apache License (>= 2)
Version: 1.2.0
Built: 2024-10-31 20:27:37 UTC
Source: https://github.com/macmillancontentscience/morphemepiece.data

Help Index


Load a Morphemepiece Lookup

Description

A morphemepiece lookup is a named character vector. The names of the vector are the words, and the values are the space-separated morpheme breakdowns of those words.

Usage

morphemepiece_lookup()

Value

A named character vector.

Examples

head(morphemepiece_lookup())
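
The structure described above can be sketched without loading the package data. The following is a minimal illustration only: toy_lookup and its entries are hypothetical stand-ins, not values returned by morphemepiece_lookup(), and the actual breakdowns may use a different morpheme notation.

```r
# Hypothetical lookup: names are words, values are space-separated
# morpheme breakdowns. These entries are invented for illustration.
toy_lookup <- c(
  unhappiness = "un happy ness",
  walked = "walk ed"
)

# Look up a word by name; unname() drops the name from the result.
breakdown <- unname(toy_lookup["unhappiness"])

# Split the space-separated breakdown into individual morphemes.
morphemes <- strsplit(breakdown, " ", fixed = TRUE)[[1]]
print(morphemes)
```

Assuming the real lookup has the documented shape, the same indexing and strsplit() steps should apply to the vector returned by morphemepiece_lookup().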

Load a Morphemepiece Vocabulary

Description

A morphemepiece vocabulary is a named integer vector with class "morphemepiece_vocabulary". The names of the vector are the morphemes (tokens), and the values are the integer identifiers of those tokens. The identifiers are 0-indexed for compatibility with Python implementations.

Usage

morphemepiece_vocab()

Value

A morphemepiece_vocabulary.

Examples

head(morphemepiece_vocab())
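
Because the identifiers are 0-indexed while R vectors are 1-indexed, converting between identifiers and positions needs an offset. The sketch below uses a plain named integer vector as a hypothetical stand-in: toy_vocab, its entries, and the "[PAD]" token are assumptions for illustration, and the "morphemepiece_vocabulary" class is omitted for simplicity.

```r
# Hypothetical vocabulary: names are morphemes, values are 0-based IDs.
# These entries are invented for illustration.
toy_vocab <- c("[PAD]" = 0L, un = 1L, happy = 2L, ness = 3L)

# Map morphemes to their 0-based identifiers by name.
ids <- unname(toy_vocab[c("un", "happy", "ness")])

# R positions start at 1, so add 1 to recover a morpheme from its ID.
# This assumes the IDs are consecutive and stored in ascending order.
recovered <- names(toy_vocab)[ids + 1L]
print(recovered)
```

If the ordering assumption does not hold for a given vocabulary, matching on values, for example with match(ids, toy_vocab), avoids relying on position.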