mobydick dataset
Lemmatized Text of Moby-Dick (Chapters 1-10)
tall package
Maintainer: Massimo Aria
License: MIT + file LICENSE
Last published: 2025-12-12
About the dataset
Number of rows: 23548
Number of columns: 27
Class: data.frame
Column names and types (first 10 of 27)
doc_id: character
paragraph_id: integer
sentence_id: integer
sentence: character
start: integer
end: integer
term_id: integer
token_id: character
token: character
lemma: character