mobydick dataset

Lemmatized Text of Moby-Dick (Chapters 1-10)

  • Maintainer: Massimo Aria
  • License: MIT + file LICENSE
  • Last published: 2025-12-12

About the dataset

  • Number of rows: 23548
  • Number of columns: 27
  • Class: data.frame

Column names and types (first 10 of 27)

  • doc_id: character
  • paragraph_id: integer
  • sentence_id: integer
  • sentence: character
  • start: integer
  • end: integer
  • term_id: integer
  • token_id: character
  • token: character
  • lemma: character
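
The column listing above can be read as a schema for one token-level row. The following is a minimal sketch, not part of the dataset or its package, of how such a schema might be written down and checked in Python. The mapping of R types to Python types (character to str, integer to int), the helper name check_row, and every value in example_row are assumptions made for illustration. The example values are not taken from the dataset.

```python
# Column schema copied from the listing above (only the 10 columns shown there).
# Assumed type mapping: R character -> Python str, R integer -> Python int.
SCHEMA = {
    "doc_id": str,
    "paragraph_id": int,
    "sentence_id": int,
    "sentence": str,
    "start": int,
    "end": int,
    "term_id": int,
    "token_id": str,
    "token": str,
    "lemma": str,
}


def check_row(row):
    """Return a list of schema problems for one row (empty list if none).

    Hypothetical helper: it only checks that each listed column is present
    and holds a value of the expected Python type.
    """
    problems = []
    for name, expected in SCHEMA.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif type(row[name]) is not expected:
            problems.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(row[name]).__name__}"
            )
    return problems


# Made-up row for illustration only; these values are NOT from the dataset.
example_row = {
    "doc_id": "doc1",
    "paragraph_id": 1,
    "sentence_id": 1,
    "sentence": "Call me Ishmael.",
    "start": 1,
    "end": 4,
    "term_id": 1,
    "token_id": "1",
    "token": "Call",
    "lemma": "call",
}

if __name__ == "__main__":
    print(check_row(example_row))
```

Note that token_id is listed as character while term_id is integer, so the sketch treats the former as a string even when it holds digits.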