PGRdup R package [Documentation]

AddProbDup

Add probable duplicate sets fields to the PGR passport database

DataClean

Clean PGR passport data

DisProbDup

Get disjoint probable duplicate sets

DoubleMetaphone

'Double Metaphone' phonetic algorithm

KWCounts

Generate keyword counts

KWIC

Create a KWIC index

MergeKW

Merge keyword strings

MergeProbDup

Merge two objects of class ProbDup

ParseProbDup

Parse an object of class ProbDup to a data frame.

PGRdup-package

The PGRdup Package

print.KWIC

Prints summary of KWIC object.

print.ProbDup

Prints summary of ProbDup object.

ProbDup

Identify probable duplicates of accessions

read.genesys

Convert 'Darwin Core - Germplasm' zip archive to a flat file

ReconstructProbDup

Reconstruct an object of class ProbDup

ReviewProbDup

Retrieve probable duplicate set information from PGR passport database...

SplitProbDup

Split an object of class ProbDup

ValidatePrimKey

Validate if a data frame column conforms to primary key/ID constraints

ViewProbDup

Visualize the probable duplicate sets retrieved in a ProbDup object


Provides functions to aid the identification of probable or possible duplicates in Plant Genetic Resources (PGR) collections using 'passport databases', which hold an information record for each constituent sample. These include methods for cleaning the data, creating a searchable Key Word in Context (KWIC) index of the keywords associated with sample records, and identifying nearly identical records by fuzzy, phonetic and semantic matching of keywords.
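The workflow described above (clean, index, match) can be sketched with the functions listed on this page. This is a minimal sketch, not a recommended configuration: it assumes the PGRdup package is installed and uses the GN1000 example dataset bundled with it. The choice of fields, the short exception list and the matching options are illustrative assumptions only.

```r
library(PGRdup)

# GN1000 is an example passport dataset shipped with PGRdup
GN <- GN1000

# Fields to index (assumption: the first field, NationalID,
# serves as the primary key/identifier of each record)
GNfields <- c("NationalID", "CollNo", "DonorID", "OtherID1", "OtherID2")

# Step 1: clean the text in each field
GN[GNfields] <- lapply(GN[GNfields], function(x) DataClean(x))

# Step 2: build the searchable KWIC index of keywords
GNKWIC <- KWIC(GN, GNfields)

# Keywords to ignore during matching (illustrative list only)
exep <- c("A", "B", "BIG", "BOLD", "BUNCH", "C", "COMPANY", "CULTURE")

# Step 3: identify probable duplicates within the single index
# using fuzzy and phonetic matching (semantic matching switched off)
GNdup <- ProbDup(kwic1 = GNKWIC, method = "a", excep = exep,
                 fuzzy = TRUE, phonetic = TRUE,
                 encoding = "primary", semantic = FALSE)

# Print a summary of the retrieved duplicate sets
GNdup
```

The resulting object of class ProbDup can then be passed to functions such as DisProbDup or ReviewProbDup for further processing; see their individual help pages for the exact arguments.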

Maintainer: J. Aravind
License: GPL-2 | GPL-3
Last published: 2025-12-14

Useful links

PGRdup 0.2.4.0 package
