occurs function

Finding Subsequences

Counts items, or finds subsequences of (integer) sequences.

count(x, sorted = TRUE)
occurs(subseq, series)

Arguments

  • x: array of items, i.e. numbers or characters.
  • sorted: logical; default is to sort items beforehand.
  • subseq: vector of integers.
  • series: vector of integers.

Details

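count counts the items, similar to table, but is about as fast and returns a more tractable output. If sorted is TRUE, the total number of occurrences per item is counted; otherwise the items are counted per repetition, that is, per run of consecutive equal items.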
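
The difference between the two modes can be illustrated with a small base-R sketch. This is not the package's implementation: count_sketch is a hypothetical helper built on rle, and it assumes that "per repetition" means per run of consecutive equal items.

```r
# Hypothetical helper (not the package's code): mimics the described
# behaviour of count() using base R's run-length encoding.
count_sketch <- function(x, sorted = TRUE) {
  if (length(x) == 0) return(list(v = x, e = integer(0)))
  if (sorted) x <- sort(x)      # sorting merges all equal items into one run
  r <- rle(x)                   # runs of consecutive equal items
  list(v = r$values, e = r$lengths)
}

x <- c(2, 2, 1, 2, 3, 3)
count_sketch(x)                  # v: 1 2 3     e: 1 3 2
count_sketch(x, sorted = FALSE)  # v: 2 1 2 3   e: 2 1 1 2
```

With sorted = FALSE the item 2 shows up twice in v, once for each separate run.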

If m and n are the lengths of s and S, respectively, then occurs(s, S) determines all positions i such that s is elementwise equal to S[i, ..., i+m-1].
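
A minimal sketch of this definition in base R follows. The name occurs_sketch is hypothetical and the approach (filtering candidate start positions one pattern element at a time) is an assumption made for illustration; it is not necessarily how occurs is implemented.

```r
# Hypothetical helper (not the package's code): returns all start
# positions i with subseq == series[i:(i + m - 1)].
occurs_sketch <- function(subseq, series) {
  m <- length(subseq)
  n <- length(series)
  if (m == 0 || m > n) return(integer(0))
  # Every i with i + m - 1 <= n is a candidate start position
  cand <- seq_len(n - m + 1)
  # Keep only candidates whose k-th element matches the pattern
  for (k in seq_len(m)) {
    cand <- cand[series[cand + k - 1] == subseq[k]]
    if (length(cand) == 0) break
  }
  cand
}

patrn <- c(1, 2, 3, 4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
occurs_sketch(patrn, exmpl)   # 6 13 23
```

Each pass of the loop is a single vectorized comparison over the remaining candidates, so the R-level loop runs at most m times.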

The code is vectorized and relatively fast. There are plans to complement it with an implementation of the Rabin-Karp algorithm, and possibly of the Knuth-Morris-Pratt and Boyer-Moore algorithms.
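
For orientation, a Knuth-Morris-Pratt search could look roughly like the sketch below. It is not part of the package; kmp_sketch and the variable fail are hypothetical names, and the code assumes inputs without NA values.

```r
# Hypothetical illustration of Knuth-Morris-Pratt (not the package's code).
kmp_sketch <- function(subseq, series) {
  m <- length(subseq)
  n <- length(series)
  if (m == 0 || m > n) return(integer(0))
  # fail[j]: length of the longest proper prefix of subseq[1:j]
  # that is also a suffix of subseq[1:j]
  fail <- integer(m)
  k <- 0L
  for (j in seq_len(m)[-1]) {
    while (k > 0L && subseq[k + 1L] != subseq[j]) k <- fail[k]
    if (subseq[k + 1L] == subseq[j]) k <- k + 1L
    fail[j] <- k
  }
  # Scan the series once; k is the number of pattern items matched so far
  hits <- integer(0)
  k <- 0L
  for (i in seq_len(n)) {
    while (k > 0L && subseq[k + 1L] != series[i]) k <- fail[k]
    if (subseq[k + 1L] == series[i]) k <- k + 1L
    if (k == m) {
      hits <- c(hits, i - m + 1L)  # start position of the match
      k <- fail[k]                 # continue, allowing overlapping matches
    }
  }
  hits
}

kmp_sketch(c(1, 2, 3, 4), c(3, 1, 2, 3, 4, 1, 2, 3, 4))   # 2 6
```

The failure table lets the scan avoid re-reading series items, which bounds the work at O(m + n) comparisons. In R, an element-by-element loop like this may still be slower in practice than a vectorized approach.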

Returns

count returns a list with components v, the items, and e, the number of times each item appears in the array. occurs returns a vector of indices, the positions where the subsequence appears in the series.

Examples

## Examples
patrn <- c(1,2,3,4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
occurs(patrn, exmpl)
## [1]  6 13 23

## Not run:
set.seed(2437)
p <- sample(1:20, 1000000, replace=TRUE)
system.time(i <- occurs(c(1,2,3,4,5), p))  #=> [1] 799536
##  user  system elapsed
## 0.017   0.000   0.017  [sec]

system.time(c <- count(p))
##  user  system elapsed
## 0.075   0.000   0.076
print(c)
## $v
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
## $e
##  [1] 49904 50216 49913 50154 49967 50045 49747 49883 49851 49893
## [11] 50193 50024 49946 49828 50319 50279 50019 49990 49839 49990
## End(Not run)

  • Maintainer: Hans W. Borchers
  • License: GPL (>= 3)
  • Last published: 2023-10-26
