selectCases function

Select the cases/configurations compatible with a data generating causal structure

selectCases selects the cases/configurations that are compatible with a Boolean function, in particular (but not exclusively), a data generating causal structure, from a data frame or configTable.

selectCases1 additionally accepts thresholds on standard consistency (con) and coverage (cov); none of the other evaluation measures can be used (see cna). It then selects cases/configurations that are compatible with the data generating structure to degrees con and cov.

selectCases(cond, x = full.ct(cond), type = "auto", cutoff = 0.5,
            rm.dup.factors = FALSE, rm.const.factors = FALSE)

selectCases1(cond, x = full.ct(cond), type = "auto", con = 1, cov = 1,
             rm.dup.factors = FALSE, rm.const.factors = FALSE)

Arguments

  • cond: Character string specifying the Boolean function for which compatible cases are to be selected.
  • x: Data frame or configTable; if not specified, full.ct(cond) is used.
  • type: Character vector specifying the type of x: "auto" (automatic detection; default), "cs" (crisp-set), "mv" (multi-value), or "fs" (fuzzy-set).
  • cutoff: Cutoff value in case of "fs" data. Cases with a membership score equal to or greater than cutoff are selected.
  • rm.dup.factors: Logical; if TRUE, all but the first of a set of factors with identical value distributions are eliminated.
  • rm.const.factors: Logical; if TRUE, constant factors are eliminated.
  • con, cov: Numeric scalars between 0 and 1 to set the minimum thresholds on standard consistency and coverage.

Details

In combination with allCombs, full.ct, randomConds and makeFuzzy, selectCases is useful for simulating data, which are needed for inverse search trials benchmarking the output of the cna function.

selectCases draws those cases/configurations from a data frame or configTable x that are compatible with a data generating causal structure (or any other Boolean or set-theoretic function), which is given to selectCases as a character string cond. If the argument x is not specified, configurations are drawn from full.ct(cond). cond can be a condition of any of the three types of conditions, boolean, atomic or complex (see condition). To illustrate, if the data generating structure is "A + B <-> C", then a case featuring A=1, B=0, and C=1 is selected by selectCases, whereas a case featuring A=1, B=0, and C=0 is not (because according to the data generating structure, A=1 must be associated with C=1, which is violated in the latter case). The type of the data is automatically detected by default, but can be manually specified by setting the argument type to one of its non-default values: "cs" (crisp-set), "mv" (multi-value), and "fs" (fuzzy-set).
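
The compatibility check described above can be illustrated for the crisp-set case in base R. This is only a sketch of the underlying logic for "A + B <-> C", not the way selectCases is implemented in cna; the object names full, compatible, selected and rejected are chosen here for illustration.

```r
# Base-R illustration only; this does not call or reproduce cna internals.
# Enumerate all 2^3 configurations of the crisp-set factors A, B and C.
full <- expand.grid(A = 0:1, B = 0:1, C = 0:1)

# "A + B <-> C" holds in a configuration when (A OR B) has the same
# truth value as C.
compatible <- (full$A == 1 | full$B == 1) == (full$C == 1)

selected <- full[compatible, ]   # configurations that would be kept
rejected <- full[!compatible, ]  # configurations that violate the structure

print(selected)
```

The configuration A=1, B=0, C=1 ends up in selected, whereas A=1, B=0, C=0 ends up in rejected, matching the example in the text.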

selectCases1 allows for providing thresholds on standard consistency (con) and coverage (cov), such that some cases that are incompatible with cond are also drawn, as long as con and cov remain satisfied. No other evaluation measures can be selected from showConCovMeasures. The solution is identified by an algorithm aiming to find a subset of maximal size meeting the con and cov requirements. In contrast to selectCases, selectCases1 only accepts a condition of type atomic as its cond argument, i.e. an atomic solution formula.
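
For orientation, the con and cov thresholds can be related to the usual definitions of standard consistency and coverage, which take the minimum as the membership score of a conjunction. The following base-R sketch assumes those definitions; the helper names consistency and coverage and the membership scores x and y are hypothetical, and the code does not reproduce the subset-search algorithm used by selectCases1.

```r
# Standard consistency and coverage of "X -> Y" computed from membership
# scores, assuming min() as the membership score of the conjunction X*Y.
# Helper names and data are illustrative, not part of cna.
consistency <- function(x, y) sum(pmin(x, y)) / sum(x)
coverage    <- function(x, y) sum(pmin(x, y)) / sum(y)

# Hypothetical membership scores of five cases in X (the condition)
# and Y (the outcome).
x <- c(0.9, 0.8, 0.2, 0.7, 0.1)
y <- c(1.0, 0.6, 0.3, 0.9, 0.4)

con_xy <- consistency(x, y)
cov_xy <- coverage(x, y)

# A set of cases satisfies the thresholds only if both scores reach them.
meets_thresholds <- con_xy >= 0.8 && cov_xy >= 0.8
print(c(con = con_xy, cov = cov_xy))
```

With crisp-set data (scores 0 and 1 only), the same formulas reduce to simple proportions of cases.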

Returns

A configTable.

See Also

allCombs, full.ct, randomConds, makeFuzzy, configTable, condition, cna, d.jobsecurity, showConCovMeasures

Examples

# Generate all configurations of 5 dichotomous factors that are compatible with the causal
# chain (A*b + a*B <-> C) * (C*d + c*D <-> E).
groundTruth.1 <- "(A*b + a*B <-> C) * (C*d + c*D <-> E)"
(dat1 <- selectCases(groundTruth.1))
condition(groundTruth.1, dat1)

# Randomly draw a multi-value ground truth and generate all configurations compatible with it.
dat1 <- allCombs(c(3, 3, 4, 4, 3))
groundTruth.2 <- randomCsf(dat1, n.asf=2)
(dat2 <- selectCases(groundTruth.2, dat1))
condition(groundTruth.2, dat2)

# Generate all configurations of 5 fuzzy-set factors compatible with the causal structure
# A*b + C*D <-> E, such that con = .8 and cov = .8.
dat1 <- allCombs(c(2, 2, 2, 2, 2)) - 1
dat2 <- makeFuzzy(dat1, fuzzvalues = seq(0, 0.45, 0.01))
(dat3 <- selectCases1("A*b + C*D <-> E", con = .8, cov = .8, dat2))
condition("A*b + C*D <-> E", dat3)

# Inverse search for the data generating causal structure A*b + a*B + C*D <-> E from
# fuzzy-set data with non-perfect consistency and coverage scores.
dat1 <- allCombs(c(2, 2, 2, 2, 2)) - 1
set.seed(7)
dat2 <- makeFuzzy(dat1, fuzzvalues = 0:4/10)
dat3 <- selectCases1("A*b + a*B + C*D <-> E", con = .8, cov = .8, dat2)
cna(dat3, outcome = "E", con = .8, cov = .8)

# Draw cases satisfying specific conditions from real-life fuzzy-set data.
ct.js <- configTable(d.jobsecurity)
selectCases("S -> C", ct.js)  # Cases with higher membership scores in C than in S.
selectCases("S -> C", d.jobsecurity)  # Same.
selectCases("S <-> C", ct.js)  # Cases with identical membership scores in C and in S.
selectCases1("S -> C", con = .8, cov = .8, ct.js)  # selectCases1() makes no distinction
                                                   # between "->" and "<->".
condition("S -> C", selectCases1("S -> C", con = .8, cov = .8, ct.js))

# selectCases() not only draws cases compatible with Boolean causal models. Any Boolean
# function of factor values appearing in the data can be given as cond.
selectCases("C=1*B=3", allCombs(2:4))
selectCases("A=1 * !(C=2 + B=3)", allCombs(2:4), type = "mv")
selectCases("A=1 + (C=3 <-> B=1)*D=3", allCombs(c(3,3,3,3)), type = "mv")