Greedy DAG Search to Estimate Markov Equivalence Class of DAG
Estimate the observational or interventional essential graph representing the Markov equivalence class of a DAG by greedily optimizing a score function in the space of DAGs. In practice, greedy search should be done in the space of equivalence classes rather than DAGs, so the functions gies or ges should be preferred over gds.
labels: Node labels; by default, they are determined from the scoring object.
targets: A list of intervention targets (cf. details). A list of vectors, each vector listing the vertices of one intervention target.
fixedGaps: Logical symmetric matrix of dimension p*p. If entry [i, j] is TRUE, the result is guaranteed to have no edge between nodes i and j.
phase: Character vector listing the phases that should be used; possible values: forward, backward, and turning (cf. details).
iterate: Logical indicating whether the phases listed in the argument phase should be iterated more than once (iterate = TRUE) or not.
turning: Setting turning = TRUE is equivalent to setting phase = c("forward", "backward") and iterate = FALSE; the use of the argument turning is deprecated.
maxDegree: Parameter used to limit the vertex degree of the estimated graph. Valid arguments:
Vector of length 0 (default): vertex degree is not limited.
Real number r, 0 < r < 1: degree of vertex v is limited to r · n_v, where n_v denotes the number of data points where v was not intervened.
Single integer: uniform bound of vertex degree for all vertices of the graph.
Integer vector of length p: vector of individual bounds for the vertex degrees.
verbose: If TRUE, detailed output is provided.
...: additional arguments for debugging purposes and fine tuning.
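The fractional form of maxDegree depends on the number of data points where a vertex was not intervened (n_v). The following base-R sketch works this out for a hypothetical data layout. The row counts, the values of p and r, and the helper vector target.index (mapping each data row to an entry of targets) are assumptions made for illustration only; how the resulting bound is rounded internally is not covered here.

```r
# Hypothetical layout: 5 observational rows, 3 rows with vertex 1 intervened,
# 2 rows with vertices 1 and 4 intervened jointly (10 rows in total).
targets <- list(integer(0), 1L, c(1L, 4L))
target.index <- rep(seq_along(targets), times = c(5, 3, 2))

p <- 4    # assumed number of vertices
r <- 0.5  # assumed fractional bound, 0 < r < 1

# n_v: number of rows in which vertex v was NOT intervened
n.v <- sapply(seq_len(p), function(v) {
  intervened <- sapply(targets[target.index], function(tg) v %in% tg)
  sum(!intervened)
})

n.v      # 5 10 10 8
r * n.v  # 2.5 5.0 5.0 4.0 (unrounded bound r * n_v per vertex)
```

Vertex 1 is intervened in 5 of the 10 rows and vertex 4 in 2 of them, so their bounds are tighter than those of the never-intervened vertices 2 and 3.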
Details
This function estimates the observational or interventional Markov equivalence class of a DAG based on a data sample with interventional data originating from various interventions and possibly observational data. The intervention targets used for data generation must be specified by the argument targets as a list of (integer) vectors listing the intervened vertices; observational data is specified by an empty set, i.e. a vector of the form integer(0). As an example, if the data contains observational samples as well as samples originating from an intervention at vertex 1 and from a joint intervention at vertices 1 and 4, the intervention targets must be specified as list(integer(0), as.integer(1), as.integer(c(1, 4))).
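As a minimal base-R sketch of the target specification described above, the list below has one entry per distinct intervention target. The companion vector target.index and the row counts are assumptions added for illustration: they show one plausible way of recording which rows of a data matrix belong to which target.

```r
# One entry per intervention target: none, {1}, and {1, 4}
targets <- list(integer(0), as.integer(1), as.integer(c(1, 4)))

# Assumed row layout: 5 observational rows, then 3 rows for target {1},
# then 2 rows for target {1, 4}
target.index <- rep(seq_along(targets), times = c(5, 3, 2))

# The observational "target" is the empty integer vector
length(targets[[1]]) == 0

# Look up the intervened vertices for a given row, e.g. row 9
targets[[target.index[9]]]
```

Row 9 falls in the last block, so the lookup returns the vertices 1 and 4.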
An interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph called an interventional essential graph. Its edges have the following interpretation:
a directed edge a→b stands for an arrow that has the same orientation in all representatives of the interventional Markov equivalence class;
an undirected edge a -- b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.
Note that when plotting the object, undirected and bidirected edges are equivalent.
Greedy DAG search (GDS) maximizes a score function (typically the BIC, passed to the function via the argument score) of a DAG in three phases, starting from the empty DAG:
Forward phase: In the forward phase, GDS adds single arrows to the DAG as long as this increases the score.
Backward phase: In the backward phase, the algorithm removes arrows from the DAG as long as this increases the score.
Turning phase: In the turning phase, the algorithm reverses arrows of the DAG as long as this increases the score.
The phases that are actually run are specified with the argument phase. If iterate = TRUE, GDS cycles through the specified phases until the score can no longer be improved. In the end, gds returns the (interventional or observational) essential graph of the last visited DAG.
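The phase and iterate arguments can be combined as in the following sketch. It assumes that the pcalg package is installed and uses its bundled example data set gmInt and the score class GaussL0penIntScore; neither is named in the text above, so treat both as assumptions of this example rather than requirements of gds.

```r
library(pcalg)

# Example Gaussian data with interventions, shipped with pcalg (assumed here)
data(gmInt)

# Penalized-likelihood (BIC-type) score for jointly observational and
# interventional Gaussian data
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

# Forward and backward phases only, each run once
fit.fb <- gds(score, phase = c("forward", "backward"), iterate = FALSE)

# All three phases, repeated until the score can no longer be improved
fit.all <- gds(score, phase = c("forward", "backward", "turning"),
               iterate = TRUE)
```

Because both runs start from the empty DAG and differ only in the moves they allow, comparing fit.fb with fit.all gives a rough impression of what the turning phase and the iteration contribute on a given data set.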
It is well known that a greedy search in the space of DAGs instead of essential graphs is more prone to getting stuck in local optima of the score function and is hence expected to yield worse estimation results than GIES (function gies) or GES (function ges) (Chickering, 2002; Hauser and Bühlmann, 2012). The function gds is therefore not of practical use, but can be used to compare causal inference algorithms to an elementary and straightforward approach.
Returns
gds returns a list with the following two components:
essgraph: An object of class EssGraph containing an estimate of the equivalence class of the underlying DAG.
repr: An object of a class derived from ParDAG containing a (random) representative of the estimated equivalence class.
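The two components can be inspected as in the sketch below. It again assumes the pcalg package with its example data set gmInt and the score class GaussL0penIntScore, which are not named in this section and serve only as an illustration.

```r
library(pcalg)

# Example data and score object (assumptions of this sketch)
data(gmInt)
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

# Default call: all phases, starting from the empty DAG
fit <- gds(score)

# The result is a plain list holding the two documented components
names(fit)

# Estimated essential graph and one DAG representative of the class
is(fit$essgraph, "EssGraph")
is(fit$repr, "ParDAG")
```

Since repr is described as a random representative, downstream code should treat it as one member of the estimated class rather than as a unique result.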
References
D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507--554.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409--2464.