How to represent deterministic nodes in HierarchicalCausalModel class #271

adamrupe · 2025-02-07T01:30:21Z

Lines 236 to 237 in 1fed7cc

    
    # TODO what's this for? Is it used besides making diagrams?
    # hscm.set_shape(node, "square")

A key concept in hierarchical causal models is the distinction between a stochastic variable and a deterministic variable.
The nodes added during the augmentation routine, by default, are deterministic. Deterministic variables can, in some circumstances, be converted to stochastic variables via parent marginalization if the parent is a nuisance variable (is not explicitly mentioned in the causal query) that has no other children.
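The marginalization condition above can be sketched as a simple graph check. This is a minimal illustration, not the project's implementation: the function name `marginalizable_parents`, the child-adjacency dictionary, and the toy variable names are all assumptions made for the example. It only lists parents that meet the two stated conditions (nuisance, no other children); it does not decide whether marginalizing them is sufficient to make the node stochastic.

```python
from typing import Dict, Set


def marginalizable_parents(
    children: Dict[str, Set[str]], node: str, query: Set[str]
) -> Set[str]:
    """Return the parents of `node` that are candidates for marginalization.

    A parent qualifies when it is a nuisance variable (not mentioned in the
    causal query) and `node` is its only child. Hypothetical helper.
    """
    return {
        parent
        for parent, kids in children.items()
        if kids == {node} and parent not in query
    }


# Toy graph (hypothetical): A -> Q, A -> Y, U -> Q, Q -> Y
children = {"A": {"Q", "Y"}, "U": {"Q"}, "Q": {"Y"}, "Y": set()}

# U is a nuisance parent of Q with no other children; A is in the query
# and also has another child, so it does not qualify.
candidates = marginalizable_parents(children, "Q", query={"A", "Y"})
```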

Importantly, stochastic variables satisfy positivity (common support): every value in a variable's domain is observed or observable regardless of its parents' values. Deterministic variables may violate positivity, and certain causal queries on hierarchical models require it. For instance, subunit treatment variables must satisfy subunit positivity, and mediators (confounders) must satisfy positivity if they are used in a front door (back door) adjustment. Outcome variables need not satisfy positivity.

In the HCM paper v1, deterministic nodes are identified via double arrows from their parents. The authors shifted to representing deterministic nodes with square shapes in later versions / presentations. Pyro uses dashed outlines of nodes to represent deterministic nodes. In the lines linked above, we used square shapes to represent deterministic nodes in HSCM. We had not yet added a representation for deterministic variables in NxMixedGraphs of augmented models, although we were planning to.

Potential solutions:

create a node attribute that labels it as deterministic or stochastic (stochastic by default)
create a DeterministicVariable subclass of Variable
create an attribute of HierarchicalCausalModel that is a list of deterministic nodes
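For comparison, the three options could look roughly like this. This is a sketch under stated assumptions: `Variable`, `DeterministicVariable`, and `ToyHierarchicalModel` here are toy stand-ins defined in the example, not the project's actual classes, and the node names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Variable:
    """Toy stand-in for a model variable (hypothetical, not the project's class)."""

    name: str


# Option 1: a per-node attribute; nodes are stochastic unless marked otherwise.
node_attributes: Dict[Variable, Dict[str, bool]] = {
    Variable("Q_Y"): {"deterministic": True},
    Variable("Y"): {},
}


def is_deterministic_by_attribute(node: Variable) -> bool:
    return node_attributes.get(node, {}).get("deterministic", False)


# Option 2: a marker subclass; determinism is encoded in the node's type.
@dataclass(frozen=True)
class DeterministicVariable(Variable):
    """Hypothetical marker subclass for deterministic nodes."""


# Option 3: the model keeps a collection of its deterministic nodes.
@dataclass
class ToyHierarchicalModel:
    nodes: Set[Variable] = field(default_factory=set)
    deterministic: Set[Variable] = field(default_factory=set)

    def add_node(self, node: Variable, *, deterministic: bool = False) -> None:
        self.nodes.add(node)
        if deterministic:
            self.deterministic.add(node)

    def is_deterministic(self, node: Variable) -> bool:
        return node in self.deterministic


model = ToyHierarchicalModel()
model.add_node(Variable("Q_Y"), deterministic=True)
model.add_node(Variable("Y"))
```

Options 1 and 3 keep node identity independent of determinism, so a node can change status (for example after marginalization) without being replaced in the graph; option 2 would require swapping the node object.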

cthoyt · 2025-02-07T09:44:26Z

The second-generation solution for keeping track of subunits and observed variables is to just keep a set of them. We can add another set to keep track of which variables are stochastic; see 9b04642.

adamrupe · 2025-02-19T20:51:49Z

This was my preferred solution as well, although I switched from tracking stochastic variables to tracking deterministic ones: since stochastic is the default, it is more efficient to track the smaller set of deterministic variables.
The main issue with this solution is that we would also need to add a deterministic attribute to the NxMixedGraph class. We are mainly concerned with deterministic variables being created / added during augmentation, which occurs on collapsed models that are represented as NxMixedGraphs.

We will eventually need some kind of check_positivity function that acts on augmented and marginalized models, which are represented as NxMixedGraphs. If we have a DeterministicVariable subclass of Variable, we will have to first loop through all model variables and check whether each is deterministic, whereas if we have an NxMixedGraph.deterministic attribute, we can just loop through that. These models will not be very large, so this shouldn't be much of an issue.
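A rough sketch of what such a check might look like with a tracked set of deterministic nodes. Everything here is an assumption for illustration: the function name `positivity_candidates`, the plain-string node names, and the idea of passing in the variables whose roles require positivity (subunit treatments, front-door mediators, back-door confounders). Since deterministic variables only *may* violate positivity, the result is a list of nodes to review, not a verdict.

```python
from typing import Iterable, Set


def positivity_candidates(
    deterministic: Set[str], must_be_positive: Iterable[str]
) -> Set[str]:
    """Return variables that must satisfy positivity but are deterministic.

    Hypothetical helper: with a tracked set of deterministic nodes, the check
    is a set intersection rather than a type test over every model variable.
    """
    return set(must_be_positive) & deterministic


# Toy augmented model (hypothetical names): Q_A and Q_Y were added during
# augmentation and are deterministic by default.
deterministic_nodes = {"Q_A", "Q_Y"}

# Suppose the query uses A as a subunit treatment and Q_A as a mediator;
# the outcome Q_Y is exempt and is not listed.
flagged = positivity_candidates(deterministic_nodes, ["A", "Q_A"])
```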

adamrupe assigned adamrupe, djinnome and cthoyt Feb 7, 2025

adamrupe added the hierarchical causal models label Feb 7, 2025
