How to represent deterministic nodes in HierarchicalCausalModel class #271

Open
adamrupe opened this issue Feb 7, 2025 · 2 comments

@adamrupe
Collaborator

adamrupe commented Feb 7, 2025

y0/src/y0/hierarchical.py

Lines 236 to 237 in 1fed7cc

# TODO what's this for? Is it used besides making diagrams?
# hscm.set_shape(node, "square")

A key concept in hierarchical causal models is the distinction between a stochastic variable and a deterministic variable.
The nodes added during the augmentation routine, by default, are deterministic. Deterministic variables can, in some circumstances, be converted to stochastic variables via parent marginalization if the parent is a nuisance variable (is not explicitly mentioned in the causal query) that has no other children.
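The marginalization condition above can be sketched as a small check. This is only an illustration and not y0 code: the graph is a plain dict mapping each node to its set of children, and `marginalizable_parents` and the node names are hypothetical.

```python
def marginalizable_parents(children, node, query):
    """Return the parents of ``node`` that could be marginalized out.

    A parent qualifies if it is a nuisance variable (not in the query)
    and ``node`` is its only child.
    """
    parents = {p for p, kids in children.items() if node in kids}
    return {p for p in parents if p not in query and children[p] == {node}}


# Hypothetical collapsed model: D is deterministic with parents U and X.
children = {
    "U": {"D"},
    "X": {"D", "Y"},
    "D": {"Y"},
    "Y": set(),
}
query = {"X", "Y"}

# U is a nuisance variable whose only child is D, so it is a candidate
# for the marginalization that would make D stochastic.
print(marginalizable_parents(children, "D", query))
```

X is excluded because it appears in the query and also has a second child, Y.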

Importantly, stochastic variables satisfy positivity (common support): every value in a variable's domain is observed or observable no matter what its parents' values are. Deterministic variables may violate positivity, yet positivity may be required for certain causal queries on hierarchical models. For instance, subunit treatment variables must satisfy subunit positivity, and mediators (confounders) must satisfy positivity if they are used in a front door (back door) adjustment. Outcome variables need not satisfy positivity.
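To make the positivity condition concrete, here is a rough finite-sample proxy. It is a sketch under assumptions: `satisfies_positivity` is a hypothetical helper, the data are invented toy rows, and the check only covers parent configurations that actually appear in the data.

```python
from collections import defaultdict


def satisfies_positivity(rows, variable, parents, domain):
    """Check that every value in ``domain`` occurs for each observed parent configuration."""
    seen = defaultdict(set)
    for row in rows:
        config = tuple(row[p] for p in parents)
        seen[config].add(row[variable])
    return all(set(domain) <= values for values in seen.values())


# Stochastic case: M takes both values for each value of X.
stochastic_rows = [
    {"X": 0, "M": 0}, {"X": 0, "M": 1},
    {"X": 1, "M": 0}, {"X": 1, "M": 1},
]
# Deterministic case: D is a function of X, so only one value per parent setting.
deterministic_rows = [
    {"X": 0, "D": 0}, {"X": 0, "D": 0},
    {"X": 1, "D": 1}, {"X": 1, "D": 1},
]

print(satisfies_positivity(stochastic_rows, "M", ["X"], [0, 1]))     # True
print(satisfies_positivity(deterministic_rows, "D", ["X"], [0, 1]))  # False
```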

In the HCM paper v1, deterministic nodes are identified via double arrows from their parents. The authors shifted to representing deterministic nodes with square shapes in later versions / presentations. Pyro uses dashed outlines of nodes to represent deterministic nodes. In the lines linked above, we used square shapes to represent deterministic nodes in HSCM. We had not yet added a representation for deterministic variables in NxMixedGraphs of augmented models, although we were planning to.

Potential solutions:

  • create a node attribute that labels it as deterministic or stochastic (stochastic by default)
  • create a DeterministicVariable subclass of Variable
  • create an attribute of HierarchicalCausalModel that is a list of deterministic nodes
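For comparison, the three options could look roughly like this. All classes here are hypothetical stand-ins written with plain dataclasses, not the actual `Variable`, `NxMixedGraph`, or `HierarchicalCausalModel` from y0, and the third option uses a set rather than a list.

```python
from dataclasses import dataclass, field


# Option 1: a flag on the node itself, stochastic by default.
@dataclass(frozen=True)
class Node:
    name: str
    deterministic: bool = False


# Option 2: a subclass marks deterministic nodes by type.
@dataclass(frozen=True)
class PlainVariable:
    name: str


@dataclass(frozen=True)
class DeterministicVariable(PlainVariable):
    pass


# Option 3: the model keeps the names of its deterministic nodes.
@dataclass
class Model:
    nodes: set = field(default_factory=set)
    deterministic: set = field(default_factory=set)


# "Q_x" is an invented name for a node added during augmentation.
flagged = [Node("X"), Node("Q_x", deterministic=True)]
typed = [PlainVariable("X"), DeterministicVariable("Q_x")]
model = Model(nodes={"X", "Q_x"}, deterministic={"Q_x"})

# Options 1 and 2 scan every node; option 3 reads the collection directly.
from_flag = {n.name for n in flagged if n.deterministic}
from_type = {v.name for v in typed if isinstance(v, DeterministicVariable)}
from_model = model.deterministic
print(from_flag == from_type == from_model)
```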
@cthoyt
Member

cthoyt commented Feb 7, 2025

the 2nd generation solution for keeping track of subunits and observed variables is to just keep track of the set of them. we can also add an additional set to keep track of which variables are stochastic - see 9b04642

@adamrupe
Collaborator Author

This was my preferred solution as well, although I changed from stochastic to deterministic; since stochastic is the default it is more efficient to track fewer deterministic variables.
The main issue with this solution is that we would also need to add a deterministic attribute to the NxMixedGraph class. We are mainly concerned with deterministic variables being created / added during augmentation, which occurs on collapsed models that are represented as NxMixedGraphs.

We will eventually need some kind of check_positivity function that acts on augmented and marginalized models, which are represented as NxMixedGraphs. If we have a DeterministicVariable subclass of Variable, we will have to first loop through all model variables and check whether each is deterministic, whereas if we have an NxMixedGraph.deterministic attribute, we can just loop through that. These models will not be very large, so this shouldn't be much of an issue.
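A set-based check might look something like the sketch below. Everything here is an assumption for illustration: the signature of `check_positivity`, the idea that the caller passes in the variables whose role requires positivity (treatments, or mediators and confounders used for adjustment), and the variable names.

```python
def check_positivity(deterministic, requires_positivity):
    """Return variables that must satisfy positivity but are deterministic.

    Deterministic variables may violate positivity, so any name returned
    here needs attention. An empty result means no deterministic variable
    sits in a role that requires positivity.
    """
    return sorted(set(deterministic) & set(requires_positivity))


# Hypothetical augmented model: both nodes were added during augmentation.
deterministic = {"D_treat", "D_out"}

# D_treat is used as a treatment, so it needs positivity; the outcome D_out does not.
print(check_positivity(deterministic, requires_positivity={"D_treat", "Z"}))
# A deterministic variable that is only an outcome raises no flag.
print(check_positivity(deterministic, requires_positivity={"Z"}))
```

With a graph-level `deterministic` attribute, the first argument would simply be that attribute; with a subclass, it would first have to be collected by filtering all variables.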
