Augmentation Mechanisms #256

Open
adamrupe opened this issue Oct 29, 2024 · 0 comments

Overview

  1. First, we argue that the augmentation mechanism described in Appendix J.1, as far as we understand it, is not sufficient for more complicated subunit graphs.
  2. We then introduce a more general formalism for augmentation mechanisms. This includes the use of intermediary variables representing joint distributions over subunit variables, as well as decomposing the mechanism function into two distinct pieces.
  3. Open question: from the new formalism, it appears that augmentation mechanisms should generally include ALL subunit ancestors, not just subunit direct ancestors. Is this correct?

Despite our confusion about Appendix J.1 (see issue #255), it seems clear that the general form of augmentation mechanisms stated by the authors includes only the ($Q$ variables of the) subunit direct ancestors of the subunit variable being augmented. From our understanding of augmentation, this is not valid for arbitrary subunit graphs.

Dependence on non-direct subunit ancestors

For simplicity, consider a subunit graph with a chain structure, A -> Z -> W, and say that we want to augment $Q^w$ (the variable labels here are arbitrary).
The HCGM is:
chain_HCGM
and the collapsed model is:
chain_col

We now want to augment $Q^w$. The direct subunit ancestor of W is Z, so from (69), I am reading the mechanism as:
$Q^w = f(Q^{w|z}, Q^{z|a})$. However, this does not seem sufficient to specify the distribution $\Pr(W)$, because it carries no information about the distribution of A.
If we work out the probabilities we find that:
$\Pr(W) = \int \int \Pr(A,W,Z) dA dZ$, and we can write $\Pr(A,W,Z) = \Pr(A)\Pr(W,Z | A) = \Pr(A) \Pr(Z|A) \Pr(W| A,Z)$. Now, the chain A -> Z -> W in the subunit graph implies that A and W are independent conditioned on Z. So we can drop A from the last conditional and write $\Pr(A,W,Z) = \Pr(A)\Pr(Z|A)\Pr(W|Z)$, which gives $\Pr(W) = \int \int \Pr(A)\Pr(Z|A)\Pr(W|Z) dA dZ$. Every factor is now a promoted $Q$ variable in the collapsed model.
So the mechanism for $Q^w$ should be $Q^w = f(Q^a, Q^{z|a}, Q^{w|z})$.
chain_augmented1
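
As a quick numerical sanity check of the argument above, here is a minimal sketch with binary subunit variables. The probability tables are made-up illustrative values, and `marginal_w` is a hypothetical helper name, not something from the paper. It holds $Q^{z|a}$ and $Q^{w|z}$ fixed and varies only $Q^a$:

```python
import numpy as np

# Illustrative (assumed) conditional tables for the chain A -> Z -> W.
# Q_z_given_a[a, z] = Pr(Z=z | A=a); Q_w_given_z[z, w] = Pr(W=w | Z=z)
Q_z_given_a = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
Q_w_given_z = np.array([[0.7, 0.3],
                        [0.1, 0.9]])

def marginal_w(Q_a, Q_z_given_a, Q_w_given_z):
    """Pr(W=w) = sum over a, z of Pr(a) Pr(z|a) Pr(w|z)."""
    return np.einsum("a,az,zw->w", Q_a, Q_z_given_a, Q_w_given_z)

# Two different choices of Q^a, same Q^{z|a} and Q^{w|z}.
Q_a_1 = np.array([0.5, 0.5])
Q_a_2 = np.array([0.9, 0.1])

p_w_1 = marginal_w(Q_a_1, Q_z_given_a, Q_w_given_z)
p_w_2 = marginal_w(Q_a_2, Q_z_given_a, Q_w_given_z)

print(p_w_1, p_w_2)
```

The two marginals differ, so in this toy case $Q^{w|z}$ and $Q^{z|a}$ alone cannot determine $Q^w$.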

Note that we could also chain together two augmentation variables (as done in Figure A3 (n) in the Appendix). In this case, we first augment $Q^z$ with parents / mechanism $Q^a$ and $Q^{z|a}$. Then we can augment $Q^w$ as $Q^w = f(Q^{w|z}, Q^z)$. If we do this double augmentation, we get a different augmented graph, but due to the deterministic relations between augmented variables and their parents, it is equivalent to the above augmented graph:
chain_augmented2
In particular, the augmented variable we really care about, $Q^w$, still has a dependence on $Q^a$, this time through $Q^z$.
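
The equivalence of the two routes can be checked numerically. This is only a sketch for discrete (binary) subunit variables with assumed tables; the variable names are ours:

```python
import numpy as np

# Illustrative (assumed) tables for the chain A -> Z -> W.
Q_a = np.array([0.3, 0.7])                 # Pr(A)
Q_z_given_a = np.array([[0.9, 0.1],
                        [0.2, 0.8]])       # [a, z]
Q_w_given_z = np.array([[0.7, 0.3],
                        [0.1, 0.9]])       # [z, w]

# Route 1: single augmentation, Q^w = f(Q^a, Q^{z|a}, Q^{w|z}).
Q_w_direct = np.einsum("a,az,zw->w", Q_a, Q_z_given_a, Q_w_given_z)

# Route 2: first augment Q^z from (Q^a, Q^{z|a}), then Q^w from (Q^{w|z}, Q^z).
Q_z = Q_a @ Q_z_given_a
Q_w_chained = Q_z @ Q_w_given_z

print(np.allclose(Q_w_direct, Q_w_chained))
```

Both routes give the same $Q^w$ because the intermediate $Q^z$ is a deterministic function of $Q^a$ and $Q^{z|a}$.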

In both cases, $Q^w$ is left with a dependence on $Q^a$, despite A not being a subunit direct ancestor of W in the original HCM.

New formalism for augmentation mechanisms

Because the creation of joint distributions is the crucial step in identifying which variables belong in an augmentation mechanism, it seems to make the most sense to explicitly represent joint distributions as some kind of "intermediary" variables in the collapsed model.
We therefore separate augmentation mechanisms into two distinct (deterministic) components.
In the original formulation, an augmentation mechanism is written as, e.g., $Q^x = f(\cdot)$, which combines two operations: creating a joint variable and then marginalizing out everything except $X$. Equivalently, it can be written as the composition of two functions, $f = h \circ g$, where $g$ creates a joint distribution and $h$ marginalizes out everything except $X$.
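
To make the decomposition concrete, here is a minimal sketch for the simplest hypothetical case: a discrete $X$ with a single subunit parent $Y$ (the parent $Y$ and the tables are our assumptions for illustration). The function names `g`, `h`, and `f` mirror the symbols above:

```python
import numpy as np

def g(Q_x_given_y, Q_y):
    """Product step: build the joint, joint[y, x] = Pr(y) * Pr(x | y)."""
    return Q_y[:, None] * Q_x_given_y

def h(Q_joint):
    """Marginalization step: sum out everything except X (axis 0 is Y here)."""
    return Q_joint.sum(axis=0)

def f(Q_x_given_y, Q_y):
    """Augmentation mechanism as the composition f = h o g."""
    return h(g(Q_x_given_y, Q_y))

# Illustrative (assumed) tables.
Q_y = np.array([0.4, 0.6])
Q_x_given_y = np.array([[0.5, 0.5],
                        [0.25, 0.75]])   # [y, x]

Q_xy = g(Q_x_given_y, Q_y)   # intermediary joint variable
Q_x = f(Q_x_given_y, Q_y)    # marginal augmentation variable
```

Keeping `g` and `h` separate is what lets the joint `Q_xy` appear as its own node in the collapsed model.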

Consider the following subunit graph with a diamond motif:
diamond
The promoted $Q$ variables in the collapsed model are:
$Q^c$
$Q^{b|c}$
$Q^{d|c}$
$Q^{a | b,d}$
Assume that we want the marginal subunit distribution over $A$, so we want to augment $Q^a$.
Since we have $\Pr(A | B,D)$, the chain rule gives:
$\Pr(A) = \int\int \Pr(A, B, D) dB dD = \int \int \Pr(A| B,D) \Pr(B,D) dB dD$
This requires the joint distribution $\Pr(B, D)$, which does not factor into $\Pr(B)\Pr(D)$ because $B$ and $D$ both depend on $C$ and so are not independent. However, from the conditional independence of $B$ and $D$ given $C$, we have that $\Pr(B, C, D) = \Pr(B|C) \Pr(D|C) \Pr(C)$.
And so we create the intermediary joint variable $Q^{b,c,d} = g(Q^{b|c}, Q^{d|c}, Q^c)$. We can then marginalize out $C$ to get another intermediary joint variable $Q^{b,d} = h(Q^{b,c,d})$. From the chain rule, we can create a final intermediary variable as $Q^{a,b,d} = g(Q^{a|b,d}, Q^{b,d})$, and then arrive at our desired augmentation variable $Q^a = h(Q^{a,b,d})$.
This is shown graphically as follows:

  • Grey circle nodes are the original promoted $Q$ variables from collapsing the HCM
  • Diamond nodes are intermediary joint variables
  • Square nodes are marginal augmentation variables
  • Blue nodes are determined from their parents according to the product function $g$
  • Red nodes are determined from their parent by the marginalization function $h$

diamond_2joints
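
The four steps for the diamond motif can be traced numerically. This is a sketch for binary subunit variables with made-up tables; the array index orders are our own convention and are noted in the comments:

```python
import numpy as np

# Illustrative (assumed) promoted Q variables for the diamond C -> B, C -> D, (B, D) -> A.
Q_c = np.array([0.6, 0.4])
Q_b_given_c = np.array([[0.8, 0.2],
                        [0.3, 0.7]])             # [c, b]
Q_d_given_c = np.array([[0.5, 0.5],
                        [0.1, 0.9]])             # [c, d]
Q_a_given_bd = np.array([[[0.9, 0.1], [0.6, 0.4]],
                         [[0.5, 0.5], [0.2, 0.8]]])  # [b, d, a]

# g: Q^{b,c,d} = Pr(B|C) Pr(D|C) Pr(C), indexed [b, c, d]
Q_bcd = np.einsum("c,cb,cd->bcd", Q_c, Q_b_given_c, Q_d_given_c)
# h: marginalize out C to get Q^{b,d}, indexed [b, d]
Q_bd = Q_bcd.sum(axis=1)
# g: Q^{a,b,d} = Pr(A|B,D) Pr(B,D), indexed [b, d, a]
Q_abd = Q_a_given_bd * Q_bd[:, :, None]
# h: marginalize out B and D to get Q^a
Q_a = Q_abd.sum(axis=(0, 1))

# B and D are not marginally independent, so Pr(B, D) != Pr(B) Pr(D).
Q_b = Q_bd.sum(axis=1)
Q_d = Q_bd.sum(axis=0)
factorizes = np.allclose(Q_bd, np.outer(Q_b, Q_d))

# Cross-check against a direct sum over all of B, C, D.
Q_a_check = np.einsum("c,cb,cd,bda->a", Q_c, Q_b_given_c, Q_d_given_c, Q_a_given_bd)
```

In this toy case `factorizes` is `False`, which is why the intermediary $Q^{b,c,d}$ has to be built from $Q^c$ rather than from $Q^{b|c}$ and $Q^{d|c}$ alone.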

This formalism with intermediary joint variables is a more intuitive approach to creating augmentation mechanisms, and has a lot of flexibility following the chain rule. For example, a straightforward thing to do is to simply consider the joint distribution over all subunit variables. From the chain rule and conditional independencies of the subunit graph, we have the following for the diamond motif:
diamond_1joint

Open question: is the joint distribution over all subunit variables (that are not fully independent of the desired augmentation variable $X$) given by the promoted $Q$ variables of all of the subunit ancestors (including non-direct) of $X$?

This appears to be the case for all examples we've looked at, but we need to investigate it more rigorously. It would have to follow from the DAG structure of the subunit graph and the resulting conditional (in)dependencies.
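
For experimenting with this question on other subunit graphs, here is a small sketch that enumerates the candidate parent set under the conjecture. It does not test the conjecture itself. The helper names `ancestors`, `q_label`, and `candidate_q_parents` are hypothetical, and the graph encoding (a dict mapping each node to its set of parents) is our own choice:

```python
def ancestors(parents, node):
    """All (direct and non-direct) ancestors of `node` in a DAG given as {node: set_of_parents}."""
    seen = set()
    stack = list(parents[node])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen

def q_label(parents, node):
    """Label of the promoted Q variable for `node`, e.g. 'Q^{a|b,d}' or 'Q^c'."""
    pars = sorted(p.lower() for p in parents[node])
    if not pars:
        return "Q^" + node.lower()
    return "Q^{" + node.lower() + "|" + ",".join(pars) + "}"

def candidate_q_parents(parents, node):
    """Promoted Q variables of `node` and all of its ancestors (the conjectured mechanism inputs)."""
    return {q_label(parents, n) for n in ancestors(parents, node) | {node}}

diamond = {"C": set(), "B": {"C"}, "D": {"C"}, "A": {"B", "D"}}
chain = {"A": set(), "Z": {"A"}, "W": {"Z"}}

diamond_inputs = candidate_q_parents(diamond, "A")
chain_inputs = candidate_q_parents(chain, "W")
```

For the two examples in this issue, the resulting sets match the promoted $Q$ variables used in the mechanisms worked out by hand.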

Because deterministic variables (diamond and square nodes) are deterministic functions of their parents, we can just remove them as needed and connect their parents to their children.
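
This removal step can be sketched as a graph operation. The encoding below (a dict mapping each node to its set of parents, with string labels for the $Q$ variables) and the function name `remove_deterministic_node` are our assumptions. The example graph is our reading of the chain A -> Z -> W with two intermediary joints, $Q^{a,z}$ and $Q^{z,w}$:

```python
def remove_deterministic_node(parents, node):
    """Return a new parent map with `node` removed and its parents wired to its children."""
    node_parents = parents[node]
    new_parents = {}
    for child, pars in parents.items():
        if child == node:
            continue  # drop the removed node itself
        if node in pars:
            # Replace the removed node by its own parents.
            new_parents[child] = (pars - {node}) | node_parents
        else:
            new_parents[child] = set(pars)
    return new_parents

# Assumed encoding of the chain with two intermediary joint variables.
two_joints = {
    "Q^a": set(),
    "Q^{z|a}": set(),
    "Q^{w|z}": set(),
    "Q^{a,z}": {"Q^a", "Q^{z|a}"},   # g: joint over (A, Z)
    "Q^z": {"Q^{a,z}"},              # h: marginalize out A
    "Q^{z,w}": {"Q^{w|z}", "Q^z"},   # g: joint over (Z, W)
    "Q^w": {"Q^{z,w}"},              # h: marginalize out Z
}

# Erase both intermediary joints: this gives the doubly augmented graph.
double_aug = remove_deterministic_node(two_joints, "Q^{a,z}")
double_aug = remove_deterministic_node(double_aug, "Q^{z,w}")

# Erase the intermediate Q^z as well: this gives the singly augmented graph.
single_aug = remove_deterministic_node(double_aug, "Q^z")
```

After erasing the joints, `double_aug["Q^w"]` is `{"Q^{w|z}", "Q^z"}`, and after also erasing $Q^z$, `single_aug["Q^w"]` contains $Q^a$, $Q^{z|a}$, and $Q^{w|z}$.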

To close, let's revisit the first example above: a subunit graph with the chain A -> Z -> W, where we want to augment $Q^w$. Both cases, with the single augmented $Q^w$ and with the intermediate augmented $Q^z$, follow from different applications of the chain rule in the new formalism. As with the diamond motif example, we can equivalently consider the full joint distribution over the subunits, $\Pr(A, W, Z)$, or the two joint distributions $\Pr(A,Z)$ and $\Pr(Z, W)$. The two different graphs in the new formalism are given as:

chain_1joint
for the full joint distribution over subunits,
and

chain_2joints
for the two separate joint distributions.
Erasing the intermediary joint distributions (diamond nodes) recovers the two augmented graphs given above.
