Before talking about how confounding should be defined, let’s talk about how it has been defined in the past.

# Some declarative definitions

A confounder is any variable that is correlated with both X and Y.

A confounder of X and Y is a variable Z that is (1) associated with X in the population at large and (2) associated with Y among people who have not been exposed to the treatment X. A third condition was added in recent years: (3) Z should not be on the causal path between X and Y.

Consider this example: X –> Z –> Y.

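Z is correlated with both X and Y, so it satisfies the first definition. But it is not a confounder. It is the mediator that carries the causal effect of X on Y, so controlling for Z is a disaster: it blocks the very effect you are trying to measure. This example also fulfills criteria (1) and (2) of the second definition, which is why criterion (3) had to be added.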
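To make this concrete, here is a minimal simulation sketch of the chain X –> Z –> Y. Everything in it is an assumption chosen for illustration: a binary treatment X assigned at random, linear effects with unit coefficients, standard normal noise, and the sample size and seed. It only checks criteria (1) and (2) for a Z that is, by construction, a mediator.

```python
import numpy as np

# Hypothetical simulation of the chain X -> Z -> Y (all numbers are assumptions).
rng = np.random.default_rng(0)
n = 20_000

x = rng.integers(0, 2, size=n)   # binary treatment, assigned at random
z = x + rng.normal(size=n)       # Z is caused by X
y = z + rng.normal(size=n)       # Y is caused by Z only

# Criterion (1): Z is associated with X in the population at large.
corr_zx = np.corrcoef(z, x)[0, 1]

# Criterion (2): Z is associated with Y among the unexposed (X == 0).
unexposed = x == 0
corr_zy_unexposed = np.corrcoef(z[unexposed], y[unexposed])[0, 1]

print(f"corr(Z, X) overall:      {corr_zx:.2f}")
print(f"corr(Z, Y) among X == 0: {corr_zy_unexposed:.2f}")
```

Under these assumptions both correlations come out clearly positive, so Z passes criteria (1) and (2) even though nothing in the data-generating process makes it a confounder.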

# Some procedural definitions

If you suspect a confounder, try adjusting for it and try not adjusting for it. If the two results differ, it is a confounder, and you should trust the adjusted value. If they do not differ, you are off the hook.

Consider the same example as above: X –> Z –> Y.

If you suspect Z is a confounder, adjusting for it shows that X is uncorrelated with Y, while not adjusting for it shows that X is correlated with Y. The procedure then tells you to trust the adjusted value. But Z is not a confounder. Disaster again.
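The failure of the procedural test can be sketched with a small simulation. The setup is assumed purely for illustration (binary X assigned at random, unit linear effects, standard normal noise, a fixed seed), and `x_coefficient` is a hypothetical helper, not part of any library. Here "adjusting for Z" is taken to mean including Z as a regressor in an ordinary least-squares fit.

```python
import numpy as np

# Hypothetical chain X -> Z -> Y; the true total effect of X on Y is 1 by construction.
rng = np.random.default_rng(0)
n = 20_000
x = rng.integers(0, 2, size=n).astype(float)
z = x + rng.normal(size=n)
y = z + rng.normal(size=n)


def x_coefficient(outcome, *regressors):
    """Least-squares coefficient of the first regressor, fitted with an intercept."""
    design = np.column_stack([np.ones(len(outcome)), *regressors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1]  # index 0 is the intercept


unadjusted = x_coefficient(y, x)    # regress Y on X only
adjusted = x_coefficient(y, x, z)   # regress Y on X and Z

print(f"effect of X, not adjusting for Z: {unadjusted:.2f}")
print(f"effect of X, adjusting for Z:     {adjusted:.2f}")
```

In this sketch the unadjusted estimate lands near the true effect of 1, while the adjusted estimate lands near 0. The two values differ, so the procedure would label Z a confounder and tell you to trust the adjusted value, which is exactly the wrong one.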