Can we estimate a potential outcome, that is, the outcome we would have seen had we done things differently? We can observe what actually happened, but not the counterfactual world.

The formula can be read as the value that Y would have taken for individual u **had X been** assigned the value x.

The answer is yes. We can estimate something that never happened if we have a model describing the causal relationships among the exposures, the other variables, and the outcome.

The value that Y would have taken for individual u had X been assigned the value x is obtained by intervening on X in the model: we set X = x and remove every arrow pointing into X.
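To make "no more arrows pointing into X" concrete, here is a minimal sketch of an intervention as graph surgery. The three-variable graph (Z, X, Y) and the helper `intervene` are hypothetical and used only for illustration; they are not part of the salary example.

```python
def intervene(parents, node):
    """Return a copy of the graph in which `node` has no incoming arrows.

    `parents` maps each variable to the set of variables pointing into it.
    """
    mutilated = {child: set(pa) for child, pa in parents.items()}
    mutilated[node] = set()  # X is now set from outside, so nothing causes it
    return mutilated


# Hypothetical graph: Z -> X, Z -> Y, X -> Y
parents = {"Z": set(), "X": {"Z"}, "Y": {"X", "Z"}}

after_do = intervene(parents, "X")
print(after_do["X"])  # X no longer listens to Z
print(after_do["Y"])  # Y still listens to both X and Z
```

Only the arrows into X are cut. Every other relationship in the model stays intact, which is what lets the model answer what happens downstream of the forced value.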

Let’s study an example in order to illustrate how to do it and compare it with two other commonly used methods.

EX = years of experience; ED = education (0 = secondary school, 1 = college, 2 = graduate); S = salary (S0 = salary with a secondary school education, and similarly for S1 and S2)

Assume that salary is affected by experience (EX) and education (ED). What if Bert had a graduate degree? In today's machine learning practice, a familiar technique is to **impute missing data** by matching on similar individuals. Caroline matches Bert's experience exactly and has a graduate degree, so under this approach Bert's salary with a graduate degree is taken to be the same as Caroline's.

What's wrong with this approach? It does not take the underlying mechanism into account. Suppose experience depends on education. Caroline has the same experience as Bert even though she earned an extra degree, so she is likely to be smarter than Bert and is not a fair match for him. Had Caroline stopped at one degree, she could have used the extra time to gain more experience. So Caroline's salary will be higher than what Bert would earn with a graduate degree.

There is another commonly used approach: **linear regression**. After we fit the data, we get the following line:

To estimate Alice's salary if she had a college degree, we plug her numbers into the equation, and her salary comes out at $65,000.

What's wrong with this approach? Again, it does not take the underlying mechanism into account. Consider the following causal diagram:

In this model, education also affects experience. Therefore, Alice's salary would be higher than $65,000 if she had a college degree. U in the formula below is an unobserved variable.

Although this is the same formula as the regression line, the information embedded in it is different. Here, the equation means that S is listening to EX and ED. The absence of an equation for ED means that it is not affected by anything else in the model. With the equation for EX, we can estimate the impact of ED on EX, which is missing from the regression formula.

Solving the equation, we can find that Alice’s salary would be $76,000 if she had a college degree.
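The calculation can be sketched in three steps: recover Alice's unobserved terms from her observed data (abduction), set ED to the new value (action), and recompute EX and then S (prediction). The coefficients and Alice's observed values below (EX = 6, ED = 0, S = 81,000) are assumptions made for illustration, because the equations and data table are not reproduced in the text; they were chosen so that the result agrees with the $76,000 figure.

```python
# Assumed structural equations (illustrative values, not given in the text):
#   EX = 10 - 4 * ED + U_EX
#   S  = 65_000 + 2_500 * EX + 5_000 * ED + U_S


def experience(ed, u_ex):
    return 10 - 4 * ed + u_ex


def salary(ex, ed, u_s):
    return 65_000 + 2_500 * ex + 5_000 * ed + u_s


def counterfactual_salary(ex_obs, ed_obs, s_obs, ed_new):
    # 1. Abduction: recover the individual's unobserved terms from the data.
    u_ex = ex_obs - experience(ed_obs, 0)
    u_s = s_obs - salary(ex_obs, ed_obs, 0)
    # 2. Action: set ED to ed_new (ED has no equation, so nothing to cut).
    # 3. Prediction: let the change flow through EX and then into S.
    ex_new = experience(ed_new, u_ex)
    return salary(ex_new, ed_new, u_s)


# Assumed observed data for Alice: EX = 6, ED = 0, S = 81,000.
alice_s1 = counterfactual_salary(ex_obs=6, ed_obs=0, s_obs=81_000, ed_new=1)
print(alice_s1)  # 76000
```

The key difference from plugging numbers into a regression line is step 3: experience is recomputed under the new education level instead of being held at its observed value, and Alice's individual terms U_EX and U_S are carried over unchanged.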