
The reason we care about the long-run probability of the null is not that we're interested in 'establishing causality' per se. If there is a link between two variables in the actual population, then there is a causal link between them. There has to be some reason why the variables take on the values that they do.

However, researchers tend to distinguish between causal and spurious relationships, or relationships that are 'due to chance' (whatever that means) and relationships that are not. There is, for example, a relationship between pool drownings and the number of films Nicolas Cage appeared in from 1999 to 2009. Researchers call relationships like these non-causal. But of course there is something that caused this relationship. What researchers seem to mean when they say there is no causal relationship between two processes is that there is no causal relationship specified by the theory or model the researcher is testing. When they say that an effect isn't causal (as in the case of drownings and Nicolas Cage), what they're really saying is that they don't know of any interesting theory that would explain the relationship.

The statement that there is no interesting explanation for an observed relationship is equivalent to the statement that the observed relationship would all-else-equal not obtain in the 'long-run.'

[[[Very good. I don't like the phrase "due to chance" because it has a particular kind of vagueness that people use to smuggle in conclusions that aren't warranted BUT I DO like your way of explicating it as "some relationship not covered by our model", and I think it's sometimes reasonable to extend that to "some relationship that we reasonably believe isn't interesting". Jason]]]

It might be the case that certain facts generally obtain that all-else-equal cause one process to influence another in a way specified by an interesting model. By 'generally obtain' I mean 'obtain in most possible populations that are relevantly similar to the actual one,' or 'obtain in most populations contained in the superpopulation.' Suppose, for example, that a political scientist wanted to test the theory that people generally (in other similar populations) vote for parties that advance their narrower material interests. Suppose also that said political scientist managed to collect data which suggested that higher-income voters in Australia (or other hypothetical 'Australias') all-else-equal voted for the Coalition. The data would support the theory that people vote for self-interested reasons all-else-equal.

However, all else is not equal. First of all, there may be other facts which generally obtain that the theory we are testing does not specify (for example, racial differences between high- and low-income earners). The way to solve this problem is to ensure that the model we're using to generate the relevant estimates is fully specified or that the experiment we're conducting controls for as many irrelevant factors as practicable.

[[Seriously? That's way too hard, isn't it? Maybe you mean some sort of approximation to that?]]

Second, there may be certain facts about the actual population which bring about the link between the variables we're interested in but which do not generally obtain in other populations where the variables take on the values that they do.

[Good. I like this point a lot. Might want to add something about our reasonably guessing when that's the case.]

Returning to the Nic Cage example, it might be that Nic Cage happened to appear in more movies in hotter years, when people spent more time swimming in pools. The thought is that weather patterns could just as easily have been cooler without Nic Cage's career taking a nosedive. In other words, you are just as likely to pull a population from the superpopulation in which there is a correlation between Nic Cage appearances and drowning as you are to pull a population where there is not. In the long run, then, the effects will 'cancel out' and the null will be true.
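One way to make the 'cancel out' claim concrete is a small simulation. The sketch below is only an illustration under assumptions that are not in the text above: each hypothetical population consists of 11 yearly observations (1999 to 2009), film counts and drownings are drawn as independent standard normal values (so the null is true by construction), and the names `draw_population` and `simulate` are made up for this example. It reads 'just as likely' as 'positive and negative correlations are equally likely across the superpopulation.'

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def draw_population(rng, n_years=11):
    """Draw one hypothetical population and return its correlation.

    Assumption: films and drownings are generated independently,
    so no interesting mechanism links them in the superpopulation.
    """
    films = [rng.gauss(0, 1) for _ in range(n_years)]
    drownings = [rng.gauss(0, 1) for _ in range(n_years)]
    return pearson(films, drownings)


def simulate(n_populations=10_000, n_years=11, seed=0):
    """Correlations observed across many draws from the superpopulation."""
    rng = random.Random(seed)
    return [draw_population(rng, n_years) for _ in range(n_populations)]


correlations = simulate()

# Average correlation across populations: the 'long-run' quantity.
long_run_mean = mean(correlations)

# Share of individual populations that nonetheless show a sizeable correlation.
share_strong = sum(abs(r) > 0.5 for r in correlations) / len(correlations)

print(f"mean correlation across populations: {long_run_mean:.3f}")
print(f"share of populations with |r| > 0.5: {share_strong:.3f}")
```

Under these assumptions, a noticeable minority of individual populations show a fairly strong correlation even though the average across populations sits close to zero. That is the sense in which an actual population can exhibit a relationship that does not generally obtain.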

I suggest you rephrase the points above where I had trouble with them, and then keep all this, and add a summary at the end.