Hypotheses and Corroboration and Data Variation I
May 4, 2011 · Posted by larry in data, Duesenberry, economics, interpretation, Lakatos, Logic, nature of science, Statistics, Suppes.
Duesenberry gives an excellent discussion of the relationship between a theory or hypothesis and a test of that theory or hypothesis. He correctly notes that one can never prove a hypothesis or theory, but he fails to give a reason why this should be so. Nor does he mention the Duhem-Quine problem in the testing of hypotheses.
To simplify the discussion, I will consider the testing of a single hypothesis, but what I say applies to theories as well. The reason that a given hypothesis, H, can never be proved, or verified, is a logical one. Most scientific hypotheses take the form of universal generalizations: for all x, Ax implies Bx. In order to prove or verify that all As are Bs, one would have to be able to inspect, at least in principle, every thing that is an A or a B, past, present and future. This is impossible. Hence, general laws in the form of universal generalizations can never be verified. But they can be falsified, again for logical reasons. All you need to falsify the hypothesis that for all x, Ax implies Bx, that is, that all As are Bs, is to find a single A that is not a B. A simple example is the eponymous generalization, once widely believed, that all swans are white, that is, that for all x, if x is a swan, then x is white. To falsify this, you need to find one black swan, that is, one thing x that is a swan and is not white. Not only is this possible in principle; such swans were actually discovered in Australia. The major difference between a hypothesis H and a theory T is that a theory can be seen as a conjunction of related hypotheses. A hypothesis H can therefore be viewed as a minimal theory.
There is, therefore, an asymmetry between verification and falsification: universal scientific generalizations (scientific laws) cannot be verified, though they can be falsified. Existential generalizations, on the other hand, of the form there is an x such that Ax, e.g., there is at least one swan, can be verified but not falsified. It is possible to show that there is a swan by finding one, but impossible, for logical reasons similar to those above, to prove that there are none on the basis that one has yet to be found. The situation is even more complicated than I have described, involving other factors, but that is for another time. (I do recommend consulting the works of Imre Lakatos, such as The Methodology of Scientific Research Programmes; his conceptual scheme is non-trivial and more than just interesting. See also Patrick Suppes’ article on models of data (http://suppes-corpus.stanford.edu/articles/mpm/41.pdf).) The important lesson to take away is that for a hypothesis to be scientific, it must be falsifiable in principle, although adhering to this requirement involves considerable complexity and is not without its difficulties.
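The asymmetry can be sketched in a few lines of Python (the predicates and observations are hypothetical, chosen to mirror the swan example): a finite search can falsify "all swans are white" by turning up one counterexample, but exhausting a sample without finding one verifies nothing, since unobserved swans remain unexamined.

```python
def is_swan(x):       # hypothetical predicate A
    return x["kind"] == "swan"

def is_white(x):      # hypothetical predicate B
    return x["color"] == "white"

def counterexample(sample, A, B):
    """Return the first x with A(x) and not B(x), else None."""
    for x in sample:
        if A(x) and not B(x):
            return x
    return None

observations = [
    {"kind": "swan",  "color": "white"},
    {"kind": "goose", "color": "grey"},
    {"kind": "swan",  "color": "black"},   # the Australian swan
]

# One counterexample falsifies "all As are Bs" outright.
print(counterexample(observations, is_swan, is_white))
# Its absence in a finite sample would only mean "not yet falsified",
# never "verified".
```

Finding the black swan settles the matter negatively; a `None` result would leave the generalization merely unrefuted so far.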
The Duhem-Quine problem is more complicated. It underlies what is known as the method of saving the hypothesis: according to the Duhem-Quine principle, it is always possible to save a hypothesis from falsification or refutation. This is due to the logical nature of the testing process. To show this, a little technical detail is required. When a hypothesis is tested, the conditions under which the test is conducted, such as the experimental or field conditions, the assumptions about the influence of the observer, and the like, are also under test. The experimental set-up may be wrong, the investigator may be unconsciously influencing the experiment or observation, or the test apparatus may be faulty. This list can be extended ad infinitum, but for all practical purposes it is inevitably finite and small. The logic of the situation is this. Suppose you have a hypothesis H and from it you can deduce a proposition concerning an event E. In the testing scenario just described, you assume H to be true and look to see whether E occurs or not. If you find E, then while you have not proved or verified that H is the case, you have, as Popper would have said, corroborated H. That is, you have made the truth of H appear more likely.
Now suppose that, on the assumption that H is the case, you fail to observe E. One can infer from this only that H or something else being assumed is not the case. The assumptions consist of the conjunction H & C & B & Q, where C denotes the experimental or observational conditions, B the influence or bias of the observer, and Q any additional factors that might be influencing the outcome of the test. So, if E is not observed, instead of falsifying H you can save the hypothesis by rejecting C or B or Q. You can then claim that E really does follow from H; it is just that this particular test failed to substantiate it because the test was flawed.
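The logic can be checked mechanically with a small truth-table sketch in Python (stdlib only; the variable names simply follow the text). The test premise is (H & C & B & Q) → E; observing not-E rules out only the conjunction, leaving plenty of room for H itself to survive:

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q

E = False  # the predicted event was not observed

# Keep every truth assignment to H, C, B, Q still consistent
# with the test premise (H & C & B & Q) -> E.
consistent = [
    dict(H=H, C=C, B=B, Q=Q)
    for H, C, B, Q in product([True, False], repeat=4)
    if implies(H and C and B and Q, E)
]

# Only the all-true assignment is ruled out; in many of the
# survivors H is still true, "saved" by rejecting C, B or Q.
saved = [a for a in consistent if a["H"]]
print(len(consistent), len(saved))  # prints: 15 7
```

Of the sixteen possible assignments, fifteen remain consistent with the failed test, and in seven of those H is still true, which is precisely the room for manoeuvre that the method of saving the hypothesis exploits.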
As Duesenberry discusses, there is another factor involved in the testing of a hypothesis: even should you succeed in corroborating H, all you have shown is that, for the data at hand and under the conditions of the test, H seems to explain the data better than a set of alternatives, not that it is true simpliciter. This state of affairs can, however, change, and another hypothesis can take the place of H as the favored one. This process of replacement can be highly contentious.
As Duesenberry himself notes, the data available to economists are often not very good. Not only that, but the variation inherent in such data more often than not goes unanalyzed. Economists often present data without error coefficients and the like. Nor do they conduct statistical hypothesis tests even when it is not obvious, from ‘eye-balling’ the data, that H0 explains the data better than some alternative from a set of alternative hypotheses, H1, …, Hn, under the conditions of such an informal test scenario. They appear to assume that the data ‘speak for themselves’, which they do not. Data, to make sense, must be interpreted, and that means placing the data in an interpretive context, that is, a theoretical context. Otherwise there is no difference between a set of data and a list of numbers or names in a phone book. In saying this, I am not arguing that statistical hypothesis testing is essential, only that it is not carried out even when it would appear to be helpful. Irrespective of this, data should never be presented without error coefficients, unless the data differences obviously swamp any inevitable errors the data set may contain. But how often is that going to be the case?
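A minimal sketch, using only Python's standard library and entirely hypothetical numbers, of the kind of check the paragraph says is too often skipped: report each mean with its standard error (the sort of "error coefficient" meant above), and compute a rough Welch-style t statistic for the difference, rather than eye-balling two series whose variation may swamp the difference between them.

```python
import statistics

# Two hypothetical small samples whose means differ by eye,
# but whose spread may swamp that difference.
a = [2.1, 2.9, 1.8, 3.2, 2.4]
b = [2.6, 3.4, 2.2, 3.8, 2.9]

def mean_with_error(xs):
    """Sample mean plus its standard error."""
    se = statistics.stdev(xs) / len(xs) ** 0.5
    return statistics.mean(xs), se

ma, sa = mean_with_error(a)
mb, sb = mean_with_error(b)

# Rough Welch-style t statistic for the difference of the means.
t = (mb - ma) / (sa ** 2 + sb ** 2) ** 0.5

print(f"a: {ma:.2f} ± {sa:.2f}")
print(f"b: {mb:.2f} ± {sb:.2f}")
print(f"t ≈ {t:.2f}")
```

With these made-up figures the t statistic comes out well below 2, so the apparent gap between the two means is not obviously more than noise, which is exactly what presenting the data bare, without error terms, would conceal.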
I must mention that this neglect is not universal, in the present or the past. Duesenberry (1949) himself cites references with statistical content – notably Keynes’ ‘A Statistical Testing of Business Cycle Theories’ (1939), Trygve Haavelmo’s ‘The Probability Approach in Econometrics’ (1944), and G. Udny Yule’s ‘Why Do We Sometimes Get Nonsense Correlations’ (1926) – along with eminent social psychologists such as Abram Kardiner and Leon Festinger (whose Theory of Cognitive Dissonance has since influenced Akerlof), the psychoanalyst Karen Horney (The Neurotic Personality of Our Time, 1937), and the social scientist Thorstein Veblen (Theory of the Leisure Class, 1934). There is no reference to Talcott Parsons, who was probably one of the most famous Harvard sociologists (in the Department of Social Relations) with an economic background at the time of the publication of Duesenberry’s Income, Saving and the Theory of Consumer Behavior (1949). It may be that, although both were at Harvard at the time, Duesenberry felt that Parsons’ rather idiosyncratic approach was tangential to his own. I will come back to this issue regarding the different, and possibly not easily reconciled, approaches of sociologists, anthropologists and economists to the fields of economics and political economy.