The Saga of PLS


by Gaston Sanchez

The Philosopher of Science

During the 1950s and 1960s Wold’s research interests broadened from econometrics to non-experimental analysis in general, and to the philosophy of science. Throughout the 1950s he dedicated a considerable amount of time to discussing notions of model building, and a lot of his work revolved around the prickly notion of causality. These topics (causality and model building) were among Herman Wold’s favorite subjects of study. Everything that had to do with theorizing about causation and models deserved his attention, especially as viewed from the more abstract and intellectual perspective of the philosophy of science. In particular, Wold was part of the main trend among econometricians who were keenly interested in making econometrics as scientific as possible. To achieve this, it was believed that researchers needed to be able to apply a scientific method of proposing hypotheses and testing them.

In Wold’s eyes, based on his extensive hands-on experience, research in economics as well as in other social sciences had “two heavy handicaps.” The first handicap, Wold believed, had to do with working “without the guidance and support of controlled experiment, the supreme tool of natural sciences.” This was not peculiar to economics; it was common to the other social sciences. The second major handicap had to do with the data, which were “notoriously unreliable” for the most part, and sometimes even “scarce or completely lacking.” Suffering from those drawbacks, “it is no wonder,” Wold concluded, “that quantitative economic research displays little of the rigour and precision attained in many natural sciences.”

If anything characterized the data Herman worked with, it was that they were observational, or nonexperimental. His preferred tool of analysis was without a doubt the method of Least Squares, which he believed was better suited to nonexperimental settings. At the time, the method of Maximum Likelihood (ML) was becoming extremely popular in most scientific fields. Introduced by Ronald Aylmer Fisher in the early 1920s, ML provided an elegant statistical mechanism for theory testing. Having emerged from agricultural and biological applications, where experiments were the rule rather than the exception, ML proved to be highly valuable. Imported into econometrics by Trygve Haavelmo, ML had also been welcomed there and given a privileged place over other estimation methods such as Least Squares.

Maximum Likelihood, as attractive and convenient as it can be, requires the researcher to know the distribution of the analyzed data and the observations to be independent, assumptions that in practice may be impossible to satisfy. For Herman Wold, the price to pay for using Maximum Likelihood on nonexperimental socioeconomic data was unacceptable. It was not that he rejected the method of ML per se; what he had trouble accepting were the assumptions under which ML had to be applied. He argued that socioeconomic observational data did not meet those assumptions. In economic analysis, it was not possible to run experiments and manipulate variables as in a laboratory or biological setting.

Thinking about Causality

As if the lack of experiments and the poor quality of data were not enough, a “more serious” issue, Wold observed, was the “research attitude” adopted in economics. “In economics,” Wold wrote, “these checks on the scientific conscience are weaker.” Consequently, “the average quality of the research is lower.” All this was “paradoxical” in Herman’s eyes. Given all the “big arsenal of efficient methods that modern statistics has placed at the research work’s disposal,” how was it possible to have so many results of poor quality? His main conclusion was that the statistical methods, such as Maximum Likelihood, had “been devised primarily for the treatment of experimental data,” not for observational data, as all economic data were. Wold pointed to analysis of variance and the maximum likelihood method, saying that they were “handicapped when applied to economic statistics or other non-experimental data.” The solution, instead, lay in using more traditional methods “which have sometimes been declared obsolete,” Wold wrote, “like the least squares regression.” In Wold’s opinion, “these methods are essentially sound.”

Herman Wold was deeply concerned with identifying causal relations that could be used for predictions and forecasts. He didn’t like the idea of finding “laws,” which he thought was hard to do in economics given that one could not do experiments. But he believed it was possible to discern and determine which variables were the causes of a certain target or response variable. The problem was that, at the time, nobody was willing to talk about relations in causal terms, something that Wold found deplorable. For example, there was, according to Wold, “a deliberate disregard of causal interpretation” in the way simultaneous equations were being used. Moreover, the terminology used in simultaneous equation models “avoids the concept of causality.” Such an evasion was inconceivable from Wold’s position, so much so that it was for him “a break with scientific tradition by and large.” “The concept of causality,” Wold wrote, “is indispensable and fundamental to all sciences.” Terms that Wold considered to be “with a causal content” were influence, dependence, effect, stimulus-response, and active substance. Terms like functional relation or predictability, in Wold’s opinion, hid a causal meaning.

“if scientific analysis were stripped of all terms with a causal content, nothing would remain but description and formalism.”

“There was the question of causal interpretation of OLS regressions,” Wold recalled, although “causal arguments were still practically taboo in the socioeconomic and behavioral sciences.” At that time, the notion of causality was a hot topic in the social sciences, where causal analysis and causal models had attracted a lot of interest. Simply put, Herman Wold was among those thinkers who understood causality as discerning, among a set of variables, which ones should be considered causal. To Wold, knowing which variable causes which within a system of equations was of paramount importance.

However, reading Wold’s papers about causality and model building, it is not clear exactly what definition he gave to the notion of causality. A good example of this issue is Wold’s article “Causality and Econometrics,” which appeared in Econometrica in 1954. This is one of his most enigmatic pieces of work, with a dense writing style, rhetorical questions, murky structure, and heavy philosophical content. It is doubtless the most philosophical of all his publications. Despite the title, the article is not so much about econometrics as about philosophy of science. In it, Wold attempts to set out his stance in the debate and controversy about causality in econometrics. The paper can be divided into three acts.

Act I is about the philosophical discussion of causality. Above all, it is a response to Bertrand Russell’s seminal paper “On the Notion of Cause.” It is also the opportunity for Wold to reveal his philosophical preference: “Logical Empiricism.” For Wold, it is necessary (almost mandatory) to use the term “causality” in connection with the terms of scientific models. For him, just talking about “functional relation” or “predictability” is not enough.

Even though Wold acknowledges that “the first requisite for a fruitful discussion about causality is an adequate definition of the concept,” he does not provide such a definition. “What then is to be understood by causality?” Herman asks. But he is not very precise in his answer. Instead, he talks about “causal relationship,” viewed from two perspectives: controlled experiments, as in the natural sciences, and nonexperimental cases, as in the social sciences and econometrics in particular.

In Act II Wold presents his famous and recurrent duality, which he illustrates in the form of a contingency diagram. On one side there are description and explanation. On the other side there are the non-experimental and the experimental aspects. Finally, Act III touches on causal concepts in econometric models. More specifically, he delves into his recursive models estimated by OLS, ending in causal chains. One of the few occasions on which he mentions the word “structural” is on page 176. The term structural, as used for structural models in the econometrics literature, does not mean the same thing for Wold. For him, “the causal interpretation of the relationships is entirely different.”

“It is fundamental,” Wold said, to separate “empirical observations” (or facts) from “speculative thoughts” (or theory). He recognized that the “weakest point in the philosophical discussion” was an adequate definition of the causality concept. For Wold, in most discussions the concept was either “not defined at all” or simply “too narrow.” “What then is to be understood by causality?” Wold asked. The answer he provided was that of “a causal relation in the case of a controlled experiment.” In the case of nonexperimental observations, as in economics, a relationship is causal “if it is theoretically permissible to regard the variables as involved in a fictive controlled experiment,” with some variables taken as causes and one as the effect. “In the experimental case the typical situation is one in which the causal relation enters as a hypothesis to be tested or demonstrated.” Wold stressed that “the test of the causal hypothesis is more direct and indisputable in the experimental case.”

Perhaps one of the best explanations of Wold’s stance can be found in American sociologist Hubert Blalock’s 1991 paper “Are There Really Any Constructive Alternatives to Causal Modeling?” (p. 327). Blalock writes:

“The handful of us who introduced causal modeling, path analysis, and structural equation modeling into the sociological literature during the 1960s encountered a problem similar to that faced by Wold (personal communication) a decade or so earlier. At the time, regression analysis was being used atheoretically and causation was a dirty word.”

In the same paper:

“our attention at the time was devoted to developing the rationale for employing statistical techniques along with supplementary assumptions as an aid for assessing the fit between empirical data and predictions made from causal models about covariances and temporal sequences.”

As Blalock admits, “Much of the flavor of that earlier literature has now been lost from view.”

An Intellectual Turmoil

While it is true that Wold was interested in various topics, most of his personal research was focused on the problem of simultaneous equations. The lack of a definite solution based on Ordinary Least Squares had become a ticking bomb in his mind. As the 1950s approached their end, Herman Wold was having more and more difficulty handling all the prickly issues associated with causation and model building.

A few months before turning fifty, Herman Wold packed his suitcases and flew to the United States, taking some time off from his chair at the Statistics Institute of Uppsala University. He had gladly accepted the invitation to spend some months between 1958 and 1959 as a Visiting Professor at Columbia University, where he expected to make some progress on econometric systems of multirelational equations. This topic was in reality the outer layer of a deeper problem, something much less mathematical and much more philosophical: the causal interpretation of statistical models. After a decade of vigorous attempts to make his point to the rest of the econometrics community, the harder he pushed on the causality issues, the more isolated he found himself. Trapped in the intellectual maze of his philosophical thoughts and his statistical modeling approaches, Herman hoped that trading the Scandinavian air for the stimulating New York atmosphere would help him find a solution.

Somewhere during his visiting period at Columbia University, Wold found himself lost in the middle of his philosophical arguments and his econometric multirelational systems. “An intellectual turmoil,” Wold wrote, “built up within me.” Unable to include a satisfactory causal specification in his models, he finally “gave up.”

How does all this relate to PLS? Well, the end of the 1950s is crucial for the later development of PLS. Wold had already proved that Least Squares could be applied to recursive models. His big obstacle, however, was the estimation of non-recursive systems, or what Wold used to call interdependent models. This was one of the biggest challenges he had to deal with in his research career, at least during the 1950s. In his quest to find an OLS solution for non-recursive systems, but also in his journey across the causal labyrinth, all the roads he explored were dead ends. One attempt after another, Wold was not able to solve the non-recursive problem and he finally ended up in a blind alley.

A New Start

At age 50, and after a decade of work, Herman had still not found the dream solution for interdependent systems. He was proud of his salvage of OLS applied to causal chains; it was what kept him afloat. But exhausted after many years of wandering with no clear direction, he returned to the shore. Torn between his philosophical causal notions and his statistical reasoning, Herman “decided to push the causal arguments on causal chains and interdependent systems aside, and make a new start.”

If something was clear to Wold amid all his intellectual turmoil, it was his blind faith in Least Squares, which he was not willing to abandon. Pushing the causal arguments aside was a radical decision: it meant giving up all he had done and starting over from a new conceptual and philosophical perspective. Wold made his new start on the basis of what he called predictor specification. This new approach in turn led very soon to the Fix-Point method and eventually to Partial Least Squares.

For his new start, Herman found the solution in a couple of convenient assumptions about the conditional expectation in regression analysis. This offered him a way to tame the causality whirlwind and encapsulate it in a statistical concept that would let him speak in terms of prediction while, most importantly, continuing to use OLS for estimation.

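Roughly speaking, if we assume a functional relation between a response variable $y$ and a set of predictors $x_1, \dots, x_p$:

$$y = f(x_1, \dots, x_p) + \varepsilon$$

the conditional expectation:

$$E(y \mid x_1, \dots, x_p) = f(x_1, \dots, x_p)$$

implies that the term $\varepsilon$ is uncorrelated with the $x$ variables. If we take a step further and assume a linear function:

$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon$$

then:

$$E(y \mid x_1, \dots, x_p) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$

which is the well known regression expression. Under mild assumptions, the regression coefficients can then be estimated by OLS.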
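
The idea can be checked numerically. The following is only a minimal sketch, not anything taken from Wold's writings: the coefficient values, the sample size, and the helper name `fit_ols` are hypothetical choices made for illustration. It simulates a response whose conditional expectation is linear in two predictors, with deliberately non-normal predictors and noise, and then estimates the coefficients by OLS using NumPy.

```python
import numpy as np


def fit_ols(X, y):
    """Return OLS coefficients (intercept first) for y regressed on X.

    Hypothetical helper for this sketch; it wraps numpy.linalg.lstsq.
    """
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 5000
true_coef = np.array([1.0, 2.0, -0.5])  # assumed intercept and two slopes

# Neither the predictors nor the noise are normal: only the conditional
# expectation E(y | x) = b0 + b1*x1 + b2*x2 is assumed to hold.
X = rng.exponential(scale=1.0, size=(n, 2))
noise = rng.uniform(-1.0, 1.0, size=n)  # mean zero, independent of X
y = true_coef[0] + X @ true_coef[1:] + noise

coef = fit_ols(X, y)
residuals = y - (coef[0] + X @ coef[1:])

# Sample correlation between the fitted residuals and each predictor.
# With an intercept in the model this is zero by construction of OLS,
# mirroring the population property implied by the conditional expectation.
resid_corr = np.array(
    [np.corrcoef(residuals, X[:, j])[0, 1] for j in range(X.shape[1])]
)

print("estimated coefficients:", coef)
print("residual/predictor correlations:", resid_corr)
```

No distributional form is specified anywhere in the sketch; the estimates land near the assumed coefficients simply because the conditional expectation is linear and the sample is large.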

Today we don’t even stop to think about the philosophical causal implications of this expression. However, to someone as deeply concerned with causation and model building as Wold, this was not a matter of minor importance. He first referred to that relation as the unbiased predictor (Wold, 1963), then he changed it to eo ipso predictor, and finally he ended up calling it predictor specification. To Wold, this conditional expectation provided the “general rationale for LS specification and LS estimation.” Predictor Specification, or Presp as he used to call it, “marks the comeback of least squares,” Wold declared.

What’s behind this so-called predictor specification? Basically, Wold decided to leave causality aside and asked himself what could be achieved with a predictive relation. The joint multivariate distribution assumptions typically used in Maximum Likelihood were also to be left aside. The only thing that Wold wanted to take into account was the systematic part of a predictive relation expressed as the corresponding conditional expectation. For most of us this might seem trivial, but for Wold it was the theoretical mechanism that allowed him to hold on to Least Squares and have a convincing epistemological argument that justified the use of OLS. In this way, by requiring just mild supplementary assumptions, Wold found an escape route to use OLS regression for estimating coefficients that are consistent in the large-sample sense. Moreover, he extended the notion of predictor specification to models with multiple relations among blocks of variables, which in turn would lead to PLS.