The Saga of PLS

by Gaston Sanchez

The Chemometrician

Meanwhile, in the late 1960s and early 1970s, another type of metrics discipline was experiencing revolutionary changes in Scandinavia and Northern Europe. In this case, the field was not part of the socio-economic sciences but of the life and natural sciences, specifically Chemistry.

With the arrival of computers during the 1960s, and the increasing appearance of new electronic measurement devices, the field of Chemistry was irreversibly touched by these technological changes. Consequently, a transition from “wet” chemistry to a more instrument-based “electronic” chemistry gradually took place. One effect of the new electronic devices was the production of more data than had been commonly available before. So with the new instruments also came the need for analytical tools to crunch the rising tide of emerging data.

Traditional mathematical and statistical methods soon proved insufficient for the new types of data. One of the main characteristics of these data was the presence of many more variables than observations. Among the leading figures recognizing this change, and taking advantage of it, was Herman’s son Svante Wold. Named after his grandfather Svante Arrhenius, the famous Swedish chemist (founder of electrochemistry and winner of the Nobel Prize in Chemistry in 1903), Svante Wold decided to follow in Arrhenius’s footsteps by studying Chemistry. He was also interested in computers, and he was a well-versed programmer, sometimes helping his father implement several of his least squares-based algorithms.

The Birth of Chemometrics

In 1971, Svante Wold, as a young professor at Umea University, Sweden, invented the word chemometrics for a grant application: “Chemometrics, the art of extracting chemically relevant information from data produced in chemical experiments.” Svante, however, was not the only one concerned with extracting relevant information from chemical data. In the early 1970s, a couple of publications about pattern recognition of chemical data by Bruce Kowalski and Bender appeared in the Journal of the American Chemical Society. “I read and reread and reread the two articles on pattern recognition”, Svante recalled. To him, those publications were more than simple papers, “they were a revelation.” Svante quickly realized something equally important: “suddenly, I was not alone in my feelings about the state of chemistry.” The same ideas of analyzing multidimensional chemical data, through the application of statistics and applied math, via computers, were shared by other chemists on the west coast of the USA. Svante “was thrilled” by the whole situation.

Invited by the renowned American statisticians George Box and Bill Hunter, Svante spent the academic year of 1973-1974 at the Statistics Department of the University of Wisconsin in Madison. In mid-October 1973, Box asked Svante to go to the University of Arizona, in Tucson, for an Office of Naval Research symposium on chemistry and computers. Coincidentally, at that meeting, the moment Svante had long awaited finally arrived. He had the opportunity to meet the young chemistry professor Bruce Kowalski, whose papers had impacted him so deeply.

At the end of the symposium, Svante had the chance to talk to Bruce about their common interests. One of Bruce’s initial questions to Wold was: “What are you doing?” to which Svante answered “chemometrics.” Surprised by Wold’s answer, Kowalski continued: “What is that?” Svante patiently explained the meaning of the term, asserting that it was basically the same thing Bruce was doing, but “much less advanced.” The word “chemometrics” was precisely the term Kowalski was looking for. Positively impressed by the young Swedish chemometrician, Bruce invited Wold to Seattle for a month the following year (1974) to discuss in more detail what they were doing.

Between May and June 1974, Svante went to visit Bruce in Seattle. There, Wold had the opportunity to show his (at the time new) pattern recognition methodology, “soft independent modeling of class analogy”, most commonly known as SIMCA. Given the good performance of the method, it was decided to include it in ARTHUR, the package that Bruce and his research group were developing during that period.

Few disciplines can name the exact day they were born, and Chemometrics is one of them. Just before Svante’s return to Umea, Svante Wold and Kowalski’s group gathered on the evening of June 10, 1974, in a small Tex-Mex restaurant in Seattle. “After a number of Tequila shots”, Wold wrote, “we decided to form The Chemometrics Society” (which soon became “The International Chemometrics Society”).

The Birth of PLS Regression

The early interests of chemometrics were mainly pattern recognition and classification. Borrowing techniques from statistics (like discriminant analysis), and other computational approaches for pattern recognition, the new chemometricians soon discovered that such approaches were limited for the type of data they had to analyze. As Svante recognizes, “these approaches were still handicapped by the dogma that data had to have substantially more observations than variables.” This forced analysts to devise their own dimension reduction approaches and new methods of pattern recognition. Then, more predictive problems came to the table.

By the late 1970s, as Herman’s PLS framework became more mature, Svante began to show some interest in the idea of latent variables, and the opportunities they seemed to open for the analysis of high dimensional chemical data. “The concept of latent variable,” Svante said, “appealed to me as it is very similar to the ‘effects’ we have in organic chemistry.” It didn’t take long for him to be “greatly attracted by the PLS philosophy” of his father. The problem, however, was on the implementation side, which Svante recognized “impressed me less.” Svante’s concern had to do with an over-fitting tendency of his father’s method, mainly due to the large number of variables involved. Initially, he did not take PLS “very seriously.” Thanks to his father’s “enthusiasm and patient explanations of how latent variables stabilized the situation,” Svante’s reluctance slowly faded until he finally became convinced that the PLS methodology “was a well working approach with great potential.”

Having grasped the basics of Herman’s PLS framework, Svante started to work with the simplest PLS model (two-blocks) in the beginning of the 1980s. To his surprise, he realized that the model “could be looked on as a regression model in latent variables with a simple geometrical interpretation.” This discovery made Svante “enormously happy,” and he immediately decided to carry out a series of simulations to see “what happened with many variables.” The results confirmed that many descriptor (explanatory) variables could be handled with little or no over-fit (provided cross-validation was used to check the number of components in the models). This confirmation showed him “that Herman was correct.”

About 1979 or 1980, Svante met Norwegian chemist Harald Martens in Oslo. At the time, Harald was working with predictive models, using Principal Components Regression (PCR), which did not always provide good results. After some exchange of ideas, Svante “managed to convince Martens that PLS would be much better for his problems.” Together, they “started to apply Herman’s 2-block PLS,” Wold wrote. But they didn’t achieve much. Once again, the first results were not what they expected. “We had a big scientific crisis,” Svante declared. Something was not working, but what? Svante and Harald shared their procedures, and they worked for days to find the cause of the problem. “Harald and I talked on the phone for hours everyday between our programming.” The main issue was related to the performance of the extracted PLS components: their models did not work beyond the first component. Cleverly, Harald realized that the problem had to do with the way the latent variables were obtained. Step by step they figured things out and “discovered the fantastic properties of the 2-block PLS approach.” They had to make a couple of adjustments to Herman’s algorithm for the method to work, but it proved very successful and promising. “We had a wonderful time,” Wold remembered. “Those years between 1981 and 1984,” he said, “were among the happiest, scientifically, in our professional lives.”

Svante presented a first version of the new PLS regression method at a conference on Data Analysis in Food Research arranged outside Oslo in 1982 by Harald and his boss Russwurm. Formalizing the procedure, Svante, Harald, and Herman proposed the new technique for regression analysis based on Partial Least Squares in 1983. They published the first papers on PLS and multivariate calibration that same year, and the analytical chemistry community was informed soon after. During the mid-1980s, Svante, Harald, and other colleagues improved and modified the method, developing slightly different versions of PLS regression. The PLS regression framework began to prove itself very practical, useful, and valuable. In an industry that needed analytical tools capable of dealing with multicollinearity, missing values, and large data sets, PLS Regression fulfilled all those needs.

Like Herman Wold in econometrics, his son Svante Wold also became a pioneer and leading figure in his field of expertise: chemometrics. Unlike his father’s, however, Svante’s trajectory would take a different path. Svante Wold and colleagues adapted and reshaped Herman’s ideas. Stripping away the more theoretical issues, they focused on the practical and computational aspects. Their motivation was more pragmatic and driven by the type of data challenges they were dealing with. While Herman Wold’s developments were always framed within an econometrics model-building tradition, and viewed through the lens of the philosophy of science, the approach of Svante and his collaborators was more influenced by industry. There was not much room for philosophical or causal considerations.

The Eclipse of Herman’s PLS

By the time of Herman’s retirement, which coincided with his last years, the body of knowledge about Partial Least Squares had grown considerably. By 1989, three years before his death, Herman had produced around 20 publications, as well as a couple of dozen reports and working papers about his PLS methods. Also in 1989, the book Latent Variable Path Modeling with Partial Least Squares by Jan-Bernd Lohmöller was published, and the software LVPLS was available. PLS, in its Path Modeling version, was already ripe, and it seemed to be only a matter of time before it took off.

PLS did take off, though not in the father’s version but in the son’s regression framework. At age 83, Herman passed away on February 16th, 1992 in Stockholm. He had left a tremendous legacy, and he had already passed the torch to several of his students and collaborators. Unfortunately, the most promising figure, Lohmöller, also died in the early 1990s. Orphaned and abandoned to its own luck, the so-called “PLS soft modeling on path models with latent variables” started a slow but constant decline. There was nothing to stop PLS Regression from eclipsing its older sibling, PLS Path Modeling.

By the early 1990s, PLS Regression was at its summit. Svante Wold was promoting it within chemical industries all over the world. Not only that, Tormod Naes and Harald Martens were also among the prolific authors, proposing adaptations and expanding the method in new directions. Software was not an issue: there were SIMCA-P (now Umetrics) and Unscrambler. SAS would later add its PLS-R procedure. Consequently, PLS Regression soon attracted the attention of other experts in the field, expanding to the oil industry, food science, cosmetics, and other related subfields of Chemistry.

In 1992, French practitioners and researchers started to have their first contact with the PLS regression methods. As part of his consulting activities, Svante was spreading his methods and software throughout the European chemical industry. In France, he consulted for L’Oreal, introducing his tools and software. Before long, other companies were catching up with this method. One of those companies was Rhone-Poulenc. At that time, the head of its data analysis research department, Jean-Pierre Gauchy, asked statistician and professor Michel Tenenhaus to explain the PLS methods and train the company’s researchers. Michel, a French business school professor and also a consultant for industry, accepted Jean-Pierre’s request and started to study regression via PLS. Nothing could have prepared Michel for the revolution of the following years.
