Logical Positivism & The Scientific Method in Genetic Algorithmics
The genetic algorithm owes its form to biomimicry, not derivation from first principles. So, unlike the workings of conventional optimization algorithms, which are typically apparent from the underlying mathematical derivations, the workings of the genetic algorithm require elucidation. Attempts to explain how genetic algorithms work can be divided in two: those based to a lesser or greater extent on the scientific method, and those that reject the scientific method in favor of logical positivism.
Positivism is the idea that the only true knowledge is knowledge that can be verified. Logical positivism is the idea that verification can be carried out through the application of formal logic to empirical observations. Logical Positivism originated in the first quarter of the twentieth century and was explored and popularized by the Vienna Circle, a group of philosophers, scientists, and mathematicians that met regularly from 1924 to 1936 at the University of Vienna. Buoyed by rapid advances in the axiomatization of mathematics, logical positivists believed that like mathematics, all of science could be derived within a formal axiomatic system where the list of axioms consist of 1) empirical observations, a.k.a. protocol sentences and 2) the axioms of mathematics (e.g. ZFC). The movement to recast science in this mould gained currency in the first quarter of the twentieth century but subsequently came to be regarded as untenable. Writing in 1971, Werner Heisenberg offered this reflection about positivism in general:
The positivists have a simple solution: the world must be divided into that which we can say clearly and the rest, which we had better pass over in silence. But can any one conceive of a more pointless philosophy, seeing that what we can say clearly amounts to next to nothing? If we omitted all that is unclear we would probably be left with completely uninteresting and trivial tautologies. (Heisenberg 1971)
The non-universality of empirical observation was a particularly vexing problem for logical positivism. Given the verificationist foundations of this movement, only empirical observations made at a particular point in time and space were admissible as protocol sentences. Generalization to other points in time and space was not allowed. This made it hard to prove much of value. Proponents of logical positivism tied themselves in knots trying to address this and other critiques. By the mid twentieth century, the logical positivist movement had largely crumbled and the philosophy of science propounded by Karl Popper had gained traction.
While the universality of scientific theories makes logical verification (i.e. proof) difficult, if not impossible, that very universality, observed Popper (2007a, 2007b), makes disproof (i.e. falsification) a much easier path to pursue. All it takes is a single counterexample — that is, a single prediction not borne out — to disprove a theory as it currently stands. Of course a theory must lend itself to falsification in the first place, and Popper cautioned strongly against theories that employed what he called immunizing strategems to avoid falsification. Such theories (Freud’s theory of psychoanalysis for example)
…appeared to be able to explain practically everything that happened within the fields to which they referred. The study of any of them seemed to have the effect of an intellectual conversion or revelation, opening your eyes to a new truth hidden from those not yet initiated. Once your eyes were thus opened you saw confirming instances everywhere: the world was full of verifications of the theory. Whatever happened always confirmed it. Thus its truth appeared manifest and unbelievers were clearly people who did not want to see the manifest truth. (Popper, 2007a, p45)
Scientific theories, in contrast, are theories that take risks. By predicting weird phenomena (e.g. gravitational lensing in the case of Einstein’s theory of general relativity) they leave themselves open to refutation. “A theory which is not refutable by any conceivable event is non-scientific”, wrote Popper. “Irrefutability is not a virtue (as people often think) but a vice” (Popper, 2007a, p46).
Whether day-to-day science, with its tribalism and politicking, actually works as Popper described has been rightly disputed (Kuhn 2012, Latour 1987). However, it is hard to argue with the historical success of the scientific method, or with Popper’s simple and elegant use of contraposition to frame the normative practice of science:
Here is Popper in his own words:
Like many other philosophers, I am at times inclined to classify philosophers as belonging to two main groups — those with whom I disagree, and those who agree with me. I might call them the verificationist or the justificationist philosophers of knowledge or of belief, and the falsificationists or falibilists or critical philosophers of conjectural knowledge …
The members of the first group — the verificationists or justificationists — hold, roughly speaking, that whatever cannot be supported by positive reasons is unworthy of being believed, or even of being taken into serious consideration.
On the other hand, the members of the second group — the falsificationists or falibilists — say, roughly speaking, that what cannot (at present) in principle be overthrown by criticism is (at present) unworthy of being seriously considered; while what can in principle be so overthrown and yet resists all our critical efforts to do so may quite possibly be false, but is at any rate not unworthy of being seriously considered and perhaps even of being believed — though only tentatively. (Popper 2007a)
Popper famously claimed to have “killed positivism”. Were he alive today, he might be intrigued by the resurrection of positivism in at least one corner of Computer Science. Consider the following passage in Vose’s book, The Simple Genetic Algorithm:
My central purpose in writing this book is to provide an introduction to what is known about the theory of the Simple Genetic Algorithm. The rigor of mathematics is employed so as not to inadvertently repeat myths or recount folklore.
The decision to accept assertions only when proved is not without price. Few (certainly not I) have mastered enough mathematics to make its application trivial. Another cost is that one is forced to trade what is believed to be true for what can be verified. Consequently this account is silent on many questions of interest for lack of proof. (Vose 1999)
The “myths” and “folklore” mentioned above reference the Building Block Hypothesis, the idea that genetic algorithms work through the hierarchical composition of short chromosomal snippets that confer above average fitness. The inventor of the genetic algorithm, John Holland, hypothesized that this type of hierarchical composition was the reason for the genetic algorithm’s effectiveness on a range of combinatorial optimization problems. Unfortunately, the building block hypothesis and other hierarchical composition hypotheses (Watson 2006) have been propped up long after it has become clear that such hypotheses cannot be reconciled with a long list of theoretical and empirical anomalies.
The obstinacy of the hierarchical compositionalists when confronted with these anomalies resulted in an interesting transition within the theoretical evolutionary computation community. Hypotheses on the whole came to be regarded with wariness and there was a general hardening of resolve to, as Vose put it, “accept assertions only when proved.” For well over a decade now, foundational research on evolutionary computation has had a decidedly positivist bent. The logical positivism practiced by evolutionary computation theorists is a purer version of the logical positivism espoused by the Vienna Circle in that evolutionary computation theorists find no use for empirical observations, and rely exclusively on the axioms of mathematics. The body of work that goes by the name Runtime Analysis is but the latest in a line of positivist evolutionary computation literature stretching back to the 1990s.
Now perhaps this is just as well. Perhaps logical positivism was simply a movement in search of a niche, and perhaps the foundation of evolutionary computation is just that niche. Maybe. But frankly, there is no a priori reason why any of this should be so. I’ll grant that it is possible to verifiably explain the workings of a large number of computer algorithms — for example, the workings of the algorithms in Cormen et al’s classic, Introduction to Algorithms — however, these algorithms were constructed to have verifiable explanations. Explanatory verifiability, in other words, was a design consideration, not a coincidence. The same cannot be said for the genetic algorithm, which, as I said earlier, owes its form to biomimicry.
The fact of the matter is logical positivist approaches have not yielded answers to the kinds of questions that users of evolutionary algorithms care about. Rather, they have diverted research on the foundations of evolutionary computation toward the study of highly simplified — neutered, one might say — algorithms, for example, the (1+1)-EA, and trivial fitness functions, for example, onemax. In 1999, Vose wrote that his positivist account is “silent on many questions of interest for lack of proof.” Wikipedia’s embarrassing entry on Genetic Algorithms suggests that the situation today is no different.
One might ask why the scientific method — an approach that has acquitted itself splendidly in field after field — was so readily cast aside in favor of logical positivism? I think it’s because evolutionary computation being a subdiscipline of Computer Science, and Computer Science, contrary to its name, not really being a science. By this I mean that unlike the foundations of say Physics, Chemistry, or Biology, the foundations of Computer Science are purely mathematical — hypotheses play no part. So, while Computer Science has seen engineering revolutions aplenty, it has witnessed nothing similar to the transition in Physics from a Newtonian universe to an Einsteinian one, or the move in Chemistry from the phlogiston theory of combustion to Lavoisier’s oxygen based theory, or any of the other paradigm shifts described in Kuhn’s Structure of Scientific Revolutions (2012). Mathematics, of course, figures prominently in the foundations of the natural sciences, however it is always in support or within the rubric of some hypothesis. Theoretical Physicists, Chemists, and Biologists trained informally, if not formally, in the ways of the scientific method know how to evaluate and work with scientific hypotheses. The same cannot be said of lay mathematicians and theoretical computer scientists.
The field of Genetic Algorithms, which was founded on a hypothesis, is quite possibly the first genuinely scientific discipline within Computer Science. Unfortunately, the field has been a poor standard bearer for the use of the scientific method within Computer Science. Whereas a healthy science moves towards increasing levels of vulnerability, a.k.a. falsifiability, hierarchical compositionalists, when confronted with various anomalies, armored up by moving toward unfalsifiability. For example, as criticism of the building block hypothesis mounted, Holland put forth the building block thesis (Holland 2000), an unfalsifiable theory that essentially contends that hierarchical composition is at play in a genetic algorithm because, well, hierarchical composition is at play almost everywhere. When push comes to shove, the refrain “What else might recombination be doing?” is a well worn fallback. In a mature field of science, such maneuvering would immediately be called out for violating the safeguards that are part of the scientific method. In the fledgeling scientific field of evolutionary computation, however, this critique was not leveled. Instead, logical positivism took root and “hypothesis” became a dirty word.
The Missing Royal Road Function
The way forward for foundational genetic algorithms research lies not in doubling down on logical positivism or hierarchical compositionalism, but in scientific rigor which — to reiterate one of the main points of the previous section — is more than mere mathematical rigor. The cornerstone of scientific rigor is the submission of a falsifiable hypothesis based on weak assumptions for review and refutation. Scientific rigor must be continually maintained through a sustained, community-wide effort to make predictions and test them (e.g. the prediction of the Higgs Boson, and the use of the Large Hadron Collider to search for it). Anomalies must be spotted and called out. Resolving them should be an urgent matter. Ignoring them should be unacceptable.
Like all search algorithms, genetic algorithms perform optimization by exploiting some structural property of the fitness function to which it is applied. Therefore, any scientific hypothesis about the workings of genetic algorithms must, at a minimum, be comprised of two parts:
- Structural Assumptions: A specification of the structural properties of fitness functions that get exploited by genetic algorithms
- Exploitation Story: A description of how these structural properties are efficiently exploited
Hierarchical compositionalism assumes the existence of multitudes of modules (low order schemata) that individually confer above average fitness. Also assumed is the existence of hierarchical synergy amongst these modules; i.e. the existence of second level modules that confer a higher fitness on average than the combination of the fitness increments conferred by their constituent first level modules; likewise, the existence of third level modules that confer a higher fitness on average than the combination of the fitness increments conferred by their constituent second level modules; and so on. These are the structural assumptions of hierarchical composition hypotheses. Here is the exploitation story: a genetic algorithm preferentially selects individuals with first level modules, then efficiently mixes the first level modules to create individuals with second level modules, then efficiently mixes those individuals to create individuals with third level modules, and so on.
In previous work I have criticized hierarchical compositionalism on the grounds that its structural assumptions are much too strong. (Remember, we are talking about fitness functions arising in practice through the ad hoc representational choices of genetic algorithm users.) Here I want to raise a different problem confronting hierarchical compositionalist: their inability to procure a royal road function in support of hierarchical compositionalism.
The idea behind royal road functions is simple: Given a hypothesis about the workings of genetic algorithms, it should be possible to procure a fitness function such that the following criteria are satisfied:
- The structural assumptions of the hypothesis are met by the fitness function.
- The behavior of a genetic algorithm when applied to the fitness function clearly matches the behavior described in the exploitation story of the hypothesis.
- The genetic algorithm is observed to be more successful at optimization on the fitness function than naive alternate approaches — hill climbing, for example.
A fitness function that meets these criteria is called a royal road function. Trying to find such a function accords well with the scientific method outlined by Popper. Essentially, a hypothesis about the workings of genetic algorithms should predict the existence of one or more royal road functions. Finding such functions validates the prediction and lends credibility to the hypothesis.
A formal review of the literature on royal road functions should at a minimum cover Mitchell et al’s original work on Royal Road functions (Mitchell 1998), Holland’s hyperplane defined functions (Holland 2000), and Watson’s HIFF functions (Watson 2006). Such a review is beyond the scope of this post. Long story short: hierarchical compositionalists have tried mightily to find a fitness function that satisfies the criteria listed above — without success.
For a Royal Road function on which a genetic algorithm trounces simulated annealing, but not because it behaves as described in the building block hypothesis, see the longer version of this post: When Will Genetic Algorithms Outperform Local Search
Heisenberg, W. (1971). Positivism, Metaphysics and Religion. In Ruth Nanda Nanshen. Werner Heisenberg — Physics and Beyond — Encounters and Conversations. World Perspectives 42. Translator: Arnold J. Pomerans. New York: Harper and Row. p. 213.
Holland, J. H. (2000). Building blocks, cohort genetic algorithms, and hyperplane-defined functions. Evolutionary computation, 8(4), 373–391.
Kuhn, T. S. (2012). The structure of scientific revolutions. University of Chicago press.
Latour, B. (1987). Science in action: How to follow scientists and engineers through society. Harvard university press.
Mitchell, M. (1998). An introduction to genetic algorithms. MIT press.
Nagle, E. and Newman, J.R. (2001). Godel’s Proof, New York University Press.
Popper, K. (2007a). Conjectures and Refutations. Routledge.
Popper, K. (2007b). The Logic Of Scientific Discovery. Routledge.
Vose, M. (1999). The Simple Genetic Algorithm: Foundations and Theory. Cambridge, MA: MIT Press
Watson, R. A. (2006). Compositional evolution: the impact of sex, symbiosis and modularity on the gradualist framework of evolution. MIT Press.