Submitted by egdaylight on

## Dated:

The topic of `Category Mistakes,' which I have discussed in my four previous blog posts is not limited to digital systems. Engineers in all areas use models and they, too, can potentially fall into the trap of mistaking the model for reality.

Dave Parnas told me in August 2016 that one of his favorite examples in this regard is `Ohm's Law,' which states that V = I*R. In reality, the ratio between current (I) and voltage (V) varies, i.e., the relationship is not actually linear. Engineers can use this law to analyze a circuit and verify that it has the desired behavior only to find that the circuit fails in certain conditions. Parnas gives the following example:

The circuit may overheat and the resistance (R) will change. Ohm’s Law can be made an accurate description of the device by adding conditions, e.g. V-I*R < Delta if a < I < b etc. The result will be much more complex but it will give conditions under which the model’s conditions are accurate. Even this is a lie as it makes assumptions about ambient temperature and the aging of the components that are not stated.

Parnas's main point is that engineers must be aware of the limitations of models as descriptions. More on Parnas's views can be found in the first panel discussion held at ETH Zurich in November 2010. Parnas concludes by noting that this awareness is completely missing in the UML (Unified Modeling Language) and MDE (Model-Driven Engineering) literature. Can the same be said about the programming language literature?

Category Mistakes are also typically found in physics. A colleague in computer science recently explained to me that several publications contain statements of the following kind:

Goedel proved that time travel is possible.

What Goedel did prove is that time travel is in principle possible according to his mathematical model (i.e., with his chosen equations). Of course, one could argue that physicists frequently choose to craft their sentences with brevity in mind and that they *are* well aware of the fact that they are modeling in the first place. This is precisely the kind of response I received from a POPL referee with regard to my rejected paper, entitled: `Category Mistakes in Computer Science at Large.' When I give this kind of response to my colleague in connection with the previous statement about Goedel, he insists that some physicists really do believe that Goedel proved that time travel is possible. As I have argued in my previous blog posts, I believe the same can be said about computer scientists.

Similar remarks can be made about several other disciplines, including linguistics and mathematics. Concerning the latter, here is what V.I. Arnold said in 1997:

The mathematical technique of modeling consists of [...]

speaking about your deductive model in such a way as if it coincided with reality. The fact that this path, which is obviously incorrect from the point of view of natural science, often leads to useful results in physics is called “the inconceivable effectiveness of mathematics in natural sciences” (or “the Wigner principle”). [my emphasis]

Likewise, researchers in computer science actually believe the following statement:

“full formal verification, the end result of which is a proof that the code will always behave as it should. Such an approach is being increasingly viewed as viable.” — citing Michael Hicks

So Hicks says we can have a mathematical proof about the behavior of an engineered system. Not so. The best we can do is prove mathematical properties about *some* mathematical model of *the* running system. And doing so can, indeed, have great practical value. Moreover, I advocate — and I also believe this is happening (and always has been happening) in engineering — using *multiple* (and complementary) semantic models for the *same* digital technology; that is, for the same C computer program. Doing that will allow us, engineers, to have some more confidence that the code behaves “as it should.”

**1. The POPL Gatekeeper**

It did not come as a surprise to me when I received an email on October 3rd, 2016, explaining that my POPL paper `Category Mistakes in Computer Science at Large' was officially rejected. Nor do I in any way wish to ridicule the peer review process, which I consider to be professional and honest. I firmly believe the reviewer did a very good job in:

- Justifiably preventing my unorthodox analysis from entering the POPL'17 proceedings.
- Protecting
*me*from making a fool of myself in an auditorium full of programming language experts.

That said, I still think that the POPL referee did not fully comprehend my analysis (which, again, does not mean that s/he should have accepted my paper for publication). It should also be remarked that there was a second reviewer who rejected my paper too. His or her comments were very brief and I shall therefore postpone discussing them. I will disseminate my rejected POPL paper along with the two reviews in the near future. Why not? Seeking a dialogue between different groups of scholars can lead to *conceptual clarity*. Here's a time line of events:

- In the first week of July 2016, I submitted my POPL paper (to be disseminated later).
- On September 14th, 2016, I received two peer reviews (to be disseminated later).
- On the same day I wrote and submitted a short rebuttal (presented below).
- On October 3rd, 2016, I received the official rejection of my POPL paper, along with a reply (presented below) from the first reviewer pertaining to my rebuttal.

**My Rebuttal (September 14th, 2016)**

The reviews are very good and informative.

I am willing to believe the first reviewer when he says that my findings are already well understood in the POPL community. It should be remarked though that the CACM community, by contrast, fundamentally disagrees with my "category mistakes" and thus also with the reviewer's views. What to do about this?

Moreover, multiple POPL members still tell me that it *is* possible to fully verify a digital *system*, even after having studied my work in detail. So the reviewer's claim about the POPL community does not mix well with my daily experience. Please tell me what I can do about this? Should I abandon this research topic?

I do take gentle issue with the way the first reviewer frames my works: I am not presenting "anecdotes" nor am I just citing a few papers. Clearly my reference to Parnas and Chaudhuri backs up *my* story, not the reviewer's.

**Reply by Reviewer #1 (October 3rd, 2016)**

I'm not sure why CACM reviewers would ignore the difference between real-world systems and their mathematical models. I don't actually see that mistake in the quoted reviews. The standard convention is that a reference to a programming language refers to the abstract mathematical object, which is unproblematic, since today it is routine to define full-fledged languages unambiguously.

I don't think the Parnas-Chaudhuri exchange is problematic, because Parnas is not a member of the POPL community and likely ran into similar problems interpreting papers from it.

**My analysis (October 4th, 2016)**

Four quick points:

1. The reviewer is quite right in stressing that “today it is routine to define full-fledged languages unambiguously” and perhaps I miss-portrayed (albeit in just one or two sentences) the accomplishments of the POPL community in my rejected paper.

2. I'm afraid reviewers of the CACM *have* applied faulty reasoning and I also believe I have illustrated *precisely that* in my POPL paper (but that's a topic I will address another day).

3. Reference to a programming language as an abstract mathematical object is only “unproblematic” if we keep in mind that it is *a* mathematical model of *the* technology under scrutiny. That's why I explicitly distinguish between a mathematical programming language and a computer programming language. They are categorically distinct. Furthermore, assuming by convention that there is a 1-to-1 relation between a computer program and a mathematical program is fine. But using that premise to project results from, say, computability theory onto engineering is often wrong and misleading. (See my previous four posts for more information, starting with the first one: here. I think my post about Michael Hicks makes this point clearly.)

4. The reviewer sidesteps my claim that “multiple POPL members still tell me that it *is* possible to fully verify a digital *system*, even after having studied my work in detail.” I am willing to bet that the reviewer has no clue what I am talking about; i.e., that s\he has not understood my paper very well after all. Of course, that is first and foremost *my* mistake. I am, after all, still struggling to find the right words and tone to get my message across to the POPL community.** **

But what I want to discuss here and now is the implicit statement that Dave Parnas and I are “non members of the POPL community” (which is OK) and **therefore*** we have misinterpreted* — and continue to misinterpret — the work of Chaudhuri and other POPL members (which is not OK). Talking about the elite of computer science! From a sociological angle, that's a rather impressive remark. It fits right in the 2004 book *Mechanizing Proof: Computing, Risk, and Trust*, written by the sociologist Donald MacKenzie. Instead, the POPL community could welcome the scrutiny of Parnas and like-minded engineers in order to prevent two big communities (software engineering and computer science) from drifting further apart.

In retrospect, surely Chaudhuri et al. grasp the difference between a computer program and a mathematical program. (After all, who would not be able to make such a simple distinction?) Moreover, based on the extensive feedback from the first reviewer (not shown in the present blog post), I am inclined to re-phrase several sentences of my rejected POPL paper. I will only do so in future writings and intentionally present my original and imperfect narrative in the sequel.

That all being said, and as you can judge for yourself below, it is Dave Parnas who seeks conceptual clarity and who *does understand* what the POPL community is doing. Parnas is asking Chaudhuri et al. to be more precise in the way they formulate their research findings and not to oversell their mathematical results. It is Parnas who wants to differentiate between the “computer program” and its mathematical model(s) while Chaudhuri et al. apparently do not want to abide with him.

As promised, now follows the introduction of my unorthodox and rejected POPL paper.** **

**2. The Introduction of my Rejected POPL Paper**

In his 1985 paper `The limits of correctness' [39], Brian Cantwell Smith discussed an important issue concerning program verification. He pointed out that a program always deals with the world through a model, it cannot deal with the world directly. There always is a layer of interpretation in place. Therefore, if we define a correct program as a program that does what we intended it to do, as opposed to what we formally specified it to do, proving a program correct is intrinsically very difficult. (The word “we” refers to my former student, Rosa Sterkenburg, and myself. Rosa is a co-author of the present introduction but carries no responsibility for the rest of this article.)

To illustrate the difficulty underlying Smith's *world-model* relation, assume we have a mathematical model of the world in which all apples are modeled as green and all strawberries as red. With this model it is quite simple to distinguish an apple from a strawberry, simply use the color to differentiate between the two kinds of fruit. If we thus write a program that can discriminate between red and green, then, at least with respect to our model, we can mathematically prove that our program can indeed distinguish between apples and strawberries. However, in the real world there exist red apples as well. So our model is not perfect; our proven correct program will tell us a red apple is a strawberry. As Smith put it:

Just because a program is ‘proven correct’, you cannot be sure that it will do what you intend. [39, p.18]

Smith argued that it is necessary for our model of the world, or *any* mathematical model for that matter, to be incorrect in some ways. For, if a model were to capture everything, it would be too complex to work with [39, p.20-21]. In the apple and strawberry example, the model need not capture the molecular structure of the fruits; doing so would presumably over-complicate matters for the application at hand. We need models to abstract away irrelevant — and, unfortunately, also relevant — aspects of the world in order to make an industrial problem mathematically tractable.

More important for the present article is the similar, yet slightly different, *programming-model* relation; that is, the relation between a programming language and the computer programs written in that language on the one hand and their mathematical counterparts on the other hand. This is a relation that Smith did not address in his paper. When proving a property about a computer program, such as deadlock prevention, some mathematical model of that program is used instead of the program itself. As a result, our formal proof about the program tells us something about it's model, and only indirectly something about the program. Here, too, we have the problem that our model is not perfect for it relies on one or more key abstractions from what is specified in the programming language's manual. Depending on what is written in that manual and how we choose to model the language at hand, two possible — and very different — key abstractions could be:

- abstraction A_{I}^{pos}, which allows for the representation of
*arbitrary large positive integers*, and - abstraction A_{R}, which allows for the representation of
*infinite precision real numbers*.

(I could equally well consider some related issues — such as (1) an unbounded number of memory locations that can be addressed, and (2) an unbounded length of programs that can be compiled — but choose not to do so here.)

Moreover, the language manual, as David Harel puts it nicely in his book *The Science of Computing*, contains “a wealth of information concerning both syntax and semantics, but the semantical information is usually not sufficient for the user to figure out exactly what will happen in each and every syntactically legal program” [23, p.52-53]. Defining a formal syntax of a programming language is not considered a major problem anymore since the advent of the Backus-Naur Form (BNF) notation, but finding adequate formal semantics has remained a vexing research topic from the early 1960s up till this day.

In the present exposition a “mathematical model” of a program (or programming language) refers to both the formal syntax and a formal semantics of the program (or programming language) at hand, although the formal syntax will often go unmentioned. It should be noted that in model theory, by contrast, numerals and other expressions (cf. syntax) are *modeled by* the natural numbers and other mathematical objects (cf. semantics).

To recapitulate, a mathematical model of a computer program is imperfect in one or more ways for its semantics relies on one or more key abstractions. As a result, and this is an important point for the rest of the paper, *showing that something cannot be done according to a model does not necessarily imply that it cannot be done in practice*. Of course, one's formal semantics may become the standard definition of the programming language under study, but that does not invalidate the italicized statement made in the previous sentence.

**The Chaudhuri-Parnas Dialogue**

To already illustrate some of the issues just raised, we briefly turn to two recent writings of Swarat Chaudhuri [5,6]. Chaudhuri has studied properties of programs that are necessary conditions for a correct program to possess in his mathematical framework. Starting with his 2012 article `Continuity and Robustness of Programs' [5], Chaudhuri and his co-authors, Gulwani and Lublinerman, discuss a way to prove mathematical continuity of programs. A continuous program is not per se correct, but a correct program needs to be robust, and therefore, according to the authors it needs to be continuous.

Chaudhuri, Gulwani, and Lublinerman do not address the world-model relation nor the programming-model relation in their 2012 article. While the first relation does not seem very relevant to us either with regard to their article, the second relation is, we believe, significant. To put it candidly, Chaudhuri et al. do not seem to be aware that they are not proving something about their programs, but that they are proving something about some model of their programs. This is very clearly pointed out by David Parnas in a reaction to their article. Parnas says:

Rather than verify properties of an actual program, it examined models of programs. Models often have properties real mechanisms do not have, and it is possible to verify the correctness of a model of a program even if the actual program will fail. [36]

Parnas specifically describes what he thinks is problematic about the mathematical assumptions underlying Chaudhuri et al.'s model:

The article ignored the problem by both declaring: ‘...our reals are infinite-precision’ and not specifying upper and lower bounds for integers. [36]

Because of these abstractions, which include what we have called abstractions A_{R} and A_{I}^{pos}, Parnas concludes in a similar way to what Smith wrote in 1985. Proving something correct does not mean it will do what you want. “Some programs,” Parnas continues,

can be shown to be continuous by Chaudhuri’s method but will exhibit discontinuous behavior when executed. [36]

Chaudhuri et al. reply to this critique as follows:

From a purely mathematical perspective, any function between discrete spaces is continuous, so all

computerprograms are continuous. [36, my emphasis]

This reaction suggests that Chaudhuri et al. are not aware that considering a computer program as a mathematical function is a modeling step. A computer program is — and this will become our main conceptual point — categorically different from a mathematical program, even if the latter relies solely on finite abstractions. That is, a C program residing in a laptop (i.e., a computer program) is categorically distinct from its mathematical model, even if the semantics is a “real semantics,” i.e., a “detailed and cumbersome low-level model” of C. (The words just cited come from Harvey Tuch et al. [43] and we shall see in Section 5 whether Tuch et al. agree with our categorical distinction or whether they reason like Chaudhuri et al.)

In a later paper entitled `Consistency Analysis of Decision-Making Programs' [6], Chaudhuri, Farzan, and Kincaid do, however, address some of the issues raised by Smith in 1985. They write:

Applications in many areas of computing make discrete decisions under uncertainty, for reasons such as limited numerical precision in calculations and errors in sensor-derived inputs. [6, p.555]

Since the understanding of nontrivial worlds is often partial, we do not expect our axioms to be complete in a mathematical sense. [6, p.557]

These words are related to the model-world relation, which was not addressed in Chaudhuri's paper about continuity. With this remark Chaudhuri acknowledges that mathematical modeling of the real world is needed to get things done on a computer. Chaudhuri also indicates that this understanding usually is not perfect, which is close to Smith’s point that one can never aspire having a perfect mathematical model. That said, Chaudhuri does not address the stringent programming-model relation and, hence, does not respond to Parnas's criticism.

Smith, Parnas, and some other software scholars consistently and sometimes explicitly distinguish between *concrete physical objects* such as a laptop, *technical artefacts* such as the programming language C, and *mathematical models* such as a finite state machine and a universal Turing machine. According to them, both a laptop and a finite state machine are finite objects, but the former is physical while the latter is mathematical. Likewise, equating a laptop with a universal Turing machine is problematic, not primarily because the former is finite and the latter is infinite, but because the former moves when you push it and smells when you burn it while the latter can neither be displaced nor destroyed.

**References**

[5] S. Chaudhuri, S. Gulwani, and R. Lublinerman. Continuity & robustness of programs. Communications of the ACM, 55(8):107–115, 2012.

[6] S. Chaudhuri, A. Farzan, and Z. Kincaid. Consistency analysis of decision-making programs. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 555–567, 2014.

[23] D. Harel. *The Science of Computing: Exploring the Nature and Power of Algorithms*. Addison-Wesley, 1987, 1989.

[36] D. Parnas. On Proving Continuity of Programs. Letter to the Editor of the Communications of the ACM, 55(11):9, November 2012.

[39] B. Smith. The limits of correctness. ACM SIGCAS Computers and Society, 14,15:18–26, January 1985.

[43] H. Tuch, G. Klein, and M. Norrish. Types, Bytes, and Separation Logic. In Principles of Programming Languages, 2007.