Summary
An unprovable error is one that no amount of testing can ever be assured of revealing. With computers becoming more networked and security becoming more important, should we be thinking about how our programming tools can avoid unprovable errors?
Advertisement
Consider the following C program:
#include <assert.h>
int x = 0;
int incr() { ++x; return 0; }
int decr() { assert (x > 0); --x; return 0; }
int main()
{
incr() + decr();
return 0;
}
Whether this program runs or trips the assert depends on the order in which the implementation calls incr and decr. If the implementation happens to call incr first, the program will succeed; otherwise it will fail.
Consider the problem of finding this error if all you have is a C implementation that always evaluates the left-hand side of an addition first. The only way that testing will reveal the problem is if you run the program on a different implementation.
This is a small example of a larger problem: Some errors simply cannot be detected by testing because language definitions and implementations give enough leeway that it might always be possible to get the "right" result by coincidence. When a program with such an error is run on another implementation, the error might become serious.
I propose to call such errors unprovable errors. This coinage is by analogy with Goedel's Theorem, which says that any consistent logical system will have statements that are true but cannot be proved within the system. Analogously, it is possible for any programming language (at least any I know) to allow programs with errors that no amount of testing can be assured of revealing.
Pragmatically, some languages make it much harder to write such programs than others. A useful property for such a language is to assure that programs will run exactly the same way on all implementations, and to assure also that the implementation will detect a program that does something that the language definition does not allow. If you like, anything that language definitions refer to as "undefined behavior" is an example of an unprovable error.
I would like readers to wonder out loud along with me about what it might be possible to do about designs of future programming languages and tools to make unprovable errors less common.
I would like a compiler switch that causes all "undefined behavior" events to result in something drastic - probably just ending the application, but perhaps with some intelligent logging. Since the language definition permits this, it's still compliant, and I get to see the bugs that much sooner.
This won't identify all unprovable errors, but I think it would catch a fair number of them.
> I would like a compiler switch that causes all "undefined > behavior" events to result in something drastic - > probably just ending the application, but perhaps with > some intelligent logging. Since the language definition > permits this, it's still compliant, and I get to see the > bugs that much sooner.
> This won't identify all unprovable errors, but I think it > would catch a fair number of them.
I wonder. Look at part of my example again:
incr() + decr();
According to the C and C++ standards, this expression yields undefined behavior, essentially because writing any code that depends on order of evaluation yields undefined behavior. However, it is impossible to detect this problem without seeing the definitions of incr and decr.
Now, suppose this expression is compiled separately from the definitions of incr and decr. The compiler cannot possibly tell that incr and decr happen to make this expression's behavior undefined, because it may be that I haven't even written incr and decr yet. So if the error is to be detected at all, it will have to be in the linker.
On the other hand, if the language definition were changed so that the left-hand side of + is always evaluated before the right-hand side, this particular problem would go away.
This is an example of what I mean when I talk about "what it might be possible to do about designs of future programming languages."
> On the other hand, if the language definition were changed > so that the left-hand side of + is always evaluated before > the right-hand side, this particular problem would go > away. > > This is an example of what I mean when I talk about "what > it might be possible to do about designs of future > programming languages."
Is the problem is already solved? If I remember correctly, Java is left-first. Are there any languages other than C and C++ which leave order of evaluation undefined?
But that is just 'order of evaluation ambiguity.' It would be neat to build up a list of other unprovable errors.
> Is the problem is already solved? If I remember > correctly, Java is left-first. Are there any languages > other than C and C++ which leave order of evaluation > undefined?
Pascal, PL/I, and Fortran 77 leave order of evaluation undefined. I don't know about Fortran 90.
> But that is just 'order of evaluation ambiguity.' It > would be neat to build up a list of other unprovable > errors.
Indeed.
For example, in Java, I believe that programs that do equality comparisons between strings may well contain unprovable errors. As another example, I think it's hard to write concurrent programs that do not contain unprovable errors.
Simple: Next time you talk to those people writing the C or C++ specs tell them to fix the problem by declaring how an expression should be evaluated. When the compiler builders finally catch up then the case in question should be a bona-fide error.
Seriously, for example in C++ there are so many things that compilers don't implement properly that you don't need to be a purist to figure out that there is no way to guarantee that *any* code will run properly under any combination of {platform/compiler/external libraries}.
The only possible/sensible solution is to build an extensive test suite in addtiion to the build scripts and then run all the tests after building. For example the Perl distribution does this wonderfully, even for third party libraries. I guess that was because there was a very small number of people involved in the standards commision when Perl was designed :-)
> <p>I propose to call such errors <strong>unprovable > errors</strong>. This coinage is by analogy with Goedel's > Theorem, which says that any consistent logical system > will have statements that are true but cannot be proved > within the system. Analogously, it is possible for any > programming language (at least any I know) to allow > programs with errors that no amount of testing can be > assured of revealing.
Is this strictly true in the case of a language? For example, and I'm obviously making this up, couldn't one construct a pre-compiler that "knew" all about the various undefined behaviors and then do the transpositions to arrive at all possible interpretations?
> <p>Pragmatically, some languages make it much harder to > write such programs than others. A useful property for > such a language is to assure that programs will run > exactly the same way on all implementations, and to assure > also that the implementation will detect a program that > does something that the language definition does not > allow. If you like, anything that language definitions > refer to as "undefined behavior" is an example > of an unprovable error.</p>
It may be the case that there would be some value in understanding specifically why languages have undefined behaviors. This might reveal an underlying commonality amongst unprovable errors.
There are other issues that make things "unprovable" more in an algorithmic sense. I'm presuming the discussion is not related to those?
> Simple: Next time you talk to those people writing the C > or C++ specs tell them to fix the problem by declaring how > an expression should be evaluated. When the compiler > builders finally catch up then the case in question should > be a bona-fide error.
> I wonder. Look at part of my example again: > > incr() + decr();
So called effect systems (related to type systems, but concern side-effects rather than type structure) can infer and prove important properties for code optimization. For example, assuming the language would guarantee a specific order of evaluation of sub-expressions (like Java does), an effect system might be able to prove that it can change the order of evaluation of sub-expressions without changing external behavior (semantics). I think that in the future we will have (more) languages that make (more) effect information visible to the programmer like static type systems do for types. The type systems of some well known (e.g. Haskell) languages already effectively provide some effect information along with types.
In brief, the way to handle unprovable errors is to remove the necessity to prove them in the first place.
With regards to language usage, that means two things:
1. From the developer's side, do not write code that is dependent on the compiler you use, especially when it's syntax related. Depending on a compiler to evaluate in left to right order (as in your example) is not good coding. 2. From the language creator's side, remove the possibities that there could be multiple ways of implementing a language construct. In your example, allowing different compilers to implement the order of execution in different ways is a fault of the language definition.
The end result is less flexibility, but I've found that greater flexibility leads to a greater chance of errors. The right balance between strictness and flexibility is required to remove ambiguities that allow for "unprovable errors."
It is facinating to see the parallel is to early 20th century math, i.e.: the Russell/Whitehead approach of making mathematics strictly axiomatic, "firming up" the base on which modern mathematics is built. Godel (as cited rather appropriately, and others more emphatically) established the existence of "unreachable" propositions as intrinsic to any sufficiently expressive system. Chicken Little goes wild as the sky crumbles! The circle will never be squared!
But in the end, it was seen that axiomatic was still a "good thing" and most realized mathematics is indeed secure despite not being "absolute". (At the risk of getting political, it is likewise desirable to "do the right thing" even in a relativistic view of social order without strict absolutes - no, just don't go there ;-)
My point: given that unprovables are strictly unavoidable (thread-related unprovables and runtime linking have already been cited, as well as unprovables intentially left in to provide room for implementers to optimize across a wide range of platforms) that we should embrace (to some extent) while simultaneously minimizing the obvious downside of the problem.
Any programming endeavor is a balance of objectives, many in contention with each other. By this, I'm all things, including correctness, reliability, performance (many faces), cost of coding, maintainability, and even "beauty" (I'm kinda old school that way).
A straight jacket that completely vanquishes things like unprovable errors would be embracing one facet, albeit extremely important, at the cost of many others.
Nonetheless, I do think adding the term "unprovable errors" is a good and useful addition to the way we talking about and deal with code quality control. I just hope we can keep in mind that it is not simply a problem to be solved but part of richness of our field.
> If the particular code cannot be demonstrated to behave > in the way you want, isn't the solution simply to > replace it with different code that does?
Of course it is--provided that you can somehow recognize that the code has something wrong with it.
But unless you are saying that every piece of code should always be accomanied by a correctness proof, such demonstrations are impossible in practice. No amount of inspection can guarantee the absence of errors, any more than testing can guarantee it. And correctness proofs for substantial programs do not seem to be possible on a large scale right now, nice as they might be.
> From the developer's side, do not write code that is > dependent on the compiler you use, especially when it's > syntax related. Depending on a compiler to evaluate in > left to right order (as in your example) is not good > coding.
Your advice is, of course, quite sound. My experience, however, is that it is not possible to be able to count on programmers following it consistently in practice. The number of things one has to remember is just too large. I would like to be able to use mechanical means to make that number smaller.
Flat View: This topic has 19 replies
on 2 pages
[
12
|
»
]