Things Computers Can Never DoPhilip J. ErdelskyFirst Published in Dr. Dobb's Journal May 1987 |
Please e-mail comments, corrections and additions to the webmaster at pje@efgh.com.
Anyone who has witnessed the enormous improvements in computers in the last 40 years may get the impression that computers will eventually be able to solve every well-defined problem. Progress in language understanding and other forms of artificial intelligence has been disappointing, but human language is full of ambiguities, so that's not a well-defined problem. Chess, on the other hand, is very well defined. Although it was once considered the epitome of intelligent activity, computers can now play chess better than all but a few human players.
Some problems, although well defined, are too large to be solved in a reasonable time even on our largest computers. But surely, if a computer could be freed from all limitations on time and memory, couldn't it solve any well-defined problem?
The surprising answer to this question, which was known to mathematicians even before the first real computers were constructed, is no. There are some things no computer can ever do because it can be proved that there are no algorithms to do them -- just as there is no way to square a circle with a compass and straightedge.
These things are not mere mathematical curiosities. They are things that programmers would like to have their computers do for them and things that the suppliers of software development tools would like to incorporate into their debuggers. Computer science curricula usually include the subject of uncomputable functions, but programmers who are not computer science majors sometimes ask for the impossible without realizing it.
Alan Turing in 1935 asked whether there is a method by which a computer program can determine whether any other computer program will halt. This is the famous "halting problem." Turing showed that it has no solution.
A debugger with this ability would certainly be useful. Failure to halt normally is a common form of program failure. Moreover, the debugger could be applied successively to parts of the failed program to isolate the part that is hanging up.
It is not obvious that such a debugger is impossible. Of course, the debugger can't just single-step the program to see if it halts. If the program doesn't halt, the debugger could run forever without determining that this is the case. Or it might give up just as the program is about to terminate, as human programmers sometimes do. At some point, the debugger would have to be able to say, "Aha! This loop is infinite!" It seems as though a cleverly written debugger, having all the tools of modern high-level languages at its disposal, might be able to do that.
The impossibility proof is based on the following argument. If you have a debugger that can solve the halting problem, given unlimited time and memory, then you can use the same code to make the debugger do other things, some of which are self-contradictory and hence impossible.
The particular computer language is not important. If you can solve the halting problem for one language, you can solve it for another. Just use a compiler or other translation program before solving the halting problem. Notice that translating an assembly-language program to a higher level language is quite easy, although the object program is bound to be inefficient. The goal, however, is to show that a solution to the halting problem is impossible, not merely inefficient.
Turing himself proposed a minimal machine that has come to be called the Turing Machine. Its memory was supposed to be infinitely long but only one bit wide, and the machine had only sequential access to it, as with a tape. The programming language was essentially a flowchart with only a few basic commands. Nevertheless, Turing showed that his machine was able to emulate any other machine, given enough time and a suitable program. Such a construction is not necessary for our purposes-you can imagine that the computer is programmed in some familiar high-level language.
Now consider the problem of determining whether a program can print out a specified string S (with or without other output). If you can solve the halting problem, you can solve this problem. Just replace every print statement in the program with a routine that does not send the output to the printer but keeps track of the output and halts when the string S appears. Then, to keep the program from halting for any other reason, replace all the halt statements in the program with endless loops. Then solve the halting problem for the result.
Such a program would be useful in itself because many run-time errors produce distinctive messages, and it would be helpful to predict in advance that such errors will occur.
Because this applies to any string S, you can also determine whether a program prints out a copy of itself. This is not as curious as it appears at first glance. It is easy to write a 1,000-character program that prints out all combinations of 1,000 characters, including itself. In fact, 1,000 characters is probably an overestimate of the number of characters required in most high level languages.
Now you can write a program to do the following things. First, generate, one by one, all possible programs. The easiest way to do this is to generate all strings and check each one to see whether it is a program. Compilers do this when they check syntax. Then check each program to see whether it prints out a copy of itself. Finally, print out a copy of every program that does not print out a copy of itself.
This program, in the process of generating all programs, will eventually generate itself. Does it print out a copy of itself? If it does, it is breaking the rule by printing out a copy of a program that prints out a copy of itself. If it does not, it is breaking the rule by failing to print out a copy of a program that does not print out a copy of itself. This fatal contradiction proves that the halting problem has no solution.
You may recognize this as Russell's paradox (the set of all sets that do not contain themselves) or as the barber paradox (the barber who shaves every man who does not shave himself).
Any problem that a debugger can convert to the halting problem, such as the string-output problem, is equally unsolvable. Some other obvious examples are:
Of course, a debugger or compiler can sometimes predict such errors -- for example, inaccessible code can sometimes be identified at compile time. But universal solutions to such problems do not exist.
The impossibility of determining whether two programs do the same thing means that it is always possible to defeat a certain kind of Trojan horse. In a lecture reprinted in the Notices of the ACM (August 1984), Ken Thompson argued that he could put a Trojan horse into a C compiler that would miscompile the login statement to allow him access to any Unix system compiled with it, and it would miscompile the C compiler to insert a copy of itself. The Trojan horse itself would not appear in the source code for the C compiler. In a letter to the editor, Steve Draper noted that such a Trojan horse can be defeated by paraphrasing the C compiler (writing different code that does the same thing) and then recompiling it. No Trojan horse can infallibly recognize paraphrased programs -- hence there is always a paraphrase that will defeat the Trojan horse.
My own opinion in this matter is that, unless the Trojan horse were skillfully written, most paraphrases would defeat it, and in fact it would probably be defeated eventually by normal software maintenance. Any Trojan horse smart enough to recognize most paraphrases would probably be much larger than the rest of the C compiler. You'd never get it through the gates.
The halting problem is intimately related to two other problems, which were posed by the mathematician David Hilbert in 1900. Is there a formal proof or disproof for every mathematical statement? Is there an algorithm to find proofs?
The first question was answered in the negative by Kurt Gödel in 1931. Gödel's proof was complex, but if you accept the unsolvability of the halting problem, it can be proved simply. Whether a particular program halts is a mathematical statement. In fact, many mathematical theorems are already special cases of the halting problem because you can write a program to search for counterexamples and halt when it finds one. The theorem is equivalent to the assertion that the program never halts.
If there were always a formal proof or disproof of the assertion that a program halts, then you could simply generate all proofs (more or less as the program described earlier generated all programs) until you found either a proof or a disproof. That would solve the halting problem. Because the halting problem is in general unsolvable, there must be at least one mathematical statement of this kind that is undecidable -- that is, it cannot be formally proved or disproved.
This shows that it is impossible in general to prove that a program works. Specific programs or limited classes of programs can be proved to do certain things, but there is no way to do this for every program.
Given that some mathematical statements are undecidable, is there a program, the "decidability program," that can tell whether any mathematical statement is decidable, even without deciding whether it is true or false? As you might have guessed from the tone of this article, the answer is again no. If you have a decidability program, you can take any program and ask whether it halts. Then apply the decidability program to this question. If the question is decidable, a search of all proofs will prove it or disprove it. If the question is undecidable, then the program never halts; otherwise, you could prove that it halts by simply running it until it halts.
Therefore, theorem-proving programs, however successful they might be in limited areas, can never prove everything. Some things must always remain beyond their grasp.
These arguments are not rigorous in the mathematical sense because too much has been left out. A major part of Turing's and Gödel's work involved formalization of the concepts of "computation" and "proof" to the point at which their arguments would be accepted by mathematicians.
You may have already spotted one tacit assumption that does not correspond to reality. The programs are not constrained by memory limitations. If a program does have a memory limitation, then the halting problem can in theory be solved -- but only by a program with a much larger memory.
This is how it can be done. A program with a memory limitation has only a finite number of states. A debugger can single-step it, keeping track of the states it has occupied. If it occupies the same state twice before halting, it will repeat the same sequence of states indefinitely and will never stop.
To do this, the debugger needs enough memory to keep track of which states the program has occupied. Only one bit is required for each possible state, but the number of possible states for even a simple program is truly mind-boggling. Every combination of bits in the memory is a different state. Hence a program with only 1,024 bytes of memory has at least 2^{(1024 x 8)} states due to memory configuration alone, to say nothing of flags and registers. This number of flip-flops would not fit into the entire known universe. It can therefore be said that the halting problem has no solution even in this case.
It should be clear, then, that there are definitely some limits to what artificial intelligence can accomplish and that mathematicians' and programmers' jobs can never be completely automated. (This is a great comfort to me because I am a mathematician and programmer.
Only perfect solutions are impossible, however. It can still be argued, and it is argued by some, that artificial intelligence programs will eventually be able to solve every problem that the human mind can solve, with at least the same success rate. And if the only requirement is practical solutions, not perfect solutions, then many interesting but theoretically unsolvable problems can be solved.