2007-11-15 | Filed Under Programming
You’ve certainly heard of Dijkstra’s famous article, “Go To Statement Considered Harmful”. You may well even have heard that it was actually Niklaus Wirth who gave it that title. These are interesting bits of history, but they’ve been covered by better bloggers than I; my topic takes me a little beyond the history. Nothing energizes a blog like a good controversy, so my goal in this essay is to convince you that GOTO is actually a good feature, and to encourage its use.
To lay the groundwork, I am going to take you back in history a bit, not back to the heady days of Dijkstra’s and Wirth’s prime, but to the mid-1980s when I was first learning to program (it may sound like ancient history to some readers). Back in those days, much of our programming was done in a language called BASIC. (Yes, the same language that gave Bill Gates his start.) In that language (and most languages of the time worked similarly), if you wanted a loop you wrote it like this:
10 LET X = 2
20 LET X = X + 1
30 PRINT X
40 IF X < 6 GOTO 20
This prints the numbers from 3 through 6. The earliest versions didn’t just use GOTO for looping; they used it for functions as well. To begin a function one would GOTO it, and at the end the function would GOTO back to where it came from… or to someplace else. This road led to “spaghetti programs”, a horrible condition where the entire program became a morass of tangled threads of execution resembling a pile of knotted noodles.
The invention of GOSUB and RETURN helped some: the GOSUB command jumped just like GOTO, but when RETURN was later executed, execution resumed just after the most recent GOSUB. This allowed the “subroutine” code to be called easily from multiple places to do the same job. Pretty much everyone agreed that subroutines were a good idea, but the “spaghetti code” problem was so serious that some 20 years after “Go To Statement Considered Harmful” was published I still encountered that bad habit from time to time. The term “structured programming” came to mean code that eschewed GOTO, and for a time it was the hottest buzzword in programming. Fortunately, today (another 20 years on) spaghetti code really has been fully eradicated: most modern languages make it nearly impossible to do anything EXCEPT structured programming.
By the way, that’s why early programmers made so much use of flowcharts, whereas today one hardly ever sees one. Flowcharts are an ideal tool for visually representing the tangle of possible flow paths created by GOTOs, but they add little value in describing a structured program.
So far I haven’t done a very good job of convincing you that GOTO is good. But hang on, I’m getting there. Let’s pause for a moment and examine just what was so bad about the unstructured spaghetti code.
It is not that the unstructured code executes slowly. In fact, the underlying machine code implements all flow control using jump instructions (its equivalent of GOTO), so it’s naturally efficient. The difficulty is not exactly that the code is hard to read: a complete programming novice, if shown the code above, will have no difficulty understanding what line 40 does. No, the problem is in understanding the code: figuring out what it was intended for and what it will do when executed. Of course, the ability of humans (not just computers) to read and understand code is absolutely vital, certainly important enough to anathematize spaghetti code.
But I maintain that it’s not really the GOTOs that make it difficult to “understand” and reason about the code — it’s really the “COMEFROM”s. In the loop above, there’s nothing difficult to understand about line 40… but line 20 is more of a problem. When you read it, there’s nothing to indicate that X can be anything other than 3 — because there’s no special flag saying “Alert!! Someplace else there’s a GOTO that lands here.”
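A rough C translation of the BASIC loop may make the point concrete. This is only a sketch under my own assumptions: the function names `loop_with_goto` and `loop_structured` and the output buffer are invented for illustration, and the values are stored in an array rather than printed.

```c
#include <stddef.h>

/* Hypothetical helper: the BASIC loop, with values recorded instead of printed. */
size_t loop_with_goto(int *out, size_t cap)
{
    size_t n = 0;
    int x = 2;
again:                  /* reached by falling in from above AND by the goto below */
    x = x + 1;          /* read in isolation, this looks like "x becomes 3" */
    if (n < cap)
        out[n++] = x;
    if (x < 6)
        goto again;     /* the unannounced COMEFROM for the label above */
    return n;
}

/* The same loop, with the repetition announced at the top of the body. */
size_t loop_structured(int *out, size_t cap)
{
    size_t n = 0;
    int x = 2;
    do {                /* the reader knows up front that this body repeats */
        x = x + 1;
        if (n < cap)
            out[n++] = x;
    } while (x < 6);
    return n;
}
```

Both functions compute the same thing. The difference is only in what the reader is told on arriving at the line that increments `x`: in the second version the `do` warns that control can arrive there more than once.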
When a programmer reads through one piece of code (a procedure, perhaps, or a loop), she should be able to simulate it in her mind, and understand this piece in isolation from the rest of the program. Modern software is far too complex for one to hold entire software applications in mind, but each piece can be understood on its own, then the larger whole can be constructed through composition. This is extremely powerful, but it only works if the pieces are independent, so each can be understood without having to consider the entire application. This is the exact same reason why global variables and hidden side effects are undesirable: they prevent the pieces from being understood independently.
Now, structured programming (as originally envisioned by Dijkstra) maintained that each bit of code (subroutines, loops, individual IF branches…) should have only a single entry point and a single exit point. Hence, looping is accomplished with a “FOR”, “WHILE”, or “REPEAT-UNTIL” style loop, procedures begin at the top and exit at the bottom, and each clause of an “IF” or “SWITCH” statement is independent. But that is actually a stronger requirement than is needed: I claim it is enough that each bit of code should have only a single entry point. This allows one to analyze this particular piece of code independently; having multiple possible exits does not make the analysis more difficult. In other words, GOTO is fine so long as there are no unexpected COMEFROMs.
A few examples will make this much clearer. Having a RETURN in the middle of a function is fine: it’s simply an additional exit point. The same goes for a BREAK or CONTINUE inside a loop, which exit the loop or the current iteration of it, respectively. Exceptions are also a kind of multiple exit. Of course, any of these could be implemented using GOTO (and in the underlying machine language they are), but that’s actually a bad idea, because each GOTO has a corresponding unstated COMEFROM. Instead, it’s a good idea for the control structures of the language to support these multiple exits, and for programmers to feel free to use them.
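Here is a small sketch of what I mean, again in C. The functions `first_negative` and `sum_positive_until_zero` are hypothetical examples of my own, not taken from any library. Each has exactly one way in (the top) and several ways out, and I would argue each can still be read from top to bottom without looking anywhere else.

```c
#include <stddef.h>

/* Hypothetical example: index of the first negative element, or -1 if none. */
int first_negative(const int *a, size_t n)
{
    if (a == NULL)
        return -1;              /* early exit: nothing to search */
    for (size_t i = 0; i < n; i++) {
        if (a[i] < 0)
            return (int)i;      /* exit from the middle of the loop */
    }
    return -1;                  /* ordinary exit at the bottom */
}

/* Hypothetical example: sum the positive values, stopping at the first zero. */
int sum_positive_until_zero(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] == 0)
            break;              /* leave the loop entirely */
        if (a[i] < 0)
            continue;           /* abandon only this iteration */
        total += a[i];
    }
    return total;
}
```

None of the exits here creates a new place where control can arrive unannounced: a `return` always lands back at the caller, and `break` and `continue` always land at the end or the top of the enclosing loop.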