Article Status: Draft Release 0.4.3
In this section, we will derive important conclusions from real life observations.
To this end, we need to pick someone very special.
Well, er, we can’t use me. Why? Because I’m a Java and C++ programmer. Because I’m no expert. Because I’m not used to any other kind of programming than what is called Imperative Programming. You know, although I certainly ain’t a nobody, I’m still not too far from being your regular Joe.
Now, thinking of this, I know someone who’s not your average Joe at all. Yes, that’s right: we’ll pick him! This guy’s uniquely qualified to derive objective conclusions. Ladies and Gentlemen, allow me to present to you the John Smith.
Why John Smith? Because, you see, John Smith is the only guy in the world who’s a complete expert in all the programming language dialects I’m going to use in this article. All of them, no trick!
More importantly and more impressively, John Smith learned all of those languages at exactly the same time, and has used them equally ever since. And as if these weren’t enough qualifications, John has been spending 1 hour per working day since the start of his career in 1996 speaking about these different languages with his dozens of siblings, who go by the same name as his but work in very different contexts. And guess what? They all share this very same background. They all think along the same lines as John does. Can you believe it? Incredible, isn’t it?
Now, consequently, allow me to truthfully claim that John Smith is, beyond any possible doubt, the very best person to help us reach meaningful conclusions about the fundamental differences of these languages.
Doesn’t it feel great to have John Smith with us to help us think this all through? Thanks John, I owe you one!
Now that we have introduced John Smith, let’s talk about the program we will use to compare our different languages.
John said any program could do, so for illustration purposes, this program will be a recipe I chose by taking Google’s first result for recipe and clicking on two links. The first one led me to a desserts page, and the second one to the yummy-looking Blueberry Buckle. This recipe’s directions are what we will program.
For the sake of simplicity, I have inserted the ingredients and their measures right into their preparation directions:
- Preheat oven to 375 degrees. Grease one 8×8 inch pan.
- Cream together 3/4 cup sugar, 1/4 cup shortening, and one egg.
- In a separate bowl mix together 2 cups flour, 2 teaspoons baking powder, and 1/2 teaspoon salt. Stir into sugar mixture, alternating with 1/2 cup milk. Stir in 2 cups fresh blueberries. Pour into greased 8×8 inch pan.
- To make topping: Combine 1/2 cup sugar, 1/3 cup flour, 1/2 teaspoon ground cinnamon, and 1/4 cup butter. Sprinkle over cake batter.
- Bake at 375 degrees for 25-30 minutes.
Now, let’s see how this may be programmed. Please note that in the following versions of these directions, I have, out of pure laziness, left some functions undefined.
Understanding the concept of declarative programming is useful for seeing the simplicity of functional programming, since the latter is a special case of the former. Therefore, the following definition should help.
One could think of a declarative language as one which expresses computations as a hierarchy of result descriptions.
Following this line of thinking, functional programming expresses computations as a hierarchy of function applications, where each function produces a result.
When a programmer uses the functional style, he is mostly concerned with defining and combining functions and their applications.
Since the mechanism of function application, or in other words, function call, is universally known, functional programming is far from being inaccessible. And that’s precisely what might come as a surprise to one who has heard that functional programming is a weird way of thinking of the process of writing software.
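To make this concrete, here is a tiny, self-contained sketch of that idea in real Haskell. The function names and quantities (describe, combine, batter) are mine, invented purely for illustration:
-- A complete program whose whole result is one nested expression of
-- function applications: nothing is assigned, nothing is updated.
module Main where

describe :: String -> Double -> String
describe name cups = show cups ++ " cup(s) of " ++ name

combine :: String -> String -> String
combine a b = a ++ ", then " ++ b

batter :: String
batter = combine (describe "sugar" 0.75)
                 (combine (describe "flour" 2) (describe "milk" 0.5))

main :: IO ()
main = putStrLn batter
Running it prints the three measured ingredients in order, yet the program itself is nothing but descriptions of results built from other results.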
Before we ponder why and how we’ve come to think that functional programming is weird, let’s dive into a few declarative languages to see how they look.
Note that I have taken the liberty to personalize their syntaxes so that they read more easily to the eyes of a Java-like language programmer. Therefore, please try not to read too much into how the parentheses are used. The code examples are not there to help you learn these languages; they’re rather there to help you see more quickly the fundamental semantics of each language. The biggest changes I’ve made were adding a bunch of parentheses that should not be where they are, and removing some others whose placement would have felt awkward to any Java-like language programmer.
Notice how this code says only what’s needed:
(defn blueberry-buckle
(bake
(oven (375))
(ready-cake ([25..30]))))
(defn ready-cake
(sprinkle
topping
(pan-cake)))
(defn pan-cake
(pour
(blue-cake)
(pan ([8,8] crisco))))
(defn blue-cake
(stir
blueberries (cup 2)
(smooth-cake)))
(defn smooth-cake
(mix-progressively
milk (cup 1/2)
cream
base))
Now, here are a few global variables. They would normally be included as local variables of the functions that need them, but I wanted the code to be easier for you to read, hence these separate definitions:
(def topping
(mix
sugar (cup 1/2)
flour (cup 1/3)
cinnamon (teaspoon 1/2)
butter (cup 1/4)))
(def cream
(mix
sugar (cup 3/4)
crisco (cup 1/4)
egg ()))
(def base
(mix
flour (cup 2)
baking-powder (teaspoon 2)
salt (teaspoon 1/2)))
Here are a few interesting properties of this code:
According to John Smith, this version of the program is the fastest and easiest to read and write. He says it’s also one of the fastest and easiest to maintain.
(He already took a look at the other versions of this code example that follow. What a dude!)
Hey, doesn’t this mean that the Java version isn’t the fastest and easiest one?
Doesn’t this feel weird to you? John may say this, but what the heck is it anyway with these weird, academic functional languages, right? Such languages aren’t fit for critical production usage. They would crumble under hardship. They’re not battle-tested. They’re not widely used. They’re too hard to learn. Right?
Wait, John says these assumptions of ours are all false. How could it be? Let’s skip to the next section, shall we?
In this language, each function name is written on the left side of =, which means “… is described as …”. Obviously, the right side of = describes what computations each function depends on to produce its result.
blueberryBuckle = bake(Oven(375), readyCake([25..30]))
readyCake = sprinkle(topping, panCake)
where topping = mix([Sugar(Cup(1/2)), Flour(Cup(1/3)),
Cinnamon(Teaspoon(1/2)), Butter(Cup(1/4))])
panCake = pour(blueCake, Pan([8,8], Crisco))
blueCake = stir(BlueBerries(Cup(2)), smoothCake)
smoothCake = mixProgressively([Milk(Cup(1/2)), cream, base])
where cream = mix([Sugar(Cup(3/4)), Crisco(Cup(1/4)), Egg()])
base = mix([Flour(Cup(2)), BakingPowder(Teaspoon(2)),
Salt(Teaspoon(1/2))])
Now, let’s implement three of the unimplemented functions, just to see how algorithms are expressed in Haskell:
mix(ings) = stir(ings, Mixture())
mixProgressively(ings) = mixParts(map(/10, ings), Mixture(), ings)
where mixParts( _, mixed, [[]:_]) = mixed
mixParts(parts, mixed, rests) = mixParts(parts,
stir(parts, mixed),
map(-(parts), rests))
stir(ings, mixture) = foldr(incorporate, mixture, ings)
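Since the syntax above is deliberately personalized, here is roughly what those three functions could look like in plain Haskell. Consider it a sketch: the Ingredient and Mixture representations, and the simplified one-tenth-at-a-time strategy, are assumptions of mine so that the example actually compiles and runs.
module Main where

type Ingredient = (String, Double)   -- a name and a quantity
type Mixture    = [Ingredient]

-- Fold a single ingredient into an existing mixture.
incorporate :: Ingredient -> Mixture -> Mixture
incorporate (name, qty) [] = [(name, qty)]
incorporate (name, qty) ((n, q) : rest)
  | n == name = (n, q + qty) : rest
  | otherwise = (n, q) : incorporate (name, qty) rest

-- Stirring is just folding every ingredient into the mixture.
stir :: [Ingredient] -> Mixture -> Mixture
stir ings mixture = foldr incorporate mixture ings

mix :: [Ingredient] -> Mixture
mix ings = stir ings []

-- Stir the ingredients in a tenth at a time, echoing the recipe's
-- "alternating" step.
mixProgressively :: [Ingredient] -> Mixture
mixProgressively ings = foldr stir [] (replicate 10 tenths)
  where tenths = [(name, qty / 10) | (name, qty) <- ings]

main :: IO ()
main = print (mixProgressively [("milk", 0.5), ("sugar", 0.75), ("flour", 2)])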
Here are a few interesting properties of this code:
According to John Smith, this version of the program is the second fastest and easiest to read and write. He adds, though, that out of the box, Haskell goes to great lengths to guarantee a lot of things that are impossible to guarantee in the way most other languages are typically used. Therefore, he says, Haskell is the fastest and easiest to maintain. And for big programs, he says it’s also the fastest to write.
What, again? Where’s Java?
Aside from the fact that imperative languages are the tool to which we’re used, how could one define this concept?
One could think of an imperative language as one which expresses computations as a sequence of manipulations that must imperatively be followed in the order in which they are expressed, affecting the state of shared, named memory blocks, so as to progressively build a result out of them.
When a programmer uses the imperative style, he is mostly concerned with the interactions of his code lines with the named variables that progressively help build the result.
Ah, home, sweet home. Isn’t the smell of home, sweet home reassuring?
public class Chef {
    public static void main(String[] args) {
        Chef chef = new Chef();
        Oven oven = new Oven();
        Pan pan = new Pan(8, 8);
        Grease shortening = new Crisco();
        chef.prepare(oven, 375);
        chef.prepare(pan, shortening);
        // fractional measures are written as doubles (3.0 / 4) so that
        // Java's integer division does not silently turn them into 0
        Mixture cream = chef.mix(
            chef.prepare(new Sugar(), Units.cup(3.0 / 4)),
            chef.prepare(shortening, Units.cup(1.0 / 4)),
            chef.prepare(new Egg()));
        Mixture cake = chef.mix(
            chef.prepare(new Flour(), Units.cup(2)),
            chef.prepare(new BakingPowder(), Units.teaspoon(2)),
            chef.prepare(new Salt(), Units.teaspoon(1.0 / 2)));
        cake = chef.mixProgressively(
            cake,
            cream,
            chef.prepare(new Milk(), Units.cup(1.0 / 2)));
        cake = chef.stir(
            cake,
            chef.prepare(new Blueberries(), Units.cup(2)));
        chef.pour(cake, pan);
        Mixture topping = chef.mix(
            chef.prepare(new Sugar(), Units.cup(1.0 / 2)),
            chef.prepare(new Flour(), Units.cup(1.0 / 3)),
            chef.prepare(new Cinnamon(), Units.teaspoon(1.0 / 2)),
            chef.prepare(new Butter(), Units.cup(1.0 / 4)));
        chef.sprinkle(topping, pan);
        chef.waitFor(oven);
        BlueberryBuckle buckle = chef.bake(pan, oven, 25, 30);
    }
}
Now, try to imagine how you would program the unimplemented methods, like chef.mixProgressively(…). Try a little more. Form a mental picture. Keep going; it’s not clear enough yet. OK, now it looks like something.
According to John Smith, this version of the program is the third fastest and easiest to read, write, and maintain.
John also mentions the unimplemented methods are much more involved to write than with our Haskell dialect.
Moreover, John mentions he uses quite a few languages that are faster and easier to use than this one, and that still fit in the category of object-oriented, imperative languages.
Isn’t this surprising, again? Let’s hear no more from him about Java.
So, where’s the program?
Wait, we know it would look almost the same as that of Java, just much more awkward in large systems. No need to bother writing this, right?
I know you can easily imagine how you would program the unimplemented functions, like mixProgressively(…). We can also easily see how awkward it would be to write them without the supporting structure of the object-oriented paradigm.
According to John Smith, this version of the program is the second longest and hardest to read, write, and maintain.
Rock on, John! We now understand each other. No need to spend more time with C. Let’s see what’s next.
So, where’s the program?
Wait, did you really expect me to write this whole thing in assembly language!? I mean, isn’t it completely evident how hard it would be, and how long it would take, to write and debug this stuff?
Try to imagine how you would program everything, including the unimplemented procedures, like mixProgressively. Wouldn’t it take longer than with Java’s for loops or C’s for statement, and all those neat features we’re used to?
Now, according to John Smith, this version of the program is the longest and hardest to read, write, and maintain.
Ah! Rock on, John! We truly understand each other. Wait, that’s right, we weren’t used to thinking like you until just a short while ago…
If we consider all of John’s verdicts above, then we can safely derive some conclusions:
- Each time we step closer to the machine-level details, we need to invest more mental effort and more time to write something.
- Each time we step farther away from the machine-level details, we need to invest less mental effort and less time to write something.
Now, my question is, who on earth decided that Java was far enough from machine-level details?
In other words, who, after having known the enlightenment coming from stepping away from the machine-level details, decided it would be wrong to pursue further enlightenment by continuing to step away from the machine-level details?
Well, let’s get back to constructive stuff.
Isn’t it worth pursuing such clarity and succinctness?
Buckle (Blueberry)
Of course, no general-purpose programming language allows that level of expressivity (that is, without writing anything to back the expression given in the example). This example is clearly an exaggeration of what the perfect programming language could be. It’s meant to illustrate how Domain-Specific Languages (DSLs) can be interesting.
Now, this is where we raise an interesting question. Which kind of programming language would be the most effective at writing DSLs?
John says the answer is to be found somewhere in the category of declarative languages. Functional languages are the most widely used class of declarative languages.
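To give a rough idea of what John means, here is a sketch of a minimal embedded DSL in Haskell that lets the “dream” one-liner almost be written as-is. Everything in it (Fruit, Buckle, bake) is invented for illustration; it is not a real library:
module Main where

data Fruit  = Blueberry | Apple deriving Show
data Buckle = Buckle Fruit      deriving Show

-- In a real DSL, bake would expand into all of the preparation steps;
-- here it only returns a description of the finished cake.
bake :: Buckle -> String
bake (Buckle fruit) = show fruit ++ " buckle, baked at 375 for 25 to 30 minutes"

main :: IO ()
main = putStrLn (bake (Buckle Blueberry))
The point is not this toy, but that a functional language lets you keep stacking such definitions until the top-level program reads almost exactly like the problem domain.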
Or, in other words, who’s the one taken for a fool?
Obviously, the Chef is our programming language’s compiler. Right? After all, whatever the level of programming language we used, we always wrote the instructions it needed to produce what we wanted to eat.
Of course, the more succinctly the compiler allows us to express what we want, the better Chef it is.
But it’s us, isn’t it, who programmed the behavior of the Chef, in our object-oriented version? Yes, that’s right. Each one of Chef’s unimplemented methods would have to be written by us. And each one of Chef’s method calls, too, was written by us. And in the non-object-oriented languages, although we didn’t explicitly mention Chef, we were still writing equivalent code. So we understand that whatever the paradigm and language we use to program a computer, we always end up writing everything the compiler doesn’t write for us.
In other words, we write the complement of what the compiler writes:
ourCode + compilerCode = program
From this, we can derive:
ourCode = program - compilerCode
compilerCode = program - ourCode
Now, if this doesn’t make perfect sense to you, there’s a good reason for it. Any successful high-schooler could beat my long-forgotten (poor) maths skills. Therefore, please correct me if I’m wrong in any way.
So in reality, the Chef is made up of two entities. First, human-generated code. Second, compiler-generated code. In other words, the compiler is made up of a human compiler and a machine compiler. Hey! don’t call me a compiler, that’s gross!
Now, trust John, we don’t want to be the Chef. That’s why we don’t use assembly language unless we really don’t have any other choice. It’s better to have a Chef at our service. And it’s even much, much better if our Chef is intelligent enough that we don’t have to go to great lengths for him to understand what we truly want to eat.
Therefore, if we could find a Chef that allows us to express our desires five times more succinctly and clearly than the one that comes with Java, wouldn’t we be up for total enlightenment? We could either write the same programs as we write today with Java five times faster, or take the same time we’re used to, but get programs five times more interesting.
Wow! I want to work with a better Chef. Don’t you?
Now, let’s put aside our own conclusions to hear other kinds of wisdom.
After all, who knows if I can prove John Smith’s existence?
We are now reaching one of the most interesting parts of this article.
Here are a few reasons why you might already know of John Backus:
- He led the IBM team that created FORTRAN.
- The “B” in BNF (Backus-Naur Form) stands for his name.
- He took part in the design of ALGOL.
- He received the 1977 ACM Turing Award.
Now, why do you think I’m mentioning the creator of the first widely used high-level programming language?
Let’s hear Alex Aiken, from Stanford University:
He grew frustrated with the limitations of imperative-style programming and felt that there must be a better, higher-level way to develop software.
I certainly can relate to this, even 30 years later.
He eventually came to the realization that a core difficulty with FORTRAN-style languages was that programmers were reduced to reasoning about a long sequence of small state changes to understand their programs, and that a much simpler and more compositional method of reasoning would be to think in terms of the net effect of a computation, where the only thing that mattered was the mapping from function inputs to function outputs.
Functional programming! Or, should I say, function-level programming. Why? Because Backus’ ideas go even further than what we use now.
There were others, including Robin Milner [SML] and David Turner [SASL, Miranda ~> Haskell], who were thinking along similar lines at about the same time, but as yet there was no functional programming community.
One might want to include Lisp in this list, but should not overlook that Lisp, being a multi-paradigm programming language, embraced very openly many imperative programming techniques.
That all changed with Backus’ Turing Award lecture, published in 1978. His paper, “Can Programming Be Liberated from the von Neumann Style?”, did something very unusual, perhaps unique: it attacked his own previous work, his own legacy, and furthermore proposed an alternative that was far out of the mainstream. His paper had an electric effect; I think it is fair to say that it jump-started the field of functional programming. Certainly the number of researchers in the area, and the amount of funding for that research, increased dramatically in the years just after John received the Turing Award. His Turing Award lecture remains his most cited paper, and indeed remains the most cited of all the Turing Award lectures in the 40 year history of that prize.
History tells us that it was in the 10 years following Backus’ lecture that researchers found most of the major compiler optimizations that narrowed the performance gap between functional languages and their imperative counterparts.
So, although John Backus is most remembered for his early work on FORTRAN, BNF, and ALGOL, we should not forget why he devoted the last 15 years of his career to the field of functional programming.
Now, let’s see exactly how Backus described, in his famous lecture, imperative programming languages:
Conventional programming languages are growing ever more enormous, but not stronger. Inherent defects at the most basic level cause them to be both fat and weak:
- their primitive […] style of programming inherited from their common ancestor—the von Neumann computer,
- their close coupling of semantics to state transitions,
- their division of programming into a world of expressions and a world of statements,
- their inability to effectively use powerful combining forms for building new programs from existing ones, and
- their lack of useful mathematical properties for reasoning about programs.
Ouch! One could hardly find a harder critique of his own work!
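To get a small taste of one of those five points, the “powerful combining forms for building new programs from existing ones,” here is a sketch of my own in Haskell (Backus’ own FP notation looks quite different):
module Main where

-- Two existing programs: one keeps the even numbers, one counts a list.
-- A new program is built purely by combining them; there are no statements
-- and no intermediate variables to name.
countEvens :: [Int] -> Int
countEvens = length . filter even

main :: IO ()
main = print (countEvens [1 .. 10])   -- prints 5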
If you would like to understand precisely what he means by these five arguments, you should definitely read at least the first 7 pages of his paper.
Each new language claims new and fashionable features, […] but the plain fact is that few languages make programming sufficiently cheaper or more reliable to justify the cost of producing and learning to use them. […] Since large increases in size bring only small increases in power, smaller, more elegant languages […] continue to be popular. But there is a desperate need for a powerful methodology to help us think about programs, and no conventional language even begins to meet that need. In fact, conventional languages create unnecessary confusion in the way we think about programs. […] Although I refer to conventional languages as “von Neumann languages” to take note of their origin and style, I do not, of course, blame the great mathematician for their complexity. In fact, some might say that I bear some responsibility for that problem.
See? And that’s not all.
Before discussing alternatives to von Neumann languages, let me remark that I regret the need for the above negative and not very precise discussion of these languages. But the complacent acceptance most of us give to these enormous, weak languages has puzzled and disturbed me for a long time. I am disturbed because that acceptance has consumed a vast effort toward making von Neumann languages fatter that might have been better spent in looking for new structures. For this reason I have tried to analyze some of the basic defects of conventional languages and show that those defects cannot be resolved unless we discover a new kind of language framework.
Reading Backus’ paper has been a profoundly enlightening experience for me.
I definitely suggest you schedule some time to read at least the first 12 pages of his lecture. The first 7 pages will shed more light on what’s wrong with imperative languages; then, the next pages will explain how a language can be implemented on top of a very small number of sound and simple concepts that can be used and powerfully combined in all sorts of natural ways to describe computations.
John Backus has often been quoted as saying that he’s lazy. It was one of his ways of explaining why he created high-level languages. Nonetheless, the computer scientist most remembered when it comes to laziness isn’t Backus, but David Turner, who brought us the concept of lazy evaluation. Incidentally, we also owe him a great deal for the concept of purity in functional programming.
Both the purity and laziness concepts are central to what Turner wrote in 1981-1982 in his paper called Recursion Equations As A Programming Language.
The importance of [functional] languages lies in the fact that they hold out the promise of being able to solve both of these problems at the same time [1: software costs, 2: taking advantage of a very large degree of concurrency in the machine]. In both cases the key step is the abolition of the assignment statement, and with it the notion of sequencing.
Ouch! Abolition of the assignment statement might sound incredibly alien to your ears, but please bear with him. It makes much more sense than it might appear to the functional programming neophyte. And by sequencing, David means how imperative languages oblige us to control explicitly, instead of implicitly, the order in which things are supposed to happen.
When we compare our programming languages with all previous mathematical notation, however, the differences are very striking. Mathematical notation has evolved over many centuries and obeys certain basic rules, which are common to every area of mathematics, and which give mathematical notation its deductive power.
First of all mathematics is static. There is no equivalent, in mathematics, of the programming language notion of a procedure which gives a different answer each time you call it. The mathematical idea of a function is a fixed table of input-output pairs. […] This is so even when mathematics is used to describe processes of change as in physics. […] Physics as a science became possible only because Newton showed us how to reduce dynamics to statics in this way.
Secondly, and relatedly, there is in mathematics a certain kind of consistency in the use of names […] Paradoxical though it may sound, in mathematics variables do not vary; they stand for a constant value throughout their scope.
These basic properties of mathematical notation have been termed by logicians, “referential transparency” […] An immediate consequence of referential transparency is that equality is substitutive – equal expressions are always and everywhere interchangeable. It is this which gives mathematical notation its deductive power.
Thus, we shouldn’t spit on the semantics of such a notation.
[…] In introducing the assignment statement, which can change the value of a variable in the middle of its scope, they have broken the basic ground-rules of mathematical notation. Instead of being referentially transparent, programming languages are referentially opaque. In fact, the things that we call “variables” […] are not really variables […] – in the last analysis they are names for registers in the store of a Von-Neumann computer.
It is because of this that it is so difficult to reason about programs. Since expressions can change their value through time, equality is not substitutive. Indeed, […] it does not even have to be true that an expression is equal to itself (because the presence of side effects may mean that evaluating the same expression twice in succession can produce two different answers)! In general it is not possible to reason about such programs on the basis of a static analysis of the program text – instead we have to think of the program dynamically and follow the detailed flow of control […].
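Here is a small illustration of that substitutive equality, mine rather than Turner’s: in a pure language, a name and the expression it is bound to are always interchangeable.
module Main where

price :: Double -> Double
price qty = qty * 2.5

total1 :: Double
total1 = let p = price 3 in p + p   -- name the result once, use it twice

total2 :: Double
total2 = price 3 + price 3          -- substitute the expression for the name

main :: IO ()
main = print (total1 == total2)     -- always True: no side effect can make them differ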
Thus, to one equally used to imperative and declarative languages, it takes more time to properly read and understand imperative code. And that’s not all.
A number of studies carried out in the industry have shown that a given programmer tends to produce, per year, a relatively fixed number of lines of code […] and while the number of lines varies quite a lot from programmer to programmer, it is for a given individual largely independent of the language in which he is working (for example, it doesn’t seem to matter whether it is assembly code or PL/1).
The significance of this result is that it means that the most important single variable in determining software production costs, apart from the quality of the programmers, is the level of language at which they are working. […] programs written in FORTRAN are from five to ten times shorter than the equivalent assembly code. Other things being equal, then, the FORTRAN programmer is from five to ten times more productive than the assembly code programmer.
The same kind of rule applies to today’s mainstream languages.
Our problem today is that in the 25 years that have elapsed since the invention of FORTRAN we have failed to produce any further substantial improvement in this basic ratio of expressive power. If you compare a program written in a “modern” imperative language, such as PASCAL or ADA with its FORTRAN equivalent, you will not find it very much shorter (in fact it might even be longer, because of the extra declaratory information necessitated by the current fad for very restrictive forms of strong typing).
Since then, some things have changed.
Taken together, these changes enabled us to create programs in some of our modern languages which are probably five to ten times shorter than they would have been in the C++ of 10 years ago. All of this being said, let’s mention that our imperative languages have continued to use this very restrictive form of strong typing that David wrote about. It’s only recently that some of our languages started to improve on this basic type checking by inferring some types; but even this is a far cry from the kind of type checking that exists in the functional breed of languages.
All of this being said, what needs to be emphasized is that even though our languages have evolved, they are still fundamentally oriented towards an imperative usage.
[…] We can give a general reason why the change from an imperative language to a descriptive one should lead to programs becoming so much shorter. Expressed at an appropriate level of abstraction […], an algorithm is a partially ordered set of computations – the partial ordering being imposed by the data dependencies.
In order to execute an algorithm on a Von-Neumann computer, however, we have to convert this to a total ordering (in any one of the many possible ways) and organise storage for the intermediate results. In an imperative language both of these tasks must be carried out explicitly by the programmer with the result that he has to specify a great deal of extra information.
In summary, a good case can be made out for saying that the fundamental cause of the software crisis is the imperative and machine oriented nature of our programming languages, and that to overcome it we have to abandon the use of side effects and programmer control of sequencing in favour of purely functional notation.
Radical, but true. There was a time when programmers felt garbage collection was an unacceptable way of managing memory. Nowadays, we wouldn’t want to part with such a facility unless it was of the utmost importance to do so. The same can be said of what I will call manual computation-order management. It’s time for us to let go of it in most of our code. Smart compilers can give us another big increase in productivity, if we let them decide how to order our computations, within the limits of our very clear functional semantics.
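To make the “letting the compiler order our computations” point concrete, here is a small sketch of my own: the three bindings below are ordered only by their data dependencies, and nothing in the source dictates when, or even whether, each one is computed.
module Main where

stats :: [Double] -> (Double, Double)
stats xs = (mean, variance)
  where
    n        = fromIntegral (length xs)
    mean     = sum xs / n
    variance = sum [(x - mean) ^ 2 | x <- xs] / n

main :: IO ()
main = print (stats [1, 2, 3, 4])   -- prints (2.5, 1.25)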
The theoretical possibility of programming in a purely functional style has been known about for two decades – the obstacle to its use in practice has always been the difficulty of achieving acceptable efficiency in the use of existing hardware while using such techniques.
A few years after he wrote his paper, the technology used to create optimizing compilers for functional languages started to become competitive. Since then, the gap has largely been closed, provided, of course, that you compare functional and imperative languages whose compilers are designed for performance.
More papers, more data
Formality
Provability
Value Transparency, not Opaqueness