You Can’t Dig Upwards
By Evan Miller
December 7, 2014
When I first learned to drive a car, shortly after turning 15, my mother insisted that I learn on my father’s Mazda Miata: a silver, two-door convertible with a soft-top roof.
The Miata was a hard car to drive; not only did the car lack automatic transmission, but with the top up, it was nearly impossible to see out the back of the car. The sun had so thoroughly browned and warped the plastic rear windshield that the soft top, when folded down, appeared primed and ready to produce panini.
Miatas weren’t particularly safe vehicles, nor did Mazda Motor Corporation try to pretend otherwise. It was an affordable car that was fun to drive, a sort of poor man’s BMW. I remember that my father’s Miata was so narrow, and so low to the ground, that when I rode with him on the Interstate, I would wonder if his car could actually fit underneath an adjacent semi truck without the truck driver noticing.
Our family also owned a minivan and an SUV, which, being larger, safer, and possessed of transparent rear windshields, you’d think a worried parent would prefer their teenage son to learn with. But not my mother.
“One day you’ll be at a party,” my mother explained to me, with the learned assuredness of a modern Nostradame. “There will be a car there that no one can drive because it has a stick shift. You need to be able to drive that car. And you need to learn now, or else you’ll never learn stick.”
The driver-incapacitating bacchanal that I was promised never materialized, but the passage of time proved out her second prognostication. When I learned to drive my father’s Miata, I was forced to learn stick shift.
For those of you who have not been inaugurated into the Holy Mysteries of Shifting Gears, learning stick is a fraught and laborious undertaking. You have to worry about tearing up the transmission, stalling out, rolling backwards on a hill. Make a wrong move, and the car will screech and lurch.
I have distinct memories of sitting there in the driveway, holding back tears as I tried for the thirtieth time to get that damned car into first gear. I also remember, on a separate occasion, the feeling of pure terror as a red Mercedes pulled up behind me on a steep hill.
Learning stick is not an enjoyable process, and I almost certainly wouldn’t have gone through with it if my mother hadn’t wielded such prophetic authority in our household. But with enough practice, I eventually got good enough to transport my father’s Miata down the road at exactly the speed limit without completely soaking the front seat in sweat and tears.
In the years since I learned how to drive, I’ve noticed an interesting pattern. I’ve noticed that we proud few — we battle-hardened stick drivers — tend to be much better drivers than those who rely on automatic transmission. We have more control over our cars, and a better mental model of how car engines work. Even if we’ve never seen the inside of a gearbox, we know what the Overdrive gear in automatic-transmission cars means, and how you can use it to save on gas. We know what those mysterious 1 and 2 labels below Drive represent, and why you engage their services when coasting downhill. At any moment, we probably have a better intuition for what your car is doing than you do.
In retrospect, learning stick shift was a prudent investment of time, even though I’ve never had to prove it to society by (for example) driving a stick-shift ambulance full of orphans while avoiding heavy gunfire. Driving stick is just a good skill to have. More people should have it, in my opinion.
The astute reader will have surmised by now that I am not telling you all this in order to establish my credentials as a driver of low-end convertibles, or to hinder the ineluctable onslaught of automatic transmission in the automobile industry. I bring it up because I see important parallels between the move to automatic transmission in cars and the rise of Python in the computing world.
Python is convenient, and in many ways, a great advance over the C programming language. However, just as teaching teenagers to drive automatic transmission is a practical guarantee that they’ll never learn stick, advising neophytes to learn Python is creating programmers who will never bother to learn how to code in C. And that, I believe, is a bad thing.
I am in the minority with this opinion, which is all the more reason I feel compelled to share my views here. Python recently surpassed Java as the most popular introductory CS language at universities. Eric Raymond, the Keeper of the Jargon File, and author of the de facto-definitive “How To Be A Hacker” document, recommends Python as a first language. So does spell-checking extraordinaire Peter Norvig, in his famous “Teach Yourself Programming in Ten Years” piece. Paul Graham has even given Python its own paradox.
These days, if you see an advertisement along the lines of Learn to Program in X Weeks, the advertiser is almost certainly promising to teach you Python.
In spite of the superficial appeal of Python — its syntax, interactivity, library support, vast community, and superior documentation — there’s a serious problem with using Python as a first programming language. Programmers who only know Python lack a proper mental model of how computers work. They tend to perceive C as a kind of arcane skill that should be looked upon with a mixture of awe and contempt. Awe because they are told C is difficult, and contempt because they are told that the hardships imposed by C are the obsolete legacy of a more frugal era. (Summarized by the new truism, “human time is more valuable than computer time.”)
But Python and C are not simple substitutes, with one being a slower, more convenient form of the other. Nor is Python a pure advancement in technology over C. It’s better to think of C and Python as occupying separate, parallel planes of reality, with C on the plane closest to the machine.
Paul Graham has a nice metaphor (doesn’t he always?) about the power of programming languages, which he imagines existing along some kind of vertical continuum. Languages with few features reside at the bottom of the continuum, and languages with many features occupy the top. Somewhere in the middle is a hypothetical language called Blub, which, for all purposes, is C. He goes on to argue that users of Blub will scoff at all the languages below Blub on the continuum — Cobol, machine language, etc. — and simply fail to understand the features of languages that lie above.
It’s a compelling image, but it’s worth examining the metaphor from the opposite angle. Python may lie above C, and Lisp may be the top turtle in the tower of abstraction, but learning Python or Lisp doesn’t actually yield much insight into how computers work at the bottom of the beanstalk. That is, despite the visual implication that Lisp programmers are all-seeing gods, in the hierarchy of programming languages, looking “down” isn’t any easier than looking “up”.
In fact, I’d say it’s harder to look downward than to look up. A pure-Lisp person peering down at C sees only pesky syntax, unfamiliar functions, and cryptic comments about cache lines. But a C programmer gazing upward sees a bunch of things implemented in C.
The appeal of Python is that it occupies a kind of upper-middle position on the continuum. It’s more abstract than C, and thus a more productive experience, but it’s not so abstract that it will make your head burst forth with pink spirals of metaprogramming and recursion.
The trouble is that in the same way Lisp looks weird and unnecessary to the average Blub programmer, the C language looks weird and unnecessary to the average Python programmer. What are these pointers people keep talking about? It sounds hard (the Python programmer muses), and if I’ve gotten this far without them, I probably don’t need them.
So Python programmers, as a rule, never get around to learning C.
It is certainly possible to have a long and productive career in software without knowing C. That is not at issue. However, there are a couple of pernicious effects engendered by the C-free programming culture that is emerging as a result of Python-first education.
One is that fewer and fewer people are capable of writing extremely good software. To write extremely good software —whether it’s desktop, mobile, extraplanetary —you need a firm understanding of how computers and operating systems work at a low level. How many data structures fit into the CPU cache? Why is this code slow when the built-in profiler says it is fast? When I ask the operating system to do something, what is it actually doing? I suppose you can acquire this knowledge with a strict reading regimen, but to develop the necessary intuition, there’s no substitute for writing and optimizing a lot of C code.
You might argue that at this particular historical moment, more software is more valuable than better software. From an economic point of view, that may be. Technology is fast-moving, and it’s pointless to build an extremely good piece of software that no one wants. However, the culture of “just use Python,” “computers are fast enough,” and “throw hardware at the problem” is creating a deeper systemic problem: as a result of teaching Python first, and denigrating C, fewer and fewer smart people will want to become professional programmers.
Put away your shivs for a moment and hear my argument.
I believe there are two separate mechanisms that make Python unappealing to smart people who are first learning to program. The first is that learning Python is not a very difficult undertaking. The core concepts of the language present no real challenge for smart people, and consequently does not demand intense engagement. Learning Python is, well, kind of rote and boring. It feels like a school that is lacking in advanced classes.
The second mechanism is illustrated by a friend of mine who recently enrolled in a Python workshop. She told me she enjoyed learning about loops and functions. That was well and good — loops and functions are small, understandable, self-contained — but then the instructor decided to show how powerful Python was, and spent the rest of the workshop creating some needless program or other with the Python Twitter API.
Using software libraries is an important skill, as any professional software developer will attest. However, if software libraries are introduced too soon, they end up forming a wall of abstraction that leave smart people dumbfounded. Here my friend was, thinking she was on the cusp of learning the secret rules that govern computers; but instead, the instructor simply swapped out monolith of user-facing software with the equally opaque abstraction of the client library. Without mastering the language first — and learning how to read the library source code — my friend didn’t have the tools she needed to dig deeper and figure out what was going on at the bottom of the call chain.
You could argue that she just happened to take a poorly designed workshop. But there’s a larger point to be made here: without mastering C first, programmers lack the tools to dig deeper and figure out “what’s going on” in the system that they’re using. If you’re a smart and curious Python programmer, you’ll quickly hit a hard layer of dirt called C. Beneath that layer, you’ll be told, abide dragons, bones, and debuggers. As a consequence, unless you are brave enough to ignore the admonitions and undertake to learn C on your own initiative, you’ll never experience the pleasure of digging into something as deep as you can simply for the sake of curiosity.
In contrast, the C programmer who’s trying to understand a piece of C code will, in the worst case, peruse a wealth of header files, man pages, and source code; but seldom a strange pocket of Lisp or Python. You dig down, not up. That’s just how computers work.
Programmers tend to be motivated either by a desire to get the job done, or a desire to understand how things work. There is no question Python is a great for getting the job done. The popularity of Python among startups and in the sciences (including data science) attest to this. If you’re only going to learn one programming language in your lifetime, Python is a safe, productive, and reasonable choice.
But Python is a very poor language for anyone who wants to understand how computers work. This is a problem, because in the long run, the very best programmers tend to be the people who are motivated by a thirst for knowledge. They might not be the best employees in the short term — reading books on the job and implementing pointless optimizations — but after they’ve gone down enough rabbit holes, they’ll be able to draw you a map of the world that lies beneath the grass.
And so that, I think, is the real problem with C-free programming — as well as with the larger idea that introductory programming languages should be “expressive” and “good for beginners”. The programming profession is losing out on the very people who would become the best programmers.
Smart, curious people who learn Python first — and who are told that Python is “all anyone needs these days” — will tend to lose interest in programming, either because it won’t be challenging enough to sustain their interest, or because they will hit a layer of C code and won’t know what to do next. I should emphasize that this logic does not apply to people who are motivated to program as part of their job (scientists, journalists, web designers, etc.); it applies, rather, to people who are curious about programming and think they might become extremely good at it one day, but who aren’t sure what they will want to use it for.
It’s worth addressing, briefly, why universities have chosen to teach Python (and before that, Java) if it doesn’t produce extremely good programmers. The foremost reason is that most universities are not in the business of producing extremely good programmers. Most computer science departments want to maximize the number of graduates who will either go to grad school or get good-paying programming jobs. Being an extremely good programmer is not really required for either of those.
From the perspective of most CS educators, a programming language is just something that gets in the way of teaching CS fundamentals: algorithms, data structures, and so on. Python is a good balance between something that is easy to teach (which makes professors happy) and something that private industry uses (which imbues department chairs with the spirit of Christmas). Add to that the popularity of Python in the sciences, the surfeit of online learning materials, and the oft-repeated, rarely questioned assertion that Python is a good teaching language, and what could possibly go wrong?
Before I learned how to drive, I remember my father making a remark about a sporty BMW convertible that he saw parked on the street. If you can afford a car like that, he said, why the hell would you buy an automatic?
I didn’t understand the comment at the time. I had assumed that automatic transmission, like power windows and anti-lock brakes, was a basic upgrade that all but the most dedicated of cheapskates were willing to pay for. But after my several months of driving lessons with my father, I began to understand what he meant: once you develop a knack for it, driving stick shift is actually a lot of fun. You feel closer to the car, closer to the road. There’s a certain joy in mastering a machine, down-shifting into bends and turns, engaging the clutch at just the right moment, getting to know the character of each gear in the transmission. It is a distinctly human pleasure, akin to the assembly programmer’s delight in pushing a computer to its limit.
I get the sense that Python-first programmers are missing out on a lot of the fun of programming. There’s a mistaken notion that programming is most enjoyable when you’re doing a diverse array of things using the smallest amount of code, leveraging client libraries and so forth. This is true for people trying to get something done to meet a deadline; but for people motivated by learning — and who aspire to be extremely good programmers — it’s actually more fun to create programs that do less work, but in a more intellectually stimulating way.
For someone who has never programmed a computer before, the C language presents a healthy challenge. Pointers and pointer arithmetic don’t have close real-world analogues, and as you are learning the language, you’re likely to encounter seemingly meaningless error messages and non-deterministic crashes. If your arithmetic is bad, you risk corrupting the call stack; forget a simple bounds-check, and the heap goes haywire. At times, you will want to punch the monitor and weep on your keyboard.
But there’s an important payoff: C forces you to build a mental model of what the computer is actually doing when you run your programs, much like a teenager figuring out how the gear mechanism works by playing around with the clutch. As you ask why and keep digging for answers, your mental model will grow to encompass the process model, the CPU architecture, the memory hierarchy, the operating system, and so forth. It’s that mental model — rather than the C language itself — that will enable you to poke through the abstractions created by others, and write programs you never thought possible.
So if you want to write extremely good software — in any language — you should ignore all the advice to learn Python, or heaven forbid, Lisp, and instead begin your journey with C. The view from the top is nice, but if you aspire to climb mountains, there’s no better place to start than at the bottom.
Correction: A previous version of this essay misrepresented Overdrive. It's been so long since I've driven automatic that I forgot what it was for! Apologies to anyone who attempted to use it to pass a truck.
You’re reading evanmiller.org, a random collection of math, tech, and musings. If you liked this you might also enjoy:
Want to look for statistical patterns in your MySQL, PostgreSQL, or SQLite database? My desktop statistics software Wizard can help you analyze more data in less time and communicate discoveries visually without spending days struggling with pointless command syntax. Check it out!
Statistics the Mac way