Highlights: Hackers & Painters, by Paul Graham
Raw Generation of Kindle Notes, up for Summary + Review
If a fairly good hacker is worth $80,000 a year at a big company, then a smart hacker working very hard without any corporate bullshit to slow him down should be able to do work worth about $3 million a year. Like all back-of-the-envelope calculations, this
They don’t change the laws of wealth creation. They just represent a point at the far end of the curve. There is a conservation law at work here: if you want to make a million dollars, you have to endure a million dollars’ worth of pain. For example, one way to make a million dollars would be to work for the Post Office your whole life, and save every penny of your salary. Imagine the stress of working for the Post Office for fifty years. In a startup you compress all this stress into three or four years. You do tend to get a certain bulk discount if you buy the economy-size pain, but you can’t evade the fundamental conservation law. If starting a startup were easy, everyone would do it.
This is why so many of the best programmers are libertarians. In our world, you sink or swim, and there are no excuses. When those far removed from the creation of wealth—undergraduates, reporters, politicians—hear that the richest 5% of the people have half the total wealth, they tend to think injustice! An experienced programmer would be more likely to think is that all? The top 5% of programmers probably write 99% of the good software.
In industrialized countries, people belong to one institution or another at least until their twenties. After all those years you get used to the idea of belonging to a group of people who all get up in the morning, go to some set of buildings, and do things that they do not, ordinarily, enjoy doing. Belonging to such a group becomes part of your identity: name, age, role, institution. If you have to introduce yourself, or someone else describes you, it will be as something like, John Smith, age 10, a student at such and such elementary school, or John Smith, age 20, a student at such and such college.
If wealth means what people want, companies that move things also create wealth. Ditto for many other kinds of companies that don’t make anything physical. Nearly all companies exist to do something people want.
But here there is another layer that tends to obscure the underlying reality. In a company, the work you do is averaged together with a lot of other people’s. You may not even be aware you’re doing something people want. Your contribution may be indirect. But the company as a whole must be giving people something they want, or they won’t make any money. And if they are paying you x dollars a year, then on average you must be contributing at least x dollars a year worth of work, or the company will be spending more than it makes, and will go out of business.
A more direct way to put it would be: you need to start doing something people want. You don’t need to join a company to do that. All a company is is a group of people working together to do something people want. It’s doing something people want that matters, not joining the group.
Companies are not set up to reward people who want to do this. You can’t go to your boss and say, I’d like to start working ten times as hard, so will you please pay me ten times as much? For one thing, the official fiction is that you are already working as hard as you can. But a more serious problem is that the company has no way of measuring the value of your work.
To get rich you need to get yourself in a situation with two things, measurement and leverage. You need to be in a position where your performance can be measured, or there is no way to get paid more by doing more. And you have to have leverage, in the sense that the decisions you make have a big effect.
An example of a job with both measurement and leverage would be lead actor in a movie. Your performance can be measured in the gross of the movie. And you have leverage in the sense that your performance can make or break it. CEOs also have both measurement and
If you can’t measure the value of the work done by individual employees, you can get close. You can measure the value of the work done by small groups.
That’s the real point of startups. Ideally, you are getting together with a group of other people who also want to work a lot harder, and get paid a lot more, than they would in a big company. And because startups tend to get founded by self-selecting groups of ambitious people who already know one another (at least by reputation), the level of measurement is more precise than you get from smallness alone. A startup is not merely ten people, but ten people like you. Steve Jobs
Startups offer anyone a way to be in a situation with measurement and leverage. They allow measurement because they’re small, and they offer leverage because they make money by inventing new technology.
What is technology? It’s technique. It’s the way we all do things. And when you discover a new way to do things, its value is multiplied by all the people who use it. It is the proverbial fishing rod, rather than the fish. That’s the difference between a startup and a restaurant or a barber shop. You fry eggs or cut hair one customer at a time. Whereas if you solve a technical problem that a lot of people care about, you help everyone who uses your solution. That’s leverage.
Big companies can develop technology. They just can’t do it quickly. Their size makes them slow and prevents them from rewarding employees for the extraordinary effort required. So in practice big companies only get to develop technology in fields where large capital requirements prevent startups from competing with them, like microprocessors, power plants, or passenger aircraft. And even in those fields they depend heavily on startups for components and ideas. It’s obvious
McDonald’s, for example, grew big by designing a system, the McDonald’s franchise, that could then be reproduced at will all over the face of the earth. A McDonald’s franchise is controlled by rules so precise that it is practically a piece of software. Write once, run everywhere. Ditto for Wal-Mart. Sam Walton got rich not by being a retailer, but by designing a new kind of store.
What this meant in practice was that we deliberately sought hard problems. If there were two features we could add to our software, both equally valuable in proportion to their difficulty, we’d always take the harder one. Not just because it was more valuable, but because it was harder. We delighted in forcing bigger, slower competitors to follow us over difficult ground. Like guerillas, startups prefer the difficult terrain of the mountains, where the troops of the central government can’t follow. I can remember times when we were just exhausted after wrestling all day with some horrible technical problem. And I’d be delighted, because something that was hard
There is, as I said before, a large random multiplier in the success of any company. So in practice the deal is not that you’re 30 times as productive and get paid 30 times as much. It is that you’re 30 times as productive, and get paid between zero and a thousand times as much. If the mean is 30x, the median is probably zero. Most startups tank, and not just the dog food portals we all heard about during the Internet Bubble. It’s common for a startup to be developing a genuinely good product, take slightly too long to do it, run out of money, and have to shut down.
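To make the mean-versus-median point concrete, here is a small sketch. The payoff numbers are hypothetical, chosen only so that the average comes out to 30x with a top outcome of 1000x; they are not figures from the book.

```python
from statistics import mean, median

# Hypothetical payoff multiples for 100 founders (illustrative only):
# 94 get nothing, 5 get 400x, and 1 gets 1000x.
payoffs = [0] * 94 + [400] * 5 + [1000]

average = mean(payoffs)    # total payoff of 3000 spread over 100 founders
typical = median(payoffs)  # what the founder in the middle actually gets

print(f"mean payoff: {average}x, median payoff: {typical}x")
```

Under these assumed numbers the average founder is paid 30 times as much, while the founder in the middle of the distribution is paid nothing.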
Startups, like mosquitos, tend to be an all-or-nothing proposition. And you don’t generally know which of the two you’re going to get till the last minute. Viaweb came close to tanking several times. Our trajectory was like a sine wave. Fortunately we got bought at the top of the cycle, but it was damned close. While we were visiting Yahoo in California to talk about selling the company to them, we had to borrow a conference room to reassure an investor who was about to back out of a new round of funding that we needed to stay alive.
The closest you can get is by selling your startup in the early stages, giving up upside (and risk) for a smaller but guaranteed payoff. We had a chance to do this, and stupidly, as we then thought, let it slip by. After that we became comically eager to sell. For the next year or so, if anyone expressed the slightest curiosity about Viaweb we would try to sell them the company. But there were no takers, so we had to keep going.
I think it’s a good idea to get bought, if you can. Running a business is different from growing one. It is just as well to let a big company take over once you reach cruising altitude. It’s also financially wiser, because selling allows you to diversify. What would you think of a financial advisor who put all his client’s assets into one volatile stock?
How do you get bought? Mostly by doing the same things you’d do if you didn’t intend to sell the company. Being profitable, for example. But getting bought is also an art in its own right, and one that we spent a lot of time trying to master.
In both cases, what it all comes down to is users. You’d think that a company about to buy you would do a lot of research and decide for themselves how valuable your technology was. Not at all. What they go by is the number of users you have.
In effect, acquirers assume the customers know who has the best technology. And this is not as stupid as it sounds. Users are the only real proof that you’ve created wealth. Wealth is what people want, and if people aren’t using your software, maybe it’s not just because you’re bad at marketing. Maybe
Venture capitalists have a list of danger signs to watch out for. Near the top is the company run by techno-weenies who are obsessed with solving interesting technical problems, instead of making users happy. In a startup, you’re not just trying to solve problems. You’re trying to solve problems that users care about.
Number of users may not be the perfect test, but it will be very close. It’s what acquirers care about. It’s what revenues depend on. It’s what makes competitors unhappy. It’s what impresses reporters, and potential new users. Certainly it’s a better test than your a priori notions of what problems are important to solve, no matter how technically adept you are.
rightly—taking a long time to develop a product. Now we can recognize this as something hackers already know to avoid: premature optimization. Get a version 1.0 out there as soon as you can. Until you have some users to measure, you’re optimizing based on guesses.
Remember what a startup is, economically: a way of saying, I want to work faster. Instead of accumulating money slowly by being paid a regular wage for fifty years, I want to get it over with as soon as possible. So governments that forbid you to accumulate wealth are in effect decreeing that you work slowly. They’re willing to let you earn $3 million over fifty years, but they’re not willing to let you work so hard that you can do it in two. They are like the corporate boss that you can’t go to and say, I want to work ten times as hard, so please pay me ten times as much. Except this is not a boss you can escape by starting your own company.
Without the incentive of wealth, no one wants to do it. Engineers will work on sexy projects like fighter planes and moon rockets for ordinary salaries, but more mundane technologies like light bulbs or semiconductors have to be developed by entrepreneurs.
No one complains when a few people surpass all the rest at playing chess or writing novels, but when a few people make more money than the rest, we get editorials saying this is wrong.
I think there are three reasons we treat making money as different: the misleading model of wealth we learn as children; the disreputable way in which, till recently, most fortunes were accumulated; and the worry that great variations in income are somehow bad for society. As far as I can tell, the first is mistaken, the second outdated, and the third empirically false. Could it be that, in a modern democracy, variation in income is actually a sign of health?
Because kids are unable to create wealth, whatever they have has to be given to them. And when wealth is something you’re given, then of course it seems that it should be distributed equally. As in most families it is. The kids see to that. “Unfair,” they cry, when one sibling gets more than another.
You get paid by doing or making something people want, and those who make more money are often simply better at doing what people want. Top actors make a lot more money than B-list actors. The B-list actors might be almost as charismatic, but when people go to the theater and look at the list of movies playing, they want that extra oomph that the big stars have.
Such tricks account for some variation in wealth, and indeed for some of the biggest individual fortunes, but they are not the root cause of variation in income. The root cause of variation in income, as Occam’s Razor implies, is the same as the root cause of variation in every other human skill.
and baseball players 72 times as much. Editorials quote this kind of statistic with horror. But I have no trouble imagining that one person could be 100 times as productive as another. In ancient Rome the price of slaves varied by a factor of 50 depending on their skills. And that’s without considering motivation, or the extra leverage in productivity that you can get from modern technology.
How much someone’s work is worth is not a policy question. It’s something the market already determines.
And regardless of the case with CEOs, it’s hard to see how anyone could argue that the salaries of professional basketball players don’t reflect supply and demand.
saying? In a free market, prices are determined by what buyers want. People like baseball more than poetry, so baseball players make more than poets. To say that a certain kind of work is underpaid is thus identical with saying that people want the wrong things.
Then you’re saying that it’s unjust that people want the wrong things. It’s lamentable that people prefer reality TV and corndogs to Shakespeare and steamed vegetables, but unjust? That seems like saying that blue is heavy, or that up is circular. The appearance of the word “unjust” here is the unmistakable spectral signature of the Daddy Model. Why else would this idea occur in this odd context? Whereas if the speaker were still operating on the Daddy Model, and saw wealth as something that flowed from a common source and had to be shared out, rather than something generated by doing what other people wanted, this is exactly what you’d get on noticing that some people made much more than others. When we talk about “unequal distribution of income,” we should also ask, where does that income come from? Who made the wealth it represents? Because to the extent that income varies simply according to how much wealth people create, the
Will technology increase the gap between rich and poor? It will certainly increase the gap between the productive and the unproductive. That’s the whole point of technology. With a tractor an energetic farmer could plow six times as much land in a day as he could with a team of horses. But only if he mastered a new kind of farming.
Indeed, as with expensive cars, if you’re determined to spend a lot of money on a watch, you have to put up with some inconvenience to do it: as well as keeping worse time, mechanical watches have to be wound. The only thing technology can’t cheapen is brand. Which is precisely why we hear ever more about it. Brand is the residue left as the substantive differences between rich and poor evaporate. But what label you have on your stuff is a much smaller matter than having it versus not having it. In 1900, if you kept a carriage, no one asked what year or brand it was. If you had one, you were rich. And if you weren’t rich, you took the omnibus or walked. Now even the poorest Americans drive cars, and it is only because we’re so well trained by advertising that we can even recognize the especially expensive ones.
The houses are made using the same construction techniques and contain much the same objects. It’s inconvenient to do something expensive and custom.
Materially and socially, technology seems to be decreasing the gap between the rich and the poor, not increasing it. If Lenin walked around the offices of a company like Yahoo or Intel or Cisco, he’d think communism had won. Everyone would be wearing the same clothes, have the same kind of office (or rather, cubicle) with the same furnishings, and address one another by their first names instead of by honorifics.
Indeed, it may even be false, in industrial democracies. In a society of serfs and warlords, certainly, variation in income is a sign of an underlying problem. But serfdom is not the only cause of variation in income. A 747 pilot doesn’t make 40 times as much as a checkout clerk because he is a warlord who somehow holds her in thrall. His skills are simply much more valuable.
I’d like to propose an alternative idea: that in a modern society, increasing variation in income is a sign of health. Technology seems to increase the variation in productivity at faster than linear rates. If we don’t see corresponding variation in income, there are three possible explanations: (a) that technical innovation has
The only option, if you’re going to have an increasingly prosperous society without increasing variation in income, seems to be (c), that people will create a lot of wealth without being paid
All the unfun kinds of wealth creation slow dramatically in a society that confiscates private fortunes. We can confirm this empirically. Suppose you hear a strange noise that you think may be due to a nearby fan. You turn the fan off, and the noise stops. You turn the fan back on, and the noise starts again. Off, quiet. On, noise. In the absence of other information, it would seem the noise is caused by the
If you suppress variations in income, whether by stealing private fortunes, as feudal rulers used to do, or by taxing them away, as some modern governments have done, the result always seems to be the same. Society as a whole ends up poorer.
You need rich people in your society not so much because in spending their money they create jobs, but because of what they have to do to get rich. I’m not talking about the trickle-down effect here. I’m not saying that if you let Henry Ford get rich, he’ll hire you as a waiter at his next party. I’m saying that he’ll make you a tractor to replace your horse.
The more convenient language that you feed to the compiler is called a high-level language. It lets you build your programs out of powerful commands, like “do something n times” instead of wimpy ones like “add two numbers.” When you get to build your programs out of bigger concepts, you don’t need to use as many of them. Written in our imaginary high-level language, our program is only a fifth as long. And if there were a mistake in it, it would be easy to see.
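As a rough illustration of building programs out of bigger concepts, the sketch below writes the same computation twice in Python. Both function names are made up for this example, and the "low-level" version only imitates the style of managing every step by hand; it is not actual machine language.

```python
def total_low_level(values):
    # "Wimpy" style: manage the counter and the running sum by hand,
    # adding two numbers at a time.
    total = 0
    i = 0
    while i < len(values):
        total = total + values[i]
        i = i + 1
    return total


def total_high_level(values):
    # "Powerful command" style: one built-in expresses the whole loop.
    return sum(values)
```

Both functions return the same result for the same input; the second simply has fewer places for a mistake to hide.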
Compilers aren’t the only way to implement high-level languages. You could also use an interpreter, which examines your program one piece at a time and executes the corresponding machine language commands, instead of translating the whole thing into machine language and running that.
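A minimal sketch of the interpreter idea, assuming a tiny made-up stack language with three instructions (push, add, mul). A real interpreter would ultimately execute machine language commands; this one stands in for that by running the corresponding Python operations, one instruction at a time.

```python
def interpret(program):
    """Run a tiny made-up stack language, one instruction at a time."""
    stack = []
    for instruction in program:
        op, *args = instruction
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


# Hypothetical program that computes (2 + 3) * 4.
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
result = interpret(program)
```

Nothing is translated ahead of time here: each instruction is examined and carried out before the next one is looked at.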
The high-level language that you feed to the compiler is also known as source code, and the machine language translation it generates is called object code. When you buy commercial software, you usually only get the object code. (Object code is so hard to read that it is effectively encrypted, thus protecting the company’s trade secrets.) But lately there is an alternative approach: open source software, where you get the source code as well, and are free to modify it if you want.
The average end user may not need the source code of their word processor, but when you really need reliability, there are solid engineering reasons for insisting on open source.
So which one do you use? Ah, well, there is a great deal of disagreement about that. Part of the problem is that if you use a language for long enough, you start to think in it. So any language that’s substantially different feels terribly awkward, even if there’s nothing intrinsically wrong with it. Inexperienced programmers’ judgements about the relative merits of programming languages are often skewed by this effect.
Just as high-level languages are more abstract than assembly language, some high-level languages are more abstract than others. For example, C is quite low-level, almost a portable assembly
Prolog, for example. It has fabulously powerful abstractions for solving about 2% of problems, and the rest of the time you’re bending over backward to misuse these abstractions to write de facto Pascal programs.
The biggest debate in language design is probably the one between those who think that a language should prevent programmers from doing stupid things, and those who think programmers should be allowed to do whatever they want.
Partisans of permissive languages ridicule the other sort as “B&D” (bondage and discipline) languages, with the rather impudent implication that those who like to program in them are bottoms. I don’t know what the other side call languages like Perl. Perhaps they are not the sort of people to make up amusing names for the opposition. The debate resolves
One of the more active questions at the moment is static versus dynamic typing. In a statically-typed language, you have to know the kind of values each variable can have at the time you write the program. With dynamic typing, you can set any variable to any value,
The disadvantage, critics would counter, is that adding things without looking at what was already there tends to produce the same results in programs that it does in buildings.
With typing you have to choose one or the other. But the object-orientedness of a language is a matter of degree. Indeed, there are two senses of object-oriented: some languages are object-oriented in the sense that they let you program in that style, and others in the sense that they force you to.
undergrads who believe they have to learn it to get a job. When I say Java won’t turn out to be a successful language, I mean something more specific: that Java will turn out to be an evolutionary dead-end, like Cobol.
Convergence is more likely for languages partly because the space of possibilities is smaller, and partly because mutations are not random. Language designers deliberately incorporate ideas from other languages.
Any programming language can be divided into two parts: some set of fundamental operators that play the role of axioms, and the rest of the language, which could in principle be written in terms of these fundamental operators.
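The split between axioms and everything else can be sketched in a few lines. The choice of primitives below (cons, car, cdr, and an empty-list marker) is an assumption borrowed from Lisp tradition for illustration, and the Python conditionals and integer addition used underneath count as further primitives.

```python
# Assumed "axioms": three primitives for building and taking apart pairs.
def cons(head, tail):
    return (head, tail)

def car(pair):
    return pair[0]

def cdr(pair):
    return pair[1]

EMPTY = None  # marker for the empty list


# "The rest of the language", written in terms of the primitives above.
def length(lst):
    return 0 if lst is EMPTY else 1 + length(cdr(lst))

def append(a, b):
    return b if a is EMPTY else cons(car(a), append(cdr(a), b))

def to_python(lst):
    # Display helper only: converts a cons list to a Python list.
    out = []
    while lst is not EMPTY:
        out.append(car(lst))
        lst = cdr(lst)
    return out
```

For example, appending cons(1, cons(2, EMPTY)) and cons(3, EMPTY) gives a three-element list, without length or append needing any access to how pairs are stored.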
I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores. The more of a language you can write in itself, the better.
There hasn’t been a lot of progress in that department so far. My guess is that a hundred years from now people will still tell computers what to do using programs we would recognize as such. There may be tasks that we solve now by writing programs and that in a hundred years you won’t have to write programs to solve, but I think there will still be a good deal of programming of the type we do today.
Languages evolve slowly because they’re not really technologies. Languages are notation. A program is a formal description of the problem you want a computer to solve for you. So the rate of evolution in programming languages is more like the rate of evolution in mathematical notation than, say, transportation or communications. Mathematical notation does evolve, but not with the giant leaps you see in technology.
That’s kind of hard to imagine. And indeed, the most likely prediction in the speed department may be that Moore’s Law will stop working. Anything that’s supposed to double every eighteen months seems likely to run up against some kind of fundamental limit eventually. But I have no trouble believing that computers will be very much faster. Even if they only end up being a paltry million times faster, that should change the ground rules for programming languages substantially. Among other things, there will be more room for what would now be considered slow languages, meaning languages that don’t yield very efficient code.
And there is another class of problems that inherently have an unlimited capacity to soak up cycles: image rendering, cryptography, simulations. If some applications can be increasingly inefficient while others continue to demand all the speed the hardware can deliver, faster computers will mean that languages have to cover an ever wider range of efficiencies.
People thirty years ago would be astonished at how casually we make long distance phone calls.
I can already tell you what’s going to happen to all those extra cycles that faster hardware is going to give us in the next hundred years. They’re nearly all going to be wasted.
There’s good waste, and bad waste. I’m interested in good waste—the kind where, by spending more, we can get simpler designs. How will we take advantage of the opportunities to waste cycles that we’ll get from new, faster hardware? The desire for speed is so deeply ingrained in us, with our puny computers, that it will take a conscious effort to overcome it. In language design, we should be consciously seeking out situations where we can trade efficiency for even the smallest increase in convenience.
Semantically, strings are more or less a subset of lists in which the elements are characters. So why do you need a separate data type? You don’t, really. Strings only exist for efficiency. But it’s lame to clutter up the semantics of a language with hacks to make programs run faster. Having strings in a language seems to be a case of premature optimization.
If we think of the core of a language as a set of axioms, surely it’s gross to have additional axioms that add no expressive power, simply for the sake of efficiency. Efficiency is important, but I don’t think that’s the right way to get it. The right way to solve that problem is to separate the meaning of a program from the implementation details. Instead of having both lists and strings, have just lists, with some way to give the compiler optimization advice that will allow it to lay out strings as contiguous bytes if necessary.
The word “essay” comes from the French verb “essayer,” which means “to try.” An essay, in the original sense, is something you write to try to figure something out. This happens in software too. I think some of the best programs were essays, in the sense that the authors didn’t know when they started exactly what they were trying to write.
Inefficient software isn’t gross. What’s gross is a language that makes programmers do needless work. Wasting programmer time is the true inefficiency, not wasting machine time. This will become ever more clear as computers get faster. I think getting rid of strings is already something we could bear to think about. We did it in Arc, and it seems to be a win; some operations that would be awkward to describe as regular expressions can be described easily as recursive functions.
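One way to picture the strings-as-lists idea is a recursive function over a list of characters. The sketch below is Python, not Arc, and the function is a made-up example: it checks for balanced parentheses, a standard case of something classical regular expressions cannot express.

```python
def balanced(chars, depth=0):
    """Return True if the parentheses in a list of characters are balanced."""
    if not chars:
        return depth == 0
    first, rest = chars[0], chars[1:]
    if first == "(":
        return balanced(rest, depth + 1)
    if first == ")":
        # A closing parenthesis needs an earlier unmatched opening one.
        return depth > 0 and balanced(rest, depth - 1)
    return balanced(rest, depth)
```

Calling balanced(list("(a(b)c)")) treats the text purely as a list of characters; no string-specific operation is involved.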
Could we get rid of arrays, for example? After all, they’re just a subset of hash tables where the keys are vectors of integers. Will we replace hash tables themselves with lists?
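A small sketch of both reductions, using Python's dict as the hash table and tuples as the "vectors of integers". The names grid, nested, alist, and lookup are made up for this illustration.

```python
# A 2 x 3 "array" stored as a hash table keyed by (row, column) tuples.
grid = {(r, c): r * 3 + c for r in range(2) for c in range(3)}

# The same data in a conventional nested-array layout, for comparison.
nested = [[0, 1, 2], [3, 4, 5]]

# Going one step further: the hash table itself as a plain list of
# (key, value) pairs, searched linearly.
alist = list(grid.items())

def lookup(pairs, key):
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)
```

All three forms answer the same question, such as "what is at row 1, column 2?", and differ only in how much work the lookup costs.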
Logically, you don’t need to have a separate notion of numbers, because you can represent them as lists: the integer n could be represented as a list of n elements. You can do math this way. It’s just unbearably inefficient.
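A minimal sketch of arithmetic on numbers represented as lists, assuming the element values are irrelevant and only the length of the list matters. The function names are made up for this example.

```python
def number(n):
    # The integer n as a list of n (arbitrary) elements.
    return [None] * n

def add(a, b):
    # Addition is list concatenation.
    return a + b

def multiply(a, b):
    # Multiplication is repeated addition: one copy of a per element of b.
    result = []
    for _ in b:
        result = add(result, a)
    return result

def value(lst):
    # Read the number back by counting elements.
    return len(lst)
```

Here value(multiply(number(4), number(3))) counts a twelve-element list, which shows both that the math works and why it is so inefficient: the cost grows with the size of the numbers themselves.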
layers of software between the application and the hardware. This too is a trend we see happening already: many recent languages are compiled into byte code. Bill Woods once told me that, as a rule of thumb, each layer of interpretation costs a factor of ten in speed. This extra cost buys you flexibility.
The more of your application you can push down into a language for writing that type of application, the more of your software will be reusable. Somehow the idea of reusability got attached to object-oriented programming in the 1980s, and no amount of evidence to the contrary seems to be able to shake it free. But although some object-oriented software is reusable, what makes it reusable is its bottom-upness,
Consider libraries: they’re reusable because they’re language, whether they’re written in an object-oriented style or not. I don’t predict the demise of object-oriented programming, by the way. Though I don’t think it has much to offer good programmers, except in certain specialized domains, it is irresistible to large organizations. Object-oriented programming offers a sustainable way to write spaghetti code. It lets you accrete programs as a series of patches. Large organizations always tend to develop software this way, and I expect this to be as true in a hundred years as it is today. As long as we’re talking about the future, we had better talk about parallel computation, because that’s where this idea seems to live. At any given time, it always seems to be something that’s going to happen in the future.
in parallel. And this will, like asking for specific implementations of data structures, be something that you do fairly late in the life of a program, when you try to optimize it. Version 1s will ordinarily ignore any advantages to be got from parallel computation, just as they will ignore advantages to be got from specific representations of data.
It’s not true that those who can’t do, teach (some of the best hackers I know are professors), but it is true that there are a lot of things that those who teach can’t do. Research imposes constraining caste restrictions. In any academic field, there are topics that are ok to work on and others that aren’t. Unfortunately the distinction between acceptable and forbidden topics is usually based on how intellectual the work sounds when described in research papers, rather than how important it is for getting good results. The extreme case is probably literature; people studying literature rarely say anything that would be of the slightest use to those producing it. Though
The trend is not merely toward languages being developed as open source projects rather than “research,” but toward languages being designed by the application programmers who need to use them, rather than by compiler writers. This seems a good trend and I expect it to continue.
You’d think it would be obvious to creatures as lazy as us how to express a program with the least effort. In fact, our ideas about what’s possible tend to be so limited by whatever language we think in that easier formulations of programs seem very surprising. They’re something you have to
Not the length in characters, of course, but the length in distinct syntactic elements—basically, the size of the parse tree. It may not be quite true that the shortest program is the least work to write, but it’s close enough that you’re better off aiming for the solid target of brevity than the fuzzy, nearby one of least work. Then the algorithm for language design becomes: look at a program and ask, is there a shorter way to write this?
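As a rough way to measure length in syntactic elements rather than characters, the sketch below counts nodes using Python's standard ast module. This is an approximation under stated assumptions: ast produces an abstract syntax tree rather than a literal parse tree, and the three sample programs are invented for the comparison.

```python
import ast

def program_size(source):
    """Count the nodes in the syntax tree of a piece of Python source."""
    return sum(1 for _ in ast.walk(ast.parse(source)))

# The same loop with short and with long variable names.
short_names = "t = 0\nfor x in xs:\n    t = t + x\n"
long_names = (
    "running_total = 0\n"
    "for element in sequence:\n"
    "    running_total = running_total + element\n"
)

# A shorter way to write the same computation.
builtin = "t = sum(xs)\n"
```

By this measure the two loops are the same size despite their different character counts, while the version using sum is smaller, which is the sense of "shorter" the passage is after.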
assume infrastructure that didn’t exist in 1960. For example, a language in which indentation is significant, like Python, would not work very well on printer terminals. But putting such problems aside—assuming, for example, that programs were all just written on paper—would programmers of the 1960s have liked writing programs in the languages we use now? I think so. Some of the less imaginative ones, who had artifacts of early languages built into their ideas of what a program was, might have had trouble. (How can you manipulate data without doing pointer arithmetic? How can you implement flowcharts without gotos?) But I think the smartest programmers would have had no trouble making the most of present-day languages, if they’d had them.
Eric Raymond has written an essay called “How to Become a Hacker,” and in it, among other things, he tells would-be hackers what languages they should learn. He suggests starting with Python and Java, because they are easy to learn. The serious hacker will also want to learn C, in order to hack Unix, and Perl for system administration and CGI scripts. Finally, the truly serious hacker should consider learning Lisp:
Why not? Programming languages are just tools, after all. If Lisp really does yield better programs, you should use it. And if it doesn’t, then who needs
This is especially true in a startup. In a big company, you can do what all the other big companies are doing. But a startup can’t do what all the other startups do. I don’t think a lot of people realize this, even in startups.
We didn’t know anything about marketing, or hiring people, or raising money, or getting customers. Neither of us had ever even had what you would call a real job. The only thing we were good at was writing software. We hoped that would save us. Any advantage we could get in the software department, we would take. So you could say that using Lisp was an experiment. Our hypothesis was that if we wrote our software in Lisp, we’d be able to get features done faster than our competitors, and also to do things in our software that they couldn’t do. And because Lisp was so high-level, we wouldn’t need a big development team, so our costs would be lower. If this were so, we could offer a better product for less money, and still make a profit.
And the reason everyone doesn’t use it is that programming languages are not merely technologies, but habits of mind as well, and nothing changes slower. Of course, both these answers need explaining. I’ll begin with a shockingly controversial statement: programming languages vary in power.
What’s less often understood is that there is a more general principle here: that if you have a choice of several languages, it is, all other things being equal, a mistake to program in anything but the most powerful one.
When we switch to the point of view of a programmer using any of the languages higher up the power continuum, however, we find that he in turn looks down upon Blub. How can you get anything done in Blub? It doesn’t even have y. By induction, the only programmers in a position to see all the differences in power between the various languages are those who understand the most powerful one. (This is probably what Eric Raymond meant about Lisp making you a better programmer.) You
What I will say is that I think Lisp is at the top. And to support this claim I’ll tell you about one of the things I find missing when I look at the other four languages. How can you get anything done in them, I think, without macros?
You write programs in the parse trees that get generated within the compiler when other languages are parsed. But these parse trees are fully accessible to your programs. You can write programs that manipulate them. In Lisp, these programs are called macros. They are programs that write programs.
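Python has no Lisp-style macros, but its standard `ast` module does expose the parse tree, which is enough to sketch the idea of a program that rewrites another program before it runs. Everything named here (the `InlineSquare` class, the `square` call it expands) is a made-up example for this note, and a real Lisp macro is far more tightly integrated with the language than this.

```python
import ast
import copy


class InlineSquare(ast.NodeTransformer):
    """Rewrite every call square(e) into the expression e * e.

    Hypothetical example: `square` is never defined as a function;
    it exists only as a pattern this transformer expands.
    """

    def visit_Call(self, node):
        self.generic_visit(node)  # expand nested calls first
        if (isinstance(node.func, ast.Name) and node.func.id == "square"
                and len(node.args) == 1 and not node.keywords):
            arg = node.args[0]
            # Note: the argument expression is duplicated, so any side
            # effects in it would run twice (a classic macro pitfall).
            return ast.BinOp(left=arg, op=ast.Mult(), right=copy.deepcopy(arg))
        return node


source = "result = square(3) + square(4)"
tree = ast.parse(source)                  # program as data
tree = InlineSquare().visit(tree)         # program that rewrites the program
tree = ast.fix_missing_locations(tree)    # fill in line numbers for new nodes

namespace = {}
exec(compile(tree, "<generated>", "exec"), namespace)
print(namespace["result"])
```

The rewritten tree runs even though no `square` function exists anywhere, because the call was replaced at the tree level before compilation.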
suspicious person might begin to wonder if there was some correlation here. A big chunk of our code was doing things that are hard to do in other languages. The resulting software did things our competitors’ software couldn’t do. Maybe there was some kind of connection. I encourage you to follow that thread. There may be more to that old man hobbling along on his crutches than meets the eye.
Ordinarily technology changes fast. But programming languages are different: programming languages are not just technology, but what programmers think in. They’re half technology and half religion. And so the median language, meaning whatever language the median programmer uses, moves as slow as an iceberg. Garbage collection, introduced by Lisp in about 1960, is now widely considered to be a good thing. Dynamic typing, ditto, is growing in popularity. Lexical closures, introduced by Lisp in the early 1960s, are now, just barely, on the radar screen. Macros, introduced by Lisp in the mid 1960s, are still terra incognita. Obviously, the median language has enormous momentum.
The safest kind were the ones that wanted Oracle experience. You never had to worry about those. You were also safe if they said they wanted C++ or Java developers. If they wanted Perl or Python programmers, that would be a bit frightening—that’s starting to sound like a company where the technical side, at least, is run by real hackers. If I had ever seen a job posting looking for Lisp hackers, I would have been really worried.
The pointy-haired boss miraculously combines two qualities that are common by themselves, but rarely seen together: (a) he knows nothing whatsoever about technology, and (b) he has very strong opinions about
Java was designed to fix some problems with C++. So there you have it: languages are not all equivalent. If you follow the trail through the pointy-haired boss’s brain to Java and then back through Java’s history to its origins, you end up holding an idea that contradicts the assumption you started with.
But if languages vary, he suddenly has to solve two simultaneous equations, trying to find an optimal balance between two things he knows nothing about: the relative suitability of the twenty or so leading languages for the problem he needs to solve, and the odds of finding programmers, libraries, etc. for each. If that’s what’s on the other side of the door, it is no surprise that the pointy-haired boss doesn’t want to open
Expressing the language in its own data structures turns out to be a very powerful feature. Ideas 8 and 9 together mean that you can write programs that write programs. That may sound like a bizarre idea, but it’s an everyday thing in Lisp. The most common way to do it is with something called a
I can think of three problems that could arise from using less common languages. Your programs might not work well with programs written in other languages. You might have fewer libraries at your disposal. And you might have trouble hiring programmers.
This is why we even hear about new languages like Perl and Python. We’re not hearing about these languages because people are using them to write Windows apps, but because people are using them on servers. And as software shifts off the desktop and onto servers (a future even Microsoft seems resigned to), there will be less and less pressure to use middle-of-the-road technologies.
libraries, their importance also depends on the application. For less demanding problems, the availability of libraries can outweigh the intrinsic power of the language. Where is the breakeven point? Hard to say exactly, but wherever it is, it is short of anything you’d be likely to call an application. If a company considers itself to be in the software business, and they’re writing an application that will be one of their products, then it will probably involve several hackers and take at least six months to write. In a project of that size, powerful languages probably start to outweigh the convenience of pre-existing libraries.
How many hackers do you need to hire, after all? Surely by now we all know that software is best developed by teams of less than ten people.
fact, choosing a more powerful language probably decreases the size of the team you need, because (a) if you use a more powerful language, you probably won’t need as many hackers, and (b) hackers who work in more advanced languages are likely to be smarter.
The most convenient measure of power is probably code size. The point of high-level languages is to give you bigger abstractions—bigger bricks, as it were, so you don’t need as many to build a wall of a given size. So the more powerful the language, the shorter the program (not simply in characters, of course, but in distinct elements).
the base language, you build on top of the base language a language for writing programs like yours, then write your program in it. The combined code can be much shorter than if you had written your whole program in the base language—indeed, this is how most compression algorithms work.
Fred Brooks described this phenomenon in his famous book The Mythical Man-Month, and everything I’ve seen has tended to confirm what he said.
Within large organizations, the phrase used to describe this approach is “industry best practice.” Its purpose is to shield the pointy-haired boss from responsibility: if he chooses something that is “industry best practice,” and the company loses, he can’t be blamed. He didn’t choose, the industry did. I believe this
Number 1, languages vary in power. Number 2, most managers deliberately ignore this. Between them, these two facts are literally a recipe for making money. ITA is an example of this recipe in action. If you want to win in a software business, just take on the hardest problem you can find, use the most powerful language you can get, and wait for your competitors’ pointy-haired bosses to revert to the mean.
It may be that the majority of programmers can’t tell a good language from a bad one. But that’s no different with any other tool. It doesn’t mean that it’s a waste of time to try designing a good language. Expert hackers can tell a good language when they see one, and they’ll use it. Expert hackers are a tiny minority, admittedly, but that tiny minority write all the good software, and their influence is such that the rest of the programmers will tend to use whatever language they use. Often, indeed, it is not merely influence but command: often the expert hackers are the very people who, as their bosses or faculty advisors, tell the other programmers what language to use.
So whether or not a language has to be good to be popular, I think a language has to be popular to be good. And it has to stay popular to stay good. The state of the art in programming languages doesn’t stand still. Though there is little change in the depths of the sea, in core language features, there is quite a lot up on the surface, in things like libraries and environments.
language also needs to have a book about it. The book should be thin, well-written, and full of good examples. Kernighan and Ritchie’s C Programming Language is the ideal here. At the moment I’d almost say that a language has to have a book published by O’Reilly. That’s becoming the test of mattering to hackers.
All other things being equal, no one wants to begin a program with a bunch of declarations. Anything that can be implicit, should be. The amount of boilerplate in a Java hello-world program is almost enough evidence, by itself, to convict.
I think language designers would do better to consider their target user to be a genius who will need to do things they never anticipated, rather than a bumbler who needs to be protected from himself. The bumbler will shoot himself in the foot anyway. You may save him from referring to variables in another module, but you can’t save him from writing a badly designed program to solve the wrong problem, and taking forever to do it.
To be attractive to hackers, a language must be good for writing the kinds of programs they want to write. And that means, perhaps surprisingly, that it has to be good for writing throwaway programs.
The project either gets bogged down, or the result is sterile and wooden: a shopping mall rather than a real downtown, Brasilia rather than Rome,
I think a lot of the advances that happen in programming languages in the next fifty years will have to do with library functions. I think future programming languages will have libraries that are as carefully designed as the core language.
or whatever, so much as about how to design great libraries.
But in practice I don’t think fast code comes primarily from things you do in the design of the language. As Knuth pointed out long ago, speed only matters in certain critical bottlenecks. And as many programmers have observed since, one is often mistaken about where these bottlenecks are.
good profiler, rather than by, say, making the language statically typed. You don’t need to know the type of every argument in every call in the program. You do need to be able to declare the types of arguments in the bottlenecks. And even more, you need to be able to find out where the bottlenecks are.
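To make "find out where the bottlenecks are" concrete, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules. The functions being profiled (`slow_sum_of_squares`, `fast_constant`, `main`) are invented for this note; only the profiler calls are real library API.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately heavy loop standing in for a real bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant():
    """Cheap function that should barely register in the profile."""
    return 42


def main():
    slow_sum_of_squares(200_000)
    for _ in range(100):
        fast_constant()


# Run main() under the profiler and collect timing per function.
profiler = cProfile.Profile()
profiler.runcall(main)

# Print the five entries with the largest cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists each function with its call count and time, so the expensive loop shows up without anyone having to guess in advance where the time goes.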
And in any case I think good profiling would go a long way toward fixing the problem: you’d soon learn what was expensive.
An active profiler could show graphically what’s happening in memory as a program’s running, or even make sounds that tell what’s happening.
It might be a good idea to make the byte code an official part of the language, and to allow programmers to use inline byte code in bottlenecks. Then such optimizations would be portable too.
The language can help here too. Good support for threads will enable all the users to share a single heap. It may also help to have persistent objects and/or language-level support for lazy loading.
knows that people sometimes ask for things they turn out not to want. To avoid wasting his time, he waits till the third or fourth time he’s asked to do something. By then whoever’s asking him may be fairly annoyed, but at least they probably really do want whatever they’re asking for.
“The best writing is rewriting,” wrote E. B. White. Every good writer knows this, and it’s true for software too. The most important part of design is redesign. Programming languages, especially, don’t get redesigned enough.
The trick is to realize that there’s no real contradiction here. You want to be optimistic and skeptical about two different things. You have to be optimistic about the possibility of solving the problem, but skeptical about the value of whatever solution you’ve got so far.
Everyone knows it’s not a good idea to have a language designed by a committee. Committees yield bad design. But I think the worst danger of committees is that they interfere with redesign. It’s so much work to introduce changes that no one wants to bother. Whatever a committee decides tends to stay that way, even if most of the members don’t like
Design doesn’t have to be new, but it has to be good. Research doesn’t have to be good, but it has to be new. I think these two paths converge at the top: the best design surpasses its predecessors by using new ideas, and the best research solves problems that are not only new, but worth solving.
The biggest difference is that you focus more on the user. Design begins by asking, who is this for and what do they need from it? A good architect, for example, does not begin by creating a design that he then imposes on the users, but by studying the intended users and figuring out what they need.
The point is, you have to pick some group of users. I don’t think you can even talk about good or bad design except with reference to some intended user.
You’re most likely to get good design if the intended users include the designer himself. When you design something for a group that doesn’t include you, it tends to be for people you consider less sophisticated than you, not more sophisticated.