Highlights: Hackers & Painters, by Paul Graham


Raw Generation of Kindle Notes, up for Summary + Review

Location 1419-1421

If a fairly good hacker is worth $80,000 a year at a big company, then a smart hacker working very hard without any corporate bullshit to slow him down should be able to do work worth about $3 million a year. Like all back-of-the-envelope calculations, this

Location 1424-1429

They don’t change the laws of wealth creation. They just represent a point at the far end of the curve. There is a conservation law at work here: if you want to make a million dollars, you have to endure a million dollars’ worth of pain. For example, one way to make a million dollars would be to work for the Post Office your whole life, and save every penny of your salary. Imagine the stress of working for the Post Office for fifty years. In a startup you compress all this stress into three or four years. You do tend to get a certain bulk discount if you buy the economy-size pain, but you can’t evade the fundamental conservation law. If starting a startup were easy, everyone would do it.

Location 1504-1507

This is why so many of the best programmers are libertarians. In our world, you sink or swim, and there are no excuses. When those far removed from the creation of wealth—undergraduates, reporters, politicians—hear that the richest 5% of the people have half the total wealth, they tend to think injustice! An experienced programmer would be more likely to think is that all? The top 5% of programmers probably write 99% of the good software.

Location 1512-1516

In industrialized countries, people belong to one institution or another at least until their twenties. After all those years you get used to the idea of belonging to a group of people who all get up in the morning, go to some set of buildings, and do things that they do not, ordinarily, enjoy doing. Belonging to such a group becomes part of your identity: name, age, role, institution. If you have to introduce yourself, or someone else describes you, it will be as something like, John Smith, age 10, a student at such and such elementary school, or John Smith, age 20, a student at such and such college.

Location 1526-1528

If wealth means what people want, companies that move things also create wealth. Ditto for many other kinds of companies that don’t make anything physical. Nearly all companies exist to do something people want.

Location 1528-1532

But here there is another layer that tends to obscure the underlying reality. In a company, the work you do is averaged together with a lot of other people’s. You may not even be aware you’re doing something people want. Your contribution may be indirect. But the company as a whole must be giving people something they want, or they won’t make any money. And if they are paying you x dollars a year, then on average you must be contributing at least x dollars a year worth of work, or the company will be spending more than it makes, and will go out of business.

Location 1534-1535

A more direct way to put it would be: you need to start doing something people want. You don’t need to join a company to do that. All a company is is a group of people working together to do something people want. It’s doing something people want that matters, not joining the group.

Location 1545-1548

Companies are not set up to reward people who want to do this. You can’t go to your boss and say, I’d like to start working ten times as hard, so will you please pay me ten times as much? For one thing, the official fiction is that you are already working as hard as you can. But a more serious problem is that the company has no way of measuring the value of your work.

Location 1565-1567

To get rich you need to get yourself in a situation with two things, measurement and leverage. You need to be in a position where your performance can be measured, or there is no way to get paid more by doing more. And you have to have leverage, in the sense that the decisions you make have a big effect.

Location 1570-1572

An example of a job with both measurement and leverage would be lead actor in a movie. Your performance can be measured in the gross of the movie. And you have leverage in the sense that your performance can make or break it. CEOs also have both measurement and

Location 1580-1581

If you can’t measure the value of the work done by individual employees, you can get close. You can measure the value of the work done by small groups.

Location 1598-1601

That’s the real point of startups. Ideally, you are getting together with a group of other people who also want to work a lot harder, and get paid a lot more, than they would in a big company. And because startups tend to get founded by self-selecting groups of ambitious people who already know one another (at least by reputation), the level of measurement is more precise than you get from smallness alone. A startup is not merely ten people, but ten people like you. Steve Jobs

Location 1609-1610

Startups offer anyone a way to be in a situation with measurement and leverage. They allow measurement because they’re small, and they offer leverage because they make money by inventing new technology.

Location 1610-1614

What is technology? It’s technique. It’s the way we all do things. And when you discover a new way to do things, its value is multiplied by all the people who use it. It is the proverbial fishing rod, rather than the fish. That’s the difference between a startup and a restaurant or a barber shop. You fry eggs or cut hair one customer at a time. Whereas if you solve a technical problem that a lot of people care about, you help everyone who uses your solution. That’s leverage.

Location 1620-1624

Big companies can develop technology. They just can’t do it quickly. Their size makes them slow and prevents them from rewarding employees for the extraordinary effort required. So in practice big companies only get to develop technology in fields where large capital requirements prevent startups from competing with them, like microprocessors, power plants, or passenger aircraft. And even in those fields they depend heavily on startups for components and ideas. It’s obvious

Location 1625-1628

McDonald’s, for example, grew big by designing a system, the McDonald’s franchise, that could then be reproduced at will all over the face of the earth. A McDonald’s franchise is controlled by rules so precise that it is practically a piece of software. Write once, run everywhere. Ditto for Wal-Mart. Sam Walton got rich not by being a retailer, but by designing a new kind of store.

Location 1632-1636

What this meant in practice was that we deliberately sought hard problems. If there were two features we could add to our software, both equally valuable in proportion to their difficulty, we’d always take the harder one. Not just because it was more valuable, but because it was harder. We delighted in forcing bigger, slower competitors to follow us over difficult ground. Like guerillas, startups prefer the difficult terrain of the mountains, where the troops of the central government can’t follow. I can remember times when we were just exhausted after wrestling all day with some horrible technical problem. And I’d be delighted, because something that was hard

Location 1658-1662

There is, as I said before, a large random multiplier in the success of any company. So in practice the deal is not that you’re 30 times as productive and get paid 30 times as much. It is that you’re 30 times as productive, and get paid between zero and a thousand times as much. If the mean is 30x, the median is probably zero. Most startups tank, and not just the dog food portals we all heard about during the Internet Bubble. It’s common for a startup to be developing a genuinely good product, take slightly too long to do it, run out of money, and have to shut down.
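To see how a mean of 30x can sit next to a median of zero, here is a minimal sketch with made-up numbers. The 97/3 split and the 1000x payoff are illustrative assumptions, not figures from the book.

```python
from statistics import mean, median

# Hypothetical payoff multiples for 100 startups:
# 97 return nothing, 3 return 1000x (assumed numbers, for illustration only).
payoffs = [0] * 97 + [1000] * 3

average_payoff = mean(payoffs)    # pulled up by the three big winners
typical_payoff = median(payoffs)  # what the founder in the middle of the pack gets

print("mean:", average_payoff)
print("median:", typical_payoff)
```

The few large outcomes carry the whole average, which is why the typical founder can end up with nothing even when the expected value looks attractive.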

Location 1664-1668

Startups, like mosquitos, tend to be an all-or-nothing proposition. And you don’t generally know which of the two you’re going to get till the last minute. Viaweb came close to tanking several times. Our trajectory was like a sine wave. Fortunately we got bought at the top of the cycle, but it was damned close. While we were visiting Yahoo in California to talk about selling the company to them, we had to borrow a conference room to reassure an investor who was about to back out of a new round of funding that we needed to stay alive.

Location 1672-1675

The closest you can get is by selling your startup in the early stages, giving up upside (and risk) for a smaller but guaranteed payoff. We had a chance to do this, and stupidly, as we then thought, let it slip by. After that we became comically eager to sell. For the next year or so, if anyone expressed the slightest curiosity about Viaweb we would try to sell them the company. But there were no takers, so we had to keep going.

Location 1680-1682

I think it’s a good idea to get bought, if you can. Running a business is different from growing one. It is just as well to let a big company take over once you reach cruising altitude. It’s also financially wiser, because selling allows you to diversify. What would you think of a financial advisor who put all his client’s assets into one volatile stock?

Location 1683-1684

How do you get bought? Mostly by doing the same things you’d do if you didn’t intend to sell the company. Being profitable, for example. But getting bought is also an art in its own right, and one that we spent a lot of time trying to master.

Location 1688-1690

In both cases, what it all comes down to is users. You’d think that a company about to buy you would do a lot of research and decide for themselves how valuable your technology was. Not at all. What they go by is the number of users you have.

Location 1690-1692

In effect, acquirers assume the customers know who has the best technology. And this is not as stupid as it sounds. Users are the only real proof that you’ve created wealth. Wealth is what people want, and if people aren’t using your software, maybe it’s not just because you’re bad at marketing. Maybe

Location 1693-1695

Venture capitalists have a list of danger signs to watch out for. Near the top is the company run by techno-weenies who are obsessed with solving interesting technical problems, instead of making users happy. In a startup, you’re not just trying to solve problems. You’re trying to solve problems that users care about.

Location 1698-1700

Number of users may not be the perfect test, but it will be very close. It’s what acquirers care about. It’s what revenues depend on. It’s what makes competitors unhappy. It’s what impresses reporters, and potential new users. Certainly it’s a better test than your a priori notions of what problems are important to solve, no matter how technically adept you are.

Location 1702-1703

rightly—taking a long time to develop a product. Now we can recognize this as something hackers already know to avoid: premature optimization. Get a version 1.0 out there as soon as you can. Until you have some users to measure, you’re optimizing based on guesses.

Location 1722-1727

Remember what a startup is, economically: a way of saying, I want to work faster. Instead of accumulating money slowly by being paid a regular wage for fifty years, I want to get it over with as soon as possible. So governments that forbid you to accumulate wealth are in effect decreeing that you work slowly. They’re willing to let you earn $3 million over fifty years, but they’re not willing to let you work so hard that you can do it in two. They are like the corporate boss that you can’t go to and say, I want to work ten times as hard, so please pay me ten times as much. Except this is not a boss you can escape by starting your own company.

Location 1730-1731

Without the incentive of wealth, no one wants to do it. Engineers will work on sexy projects like fighter planes and moon rockets for ordinary salaries, but more mundane technologies like light bulbs or semiconductors have to be developed by entrepreneurs.

Location 1749-1750

No one complains when a few people surpass all the rest at playing chess or writing novels, but when a few people make more money than the rest, we get editorials saying this is wrong.

Location 1751-1754

I think there are three reasons we treat making money as different: the misleading model of wealth we learn as children; the disreputable way in which, till recently, most fortunes were accumulated; and the worry that great variations in income are somehow bad for society. As far as I can tell, the first is mistaken, the second outdated, and the third empirically false. Could it be that, in a modern democracy, variation in income is actually a sign of health?

Location 1769-1772

Because kids are unable to create wealth, whatever they have has to be given to them. And when wealth is something you’re given, then of course it seems that it should be distributed equally. As in most families it is. The kids see to that. “Unfair,” they cry, when one sibling gets more than another.

Location 1776-1778

You get paid by doing or making something people want, and those who make more money are often simply better at doing what people want. Top actors make a lot more money than B-list actors. The B-list actors might be almost as charismatic, but when people go to the theater and look at the list of movies playing, they want that extra oomph that the big stars have.

Location 1779-1781

Such tricks account for some variation in wealth, and indeed for some of the biggest individual fortunes, but they are not the root cause of variation in income. The root cause of variation in income, as Occam’s Razor implies, is the same as the root cause of variation in every other human skill.

Location 1783-1787

and baseball players 72 times as much. Editorials quote this kind of statistic with horror. But I have no trouble imagining that one person could be 100 times as productive as another. In ancient Rome the price of slaves varied by a factor of 50 depending on their skills. And that’s without considering motivation, or the extra leverage in productivity that you can get from modern technology.

Location 1789-1790

How much someone’s work is worth is not a policy question. It’s something the market already determines.

Location 1794-1795

And regardless of the case with CEOs, it’s hard to see how anyone could argue that the salaries of professional basketball players don’t reflect supply and demand.

Location 1801-1803

saying? In a free market, prices are determined by what buyers want. People like baseball more than poetry, so baseball players make more than poets. To say that a certain kind of work is underpaid is thus identical with saying that people want the wrong things.

Location 1805-1813

Then you’re saying that it’s unjust that people want the wrong things. It’s lamentable that people prefer reality TV and corndogs to Shakespeare and steamed vegetables, but unjust? That seems like saying that blue is heavy, or that up is circular. The appearance of the word “unjust” here is the unmistakable spectral signature of the Daddy Model. Why else would this idea occur in this odd context? Whereas if the speaker were still operating on the Daddy Model, and saw wealth as something that flowed from a common source and had to be shared out, rather than something generated by doing what other people wanted, this is exactly what you’d get on noticing that some people made much more than others. When we talk about “unequal distribution of income,” we should also ask, where does that income come from? Who made the wealth it represents? Because to the extent that income varies simply according to how much wealth people create, the

Location 1854-1856

Will technology increase the gap between rich and poor? It will certainly increase the gap between the productive and the unproductive. That’s the whole point of technology. With a tractor an energetic farmer could plow six times as much land in a day as he could with a team of horses. But only if he mastered a new kind of farming.

Location 1874-1880

Indeed, as with expensive cars, if you’re determined to spend a lot of money on a watch, you have to put up with some inconvenience to do it: as well as keeping worse time, mechanical watches have to be wound. The only thing technology can’t cheapen is brand. Which is precisely why we hear ever more about it. Brand is the residue left as the substantive differences between rich and poor evaporate. But what label you have on your stuff is a much smaller matter than having it versus not having it. In 1900, if you kept a carriage, no one asked what year or brand it was. If you had one, you were rich. And if you weren’t rich, you took the omnibus or walked. Now even the poorest Americans drive cars, and it is only because we’re so well trained by advertising that we can even recognize the especially expensive ones.

Location 1886-1887

The houses are made using the same construction techniques and contain much the same objects. It’s inconvenient to do something expensive and custom.

Location 1895-1898

Materially and socially, technology seems to be decreasing the gap between the rich and the poor, not increasing it. If Lenin walked around the offices of a company like Yahoo or Intel or Cisco, he’d think communism had won. Everyone would be wearing the same clothes, have the same kind of office (or rather, cubicle) with the same furnishings, and address one another by their first names instead of by honorifics.

Location 1903-1905

Indeed, it may even be false, in industrial democracies. In a society of serfs and warlords, certainly, variation in income is a sign of an underlying problem. But serfdom is not the only cause of variation in income. A 747 pilot doesn’t make 40 times as much as a checkout clerk because he is a warlord who somehow holds her in thrall. His skills are simply much more valuable.

Location 1906-1908

propose an alternative idea: that in a modern society, increasing variation in income is a sign of health. Technology seems to increase the variation in productivity at faster than linear rates. If we don’t see corresponding variation in income, there are three possible explanations: (a) that technical innovation has

Location 1911-1912

The only option, if you’re going to have an increasingly prosperous society without increasing variation in income, seems to be (c), that people will create a lot of wealth without being paid

Location 1917-1919

All the unfun kinds of wealth creation slow dramatically in a society that confiscates private fortunes. We can confirm this empirically. Suppose you hear a strange noise that you think may be due to a nearby fan. You turn the fan off, and the noise stops. You turn the fan back on, and the noise starts again. Off, quiet. On, noise. In the absence of other information, it would seem the noise is caused by the

Location 1926-1928

you suppress variations in income, whether by stealing private fortunes, as feudal rulers used to do, or by taxing them away, as some modern governments have done, the result always seems to be the same. Society as a whole ends up poorer.

Location 1931-1934

You need rich people in your society not so much because in spending their money they create jobs, but because of what they have to do to get rich. I’m not talking about the trickle-down effect here. I’m not saying that if you let Henry Ford get rich, he’ll hire you as a waiter at his next party. I’m saying that he’ll make you a tractor to replace your horse.

Location 2314-2317

The more convenient language that you feed to the compiler is called a high-level language. It lets you build your programs out of powerful commands, like “do something n times” instead of wimpy ones like “add two numbers.” When you get to build your programs out of bigger concepts, you don’t need to use as many of them. Written in our imaginary high-level language, our program is only a fifth as long. And if there were a mistake in it, it would be easy to see.
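A small sketch of the contrast, using Python for both halves. The first function is deliberately written in small steps to imitate the "add two numbers" style; the second leans on bigger built-in concepts. The task (summing 1 through n) and the function names are illustrative choices, not the book's example.

```python
def total_step_by_step(n):
    # Imitates a low-level style: one small operation per line.
    total = 0
    i = 1
    while i <= n:
        total = total + i   # "add two numbers"
        i = i + 1
    return total


def total_high_level(n):
    # Built from bigger concepts: "the sum of this range".
    return sum(range(1, n + 1))


print(total_step_by_step(10), total_high_level(10))
```

Both return the same result, but the second has far fewer places for a mistake to hide.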

Location 2321-2323

Compilers aren’t the only way to implement high-level languages. You could also use an interpreter, which examines your program one piece at a time and executes the corresponding machine language commands, instead of translating the whole thing into machine language and running that.
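The difference can be sketched with a toy instruction set (the `add`/`mul` operations and all names below are hypothetical). The interpreter examines one instruction at a time and acts on it; the "compiler" translates the whole program first and then runs the translation. Note that this sketch translates to a Python expression, which Python's built-in `compile` turns into bytecode, so it only stands in for real machine-language output.

```python
# A toy program: a list of (operation, argument) pairs applied to a value x.
program = [("add", 3), ("mul", 2), ("add", 1)]


def interpret(program, x):
    # Interpreter: look at each instruction in turn and execute it immediately.
    for op, arg in program:
        if op == "add":
            x = x + arg
        elif op == "mul":
            x = x * arg
        else:
            raise ValueError(f"unknown operation: {op}")
    return x


def compile_program(program):
    # Compiler: translate the whole program into one expression up front.
    symbols = {"add": "+", "mul": "*"}
    expr = "x"
    for op, arg in program:
        expr = f"({expr} {symbols[op]} {arg})"
    code = compile(f"lambda x: {expr}", "<toy>", "eval")
    return eval(code)  # returns a callable; the translation happens only once


run = compile_program(program)
print(interpret(program, 4), run(4))
```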

Location 2324-2327

The high-level language that you feed to the compiler is also known as source code, and the machine language translation it generates is called object code. When you buy commercial software, you usually only get the object code. (Object code is so hard to read that it is effectively encrypted, thus protecting the company’s trade secrets.) But lately there is an alternative approach: open source software, where you get the source code as well, and are free to modify it if you want.

Location 2338-2339

The average end user may not need the source code of their word processor, but when you really need reliability, there are solid engineering reasons for insisting on open source.

Location 2346-2349

So which one do you use? Ah, well, there is a great deal of disagreement about that. Part of the problem is that if you use a language for long enough, you start to think in it. So any language that’s substantially different feels terribly awkward, even if there’s nothing intrinsically wrong with it. Inexperienced programmers’ judgements about the relative merits of programming languages are often skewed by this effect.

Location 2358-2359

Just as high-level languages are more abstract than assembly language, some high-level languages are more abstract than others. For example, C is quite low-level, almost a portable assembly

Location 2362-2363

Prolog, for example. It has fabulously powerful abstractions for solving about 2% of problems, and the rest of the time you’re bending over backward to misuse these abstractions to write de facto Pascal programs.

Location 2369-2370

The biggest debate in language design is probably the one between those who think that a language should prevent programmers from doing stupid things, and those who think programmers should be allowed to do whatever they want.

Location 2371-2374

Partisans of permissive languages ridicule the other sort as “B&D” (bondage and discipline) languages, with the rather impudent implication that those who like to program in them are bottoms. I don’t know what the other side call languages like Perl. Perhaps they are not the sort of people to make up amusing names for the opposition. The debate resolves

Location 2375-2376

One of the more active questions at the moment is static versus dynamic typing. In a statically-typed language, you have to know the kind of values each variable can have at the time you write the program. With dynamic typing, you can set any variable to any value,
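Python is dynamically typed, so it can show the permissive side directly. The annotations below are only hints: the interpreter does not enforce them at runtime, though an external static checker (assumed here, not part of the language itself) would be expected to flag the last call. The names are illustrative.

```python
def double(n: int) -> int:
    # The annotation says "int", but nothing enforces it when the code runs.
    return n * 2


value = 42
value = "forty-two"      # dynamic typing: the same variable now holds a string

surprise = double("ab")  # runs anyway: string repetition instead of arithmetic
print(double(21), surprise)
```

In a statically typed language the equivalent of the `double("ab")` call would be rejected before the program ever ran.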

Location 2391-2392

The disadvantage, critics would counter, is that adding things without looking at what was already there tends to produce the same results in programs that it does in buildings.

Location 2393-2395

With typing you have to choose one or the other. But the object-orientedness of a language is a matter of degree. Indeed, there are two senses of object-oriented: some languages are object-oriented in the sense that they let you program in that style, and others in the sense that they force you to.

Location 2428-2430

undergrads who believe they have to learn it to get a job. When I say Java won’t turn out to be a successful language, I mean something more specific: that Java will turn out to be an evolutionary dead-end, like Cobol.

Location 2439-2440

Convergence is more likely for languages partly because the space of possibilities is smaller, and partly because mutations are not random. Language designers deliberately incorporate ideas from other languages.

Location 2443-2445

Any programming language can be divided into two parts: some set of fundamental operators that play the role of axioms, and the rest of the language, which could in principle be written in terms of these fundamental operators.
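One way to picture the split is with lists. In the sketch below, four small functions play the role of axioms, and everything else is written only in terms of them. The choice of primitives and all the names are assumptions for illustration, not a description of any particular language's core.

```python
# "Axioms": the only functions allowed to touch the underlying list directly.
def cons(x, xs):
    return [x] + xs

def first(xs):
    return xs[0]

def rest(xs):
    return xs[1:]

def is_empty(xs):
    return xs == []


# "The rest of the language": defined purely in terms of the axioms above.
def length(xs):
    return 0 if is_empty(xs) else 1 + length(rest(xs))

def append(xs, ys):
    return ys if is_empty(xs) else cons(first(xs), append(rest(xs), ys))

def map_list(f, xs):
    return [] if is_empty(xs) else cons(f(first(xs)), map_list(f, rest(xs)))


print(length([1, 2, 3]), append([1, 2], [3]), map_list(lambda x: x * 2, [1, 2]))
```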

Location 2451-2452

I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores. The more of a language you can write in itself, the better.

Location 2454-2457

There hasn’t been a lot of progress in that department so far. My guess is that a hundred years from now people will still tell computers what to do using programs we would recognize as such. There may be tasks that we solve now by writing programs and that in a hundred years you won’t have to write programs to solve, but I think there will still be a good deal of programming of the type we do today.

Location 2460-2463

Languages evolve slowly because they’re not really technologies. Languages are notation. A program is a formal description of the problem you want a computer to solve for you. So the rate of evolution in programming languages is more like the rate of evolution in mathematical notation than, say, transportation or communications. Mathematical notation does evolve, but not with the giant leaps you see in technology.

Location 2464-2468

That’s kind of hard to imagine. And indeed, the most likely prediction in the speed department may be that Moore’s Law will stop working. Anything that’s supposed to double every eighteen months seems likely to run up against some kind of fundamental limit eventually. But I have no trouble believing that computers will be very much faster. Even if they only end up being a paltry million times faster, that should change the ground rules for programming languages substantially. Among other things, there will be more room for what would now be considered slow languages, meaning languages that don’t yield very efficient code.
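As a rough check on the arithmetic, assume the eighteen-month doubling held steadily (an assumption the passage itself doubts). A million-fold speedup then takes about twenty doublings:

```python
import math

DOUBLING_PERIOD_YEARS = 1.5   # "every eighteen months", from the passage
TARGET_SPEEDUP = 1_000_000    # "a paltry million times faster"

# Number of doublings needed, and how long that takes at a steady rate.
doublings = math.log2(TARGET_SPEEDUP)
years = doublings * DOUBLING_PERIOD_YEARS

print(f"{doublings:.1f} doublings, about {years:.0f} years")
```

So even the "paltry" case corresponds to roughly thirty years of uninterrupted doubling, well inside a hundred-year horizon.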

Location 2471-2473

And there is another class of problems that inherently have an unlimited capacity to soak up cycles: image rendering, cryptography, simulations. If some applications can be increasingly inefficient while others continue to demand all the speed the hardware can deliver, faster computers will mean that languages have to cover an ever wider range of efficiencies.

Location 2476-2477

People thirty years ago would be astonished at how casually we make long distance phone calls.

Location 2478-2479

I can already tell you what’s going to happen to all those extra cycles that faster hardware is going to give us in the next hundred years. They’re nearly all going to be wasted.

Location 2487-2490

There’s good waste, and bad waste. I’m interested in good waste—the kind where, by spending more, we can get simpler designs. How will we take advantage of the opportunities to waste cycles that we’ll get from new, faster hardware? The desire for speed is so deeply ingrained in us, with our puny computers, that it will take a conscious effort to overcome it. In language design, we should be consciously seeking out situations where we can trade efficiency for even the smallest increase in convenience.

Location 2491-2494

Semantically, strings are more or less a subset of lists in which the elements are characters. So why do you need a separate data type? You don’t, really. Strings only exist for efficiency. But it’s lame to clutter up the semantics of a language with hacks to make programs run faster. Having strings in a language seems to be a case of premature optimization.

Location 2494-2499

If we think of the core of a language as a set of axioms, surely it’s gross to have additional axioms that add no expressive power, simply for the sake of efficiency. Efficiency is important, but I don’t think that’s the right way to get it. The right way to solve that problem is to separate the meaning of a program from the implementation details. Instead of having both lists and strings, have just lists, with some way to give the compiler optimization advice that will allow it to lay out strings as contiguous bytes if necessary.
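A sketch of what "just lists" might look like in practice, using Python lists of single characters in place of a string type. The function below knows nothing about strings; it is an ordinary recursive list function. Python does have a separate string type, so the conversions at the edges are only there to make the demonstration readable.

```python
def reverse_list(items):
    # Works on any list; characters are just one kind of element.
    if not items:
        return []
    return reverse_list(items[1:]) + [items[0]]


word = list("hacker")            # ['h', 'a', 'c', 'k', 'e', 'r']
reversed_word = reverse_list(word)

print("".join(reversed_word))
print(reverse_list([1, 2, 3]))   # the same function, non-character elements
```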

Location 2502-2504

The word “essay” comes from the French verb “essayer,” which means “to try.” An essay, in the original sense, is something you write to try to figure something out. This happens in software too. I think some of the best programs were essays, in the sense that the authors didn’t know when they started exactly what they were trying to write.

Location 2510-2514

Inefficient software isn’t gross. What’s gross is a language that makes programmers do needless work. Wasting programmer time is the true inefficiency, not wasting machine time. This will become ever more clear as computers get faster. I think getting rid of strings is already something we could bear to think about. We did it in Arc, and it seems to be a win; some operations that would be awkward to describe as regular expressions can be described easily as recursive functions.

Location 2515-2516

we get rid of arrays, for example? After all, they’re just a subset of hash tables where the keys are vectors of integers. Will we replace hash tables themselves with lists?

Location 2517-2518

Logically, you don’t need to have a separate notion of numbers, because you can represent them as lists: the integer n could be represented as a list of n elements. You can do math this way. It’s just unbearably inefficient.
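A minimal sketch of arithmetic on numbers represented as lists of n elements. Addition is concatenation and multiplication is repeated addition. The helper names and the use of `None` as the placeholder element are arbitrary choices.

```python
def to_list(n):
    # The integer n becomes a list with n placeholder elements.
    return [None] * n

def to_int(xs):
    return len(xs)

def add(a, b):
    # Adding two numbers is joining their lists.
    return a + b

def mul(a, b):
    # Multiplying is adding a to itself once per element of b.
    result = []
    for _ in b:
        result = add(result, a)
    return result


print(to_int(add(to_list(2), to_list(3))), to_int(mul(to_list(3), to_list(4))))
```

The cost is visible at once: storing the number a million takes a million elements, which is the inefficiency the passage refers to.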

Location 2527-2529

layers of software between the application and the hardware. This too is a trend we see happening already: many recent languages are compiled into byte code. Bill Woods once told me that, as a rule of thumb, each layer of interpretation costs a factor of ten in speed. This extra cost buys you flexibility.
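Taking the rule of thumb at face value (a factor of ten per layer is the figure quoted in the passage; treating it as exact is an assumption), the cost compounds multiplicatively. The sketch below also asks how many such layers a given hardware speedup would pay for. The function names are hypothetical.

```python
def slowdown(layers, cost_per_layer=10):
    # Each layer of interpretation multiplies the running time.
    return cost_per_layer ** layers


def layers_affordable(speedup, cost_per_layer=10):
    # Largest number of layers whose combined slowdown does not exceed the speedup.
    layers = 0
    while slowdown(layers + 1, cost_per_layer) <= speedup:
        layers += 1
    return layers


print(slowdown(3), layers_affordable(1_000_000))
```

Under that assumption, hardware a million times faster could absorb six extra layers and still run as fast as today.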

Location 2536-2539

The more of your application you can push down into a language for writing that type of application, the more of your software will be reusable. Somehow the idea of reusability got attached to object-oriented programming in the 1980s, and no amount of evidence to the contrary seems to be able to shake it free. But although some object-oriented software is reusable, what makes it reusable is its bottom-upness,

Location 2539-2545

Consider libraries: they’re reusable because they’re language, whether they’re written in an object-oriented style or not. I don’t predict the demise of object-oriented programming, by the way. Though I don’t think it has much to offer good programmers, except in certain specialized domains, it is irresistible to large organizations. Object-oriented programming offers a sustainable way to write spaghetti code. It lets you accrete programs as a series of patches. Large organizations always tend to develop software this way, and I expect this to be as true in a hundred years as it is today. As long as we’re talking about the future, we had better talk about parallel computation, because that’s where this idea seems to live. At any given time, it always seems to be something that’s going to happen in the future.

Location 2555-2557

in parallel. And this will, like asking for specific implementations of data structures, be something that you do fairly late in the life of a program, when you try to optimize it. Version 1s will ordinarily ignore any advantages to be got from parallel computation, just as they will ignore advantages to be got from specific representations of data.

Location 2575-2579

It’s not true that those who can’t do, teach (some of the best hackers I know are professors), but it is true that there are a lot of things that those who teach can’t do. Research imposes constraining caste restrictions. In any academic field, there are topics that are ok to work on and others that aren’t. Unfortunately the distinction between acceptable and forbidden topics is usually based on how intellectual the work sounds when described in research papers, rather than how important it is for getting good results. The extreme case is probably literature; people studying literature rarely say anything that would be of the slightest use to those producing it. Though

Location 2583-2584

The trend is not merely toward languages being developed as open source projects rather than “research,” but toward languages being designed by the application programmers who need to use them, rather than by compiler writers. This seems a good trend and I expect it to continue.

Location 2591-2593

You’d think it would be obvious to creatures as lazy as us how to express a program with the least effort. In fact, our ideas about what’s possible tend to be so limited by whatever language we think in that easier formulations of programs seem very surprising. They’re something you have to

Location 2594-2597

Not the length in characters, of course, but the length in distinct syntactic elements—basically, the size of the parse tree. It may not be quite true that the shortest program is the least work to write, but it’s close enough that you’re better off aiming for the solid target of brevity than the fuzzy, nearby one of least work. Then the algorithm for language design becomes: look at a program and ask, is there a shorter way to write this?
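
One rough way to try this measure in Python, assuming that the node count from the standard `ast` module is an acceptable stand-in for "the size of the parse tree". The helper `parse_tree_size` and the two sample programs are illustrative, not from the book.

```python
import ast

def parse_tree_size(source):
    """Count every node in the abstract syntax tree of `source`.

    ast.walk also yields context markers such as Load and Store,
    so this is a proxy for size, not an exact count of elements.
    """
    return sum(1 for _ in ast.walk(ast.parse(source)))

# Two ways to add up the items of a list named xs.
loop_version = "total = 0\nfor x in xs:\n    total = total + x\n"
short_version = "total = sum(xs)\n"

# The sources are only parsed, never run, so xs need not exist.
loop_size = parse_tree_size(loop_version)
short_size = parse_tree_size(short_version)
```

Comparing `loop_size` with `short_size` asks the passage's question in miniature: is there a shorter way to write this?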

Location 2605-2610

assume infrastructure that didn’t exist in 1960. For example, a language in which indentation is significant, like Python, would not work very well on printer terminals. But putting such problems aside—assuming, for example, that programs were all just written on paper—would programmers of the 1960s have liked writing programs in the languages we use now? I think so. Some of the less imaginative ones, who had artifacts of early languages built into their ideas of what a program was, might have had trouble. (How can you manipulate data without doing pointer arithmetic? How can you implement flowcharts without gotos?) But I think the smartest programmers would have had no trouble making the most of present-day languages, if they’d had them.

Location 2628-2631

Eric Raymond has written an essay called “How to Become a Hacker,” and in it, among other things, he tells would-be hackers what languages they should learn. He suggests starting with Python and Java, because they are easy to learn. The serious hacker will also want to learn C, in order to hack Unix, and Perl for system administration and CGI scripts. Finally, the truly serious hacker should consider learning Lisp:

Location 2641-2643

Why not? Programming languages are just tools, after all. If Lisp really does yield better programs, you should use it. And if it doesn’t, then who needs

Location 2653-2655

This is especially true in a startup. In a big company, you can do what all the other big companies are doing. But a startup can’t do what all the other startups do. I don’t think a lot of people realize this, even in startups.

Location 2671-2676

We didn’t know anything about marketing, or hiring people, or raising money, or getting customers. Neither of us had ever even had what you would call a real job. The only thing we were good at was writing software. We hoped that would save us. Any advantage we could get in the software department, we would take. So you could say that using Lisp was an experiment. Our hypothesis was that if we wrote our software in Lisp, we’d be able to get features done faster than our competitors, and also to do things in our software that they couldn’t do. And because Lisp was so high-level, we wouldn’t need a big development team, so our costs would be lower. If this were so, we could offer a better product for less money, and still make a profit.

Location 2704-2706

And the reason everyone doesn’t use it is that programming languages are not merely technologies, but habits of mind as well, and nothing changes slower. Of course, both these answers need explaining. I’ll begin with a shockingly controversial statement: programming languages vary in power.

Location 2710-2712

What’s less often understood is that there is a more general principle here: that if you have a choice of several languages, it is, all other things being equal, a mistake to program in anything but the most powerful one.

Location 2738-2742

When we switch to the point of view of a programmer using any of the languages higher up the power continuum, however, we find that he in turn looks down upon Blub. How can you get anything done in Blub? It doesn’t even have y. By induction, the only programmers in a position to see all the differences in power between the various languages are those who understand the most powerful one. (This is probably what Eric Raymond meant about Lisp making you a better programmer.) You

Location 2747-2748

What I will say is that I think Lisp is at the top. And to support this claim I’ll tell you about one of the things I find missing when I look at the other four languages. How can you get anything done in them, I think, without macros?

Location 2755-2757

You write programs in the parse trees that get generated within the compiler when other languages are parsed. But these parse trees are fully accessible to your programs. You can write programs that manipulate them. In Lisp, these programs are called macros. They are programs that write programs.
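
Python has no Lisp-style macros, so the following is only a loose analogy: a sketch, using the standard `ast` module, of a program that takes another program's parse tree, rewrites it, and runs the result. The class `SwapAddForMult` and the function `rewrite` are hypothetical names chosen for this example.

```python
import ast

class SwapAddForMult(ast.NodeTransformer):
    """Rewrite every `a + b` in a parse tree into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested operations first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

def rewrite(source):
    """Parse an expression, transform its tree, then compile and evaluate it."""
    tree = ast.parse(source, mode="eval")
    tree = SwapAddForMult().visit(tree)
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<rewritten>", "eval"))
```

With these definitions, `rewrite("2 + 3")` evaluates the rewritten expression `2 * 3`. In Lisp the equivalent transformation is written in the language's own list notation, which is why it is an everyday tool there and not a special library.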

Location 2767-2770

suspicious person might begin to wonder if there was some correlation here. A big chunk of our code was doing things that are hard to do in other languages. The resulting software did things our competitors’ software couldn’t do. Maybe there was some kind of connection. I encourage you to follow that thread. There may be more to that old man hobbling along on his crutches than meets the eye.

Location 2780-2785

Ordinarily technology changes fast. But programming languages are different: programming languages are not just technology, but what programmers think in. They’re half technology and half religion. And so the median language, meaning whatever language the median programmer uses, moves as slow as an iceberg. Garbage collection, introduced by Lisp in about 1960, is now widely considered to be a good thing. Dynamic typing, ditto, is growing in popularity. Lexical closures, introduced by Lisp in the early 1960s, are now, just barely, on the radar screen. Macros, introduced by Lisp in the mid 1960s, are still terra incognita. Obviously, the median language has enormous momentum.

Location 2796-2799

The safest kind were the ones that wanted Oracle experience. You never had to worry about those. You were also safe if they said they wanted C++ or Java developers. If they wanted Perl or Python programmers, that would be a bit frightening—that’s starting to sound like a company where the technical side, at least, is run by real hackers. If I had ever seen a job posting looking for Lisp hackers, I would have been really worried.

Location 2803-2805

The pointy-haired boss miraculously combines two qualities that are common by themselves, but rarely seen together: (a) he knows nothing whatsoever about technology, and (b) he has very strong opinions about

Location 2818-2820

Java was designed to fix some problems with C++. So there you have it: languages are not all equivalent. If you follow the trail through the pointy-haired boss’s brain to Java and then back through Java’s history to its origins, you end up holding an idea that contradicts the assumption you started with.

Location 2827-2829

But if languages vary, he suddenly has to solve two simultaneous equations, trying to find an optimal balance between two things he knows nothing about: the relative suitability of the twenty or so leading languages for the problem he needs to solve, and the odds of finding programmers, libraries, etc. for each. If that’s what’s on the other side of the door, it is no surprise that the pointy-haired boss doesn’t want to open

Location 2919-2921

Expressing the language in its own data structures turns out to be a very powerful feature. Ideas 8 and 9 together mean that you can write programs that write programs. That may sound like a bizarre idea, but it’s an everyday thing in Lisp. The most common way to do it is with something called a

Location 2946-2948

I can think of three problems that could arise from using less common languages. Your programs might not work well with programs written in other languages. You might have fewer libraries at your disposal. And you might have trouble hiring programmers.

Location 2953-2956

This is why we even hear about new languages like Perl and Python. We’re not hearing about these languages because people are using them to write Windows apps, but because people are using them on servers. And as software shifts off the desktop and onto servers (a future even Microsoft seems resigned to), there will be less and less pressure to use middle-of-the-road technologies.

Location 2956-2961

libraries, their importance also depends on the application. For less demanding problems, the availability of libraries can outweigh the intrinsic power of the language. Where is the breakeven point? Hard to say exactly, but wherever it is, it is short of anything you’d be likely to call an application. If a company considers itself to be in the software business, and they’re writing an application that will be one of their products, then it will probably involve several hackers and take at least six months to write. In a project of that size, powerful languages probably start to outweigh the convenience of pre-existing libraries.

Location 2962-2963

How many hackers do you need to hire, after all? Surely by now we all know that software is best developed by teams of less than ten people.

Location 2964-2966

fact, choosing a more powerful language probably decreases the size of the team you need, because (a) if you use a more powerful language, you probably won’t need as many hackers, and (b) hackers who work in more advanced languages are likely to be smarter.

Location 2977-2979

The most convenient measure of power is probably code size. The point of high-level languages is to give you bigger abstractions—bigger bricks, as it were, so you don’t need as many to build a wall of a given size. So the more powerful the language, the shorter the program (not simply in characters, of course, but in distinct elements).

Location 2981-2982

the base language, you build on top of the base language a language for writing programs like yours, then write your program in it. The combined code can be much shorter than if you had written your whole program in the base language—indeed, this is how most compression algorithms work.

Location 2986-2987

Fred Brooks described this phenomenon in his famous book The Mythical Man-Month, and everything I’ve seen has tended to confirm what he said.

Location 3007-3009

Within large organizations, the phrase used to describe this approach is “industry best practice.” Its purpose is to shield the pointy-haired boss from responsibility: if he chooses something that is “industry best practice,” and the company loses, he can’t be blamed. He didn’t choose, the industry did. I believe this

Location 3016-3018

Number 1, languages vary in power. Number 2, most managers deliberately ignore this. Between them, these two facts are literally a recipe for making money. ITA is an example of this recipe in action. If you want to win in a software business, just take on the hardest problem you can find, use the most powerful language you can get, and wait for your competitors’ pointy-haired bosses to revert to the mean.

Location 3099-3104

It may be that the majority of programmers can’t tell a good language from a bad one. But that’s no different with any other tool. It doesn’t mean that it’s a waste of time to try designing a good language. Expert hackers can tell a good language when they see one, and they’ll use it. Expert hackers are a tiny minority, admittedly, but that tiny minority write all the good software, and their influence is such that the rest of the programmers will tend to use whatever language they use. Often, indeed, it is not merely influence but command: often the expert hackers are the very people who, as their bosses or faculty advisors, tell the other programmers what language to use.

Location 3109-3111

So whether or not a language has to be good to be popular, I think a language has to be popular to be good. And it has to stay popular to stay good. The state of the art in programming languages doesn’t stand still. Though there is little change in the depths of the sea, in core language features, there is quite a lot up on the surface, in things like libraries and environments.

Location 3118-3121

Let’s start by acknowledging one external factor that does affect the popularity of a programming language. To become popular, a programming language has to be the scripting language of a popular system. Fortran and Cobol were the scripting languages of early IBM mainframes. C was the scripting language of Unix, and so, later, were Perl and Python. Tcl is the scripting language of Tk, Visual Basic of Windows, (a form of) Lisp of Emacs, PHP of web servers, and Java and Javascript of web browsers.

Location 3129-3132

language also needs to have a book about it. The book should be thin, well-written, and full of good examples. Kernighan and Ritchie’s C Programming Language is the ideal here. At the moment I’d almost say that a language has to have a book published by O’Reilly. That’s becoming the test of mattering to hackers.

Location 3149-3150

All other things being equal, no one wants to begin a program with a bunch of declarations. Anything that can be implicit, should be. The amount of boilerplate in a Java hello-world program is almost enough evidence, by itself, to convict.

Location 3158-3161

I think language designers would do better to consider their target user to be a genius who will need to do things they never anticipated, rather than a bumbler who needs to be protected from himself. The bumbler will shoot himself in the foot anyway. You may save him from referring to variables in another module, but you can’t save him from writing a badly designed program to solve the wrong problem, and taking forever to do it.

Location 3177-3178

To be attractive to hackers, a language must be good for writing the kinds of programs they want to write. And that means, perhaps surprisingly, that it has to be good for writing throwaway programs.

Location 3183-3184

The project either gets bogged down, or the result is sterile and wooden: a shopping mall rather than a real downtown, Brasilia rather than Rome,

Location 3204-3205

I think a lot of the advances that happen in programming languages in the next fifty years will have to do with library functions. I think future programming languages will have libraries that are as carefully designed as the core language.

Location 3206-3207

or whatever, so much as about how to design great libraries.

Location 3213-3215

But in practice I don’t think fast code comes primarily from things you do in the design of the language. As Knuth pointed out long ago, speed only matters in certain critical bottlenecks. And as many programmers have observed since, one is often mistaken about where these bottlenecks are.

Location 3215-3217

good profiler, rather than by, say, making the language statically typed. You don’t need to know the type of every argument in every call in the program. You do need to be able to declare the types of arguments in the bottlenecks. And even more, you need to be able to find out where the bottlenecks are.
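
As a small illustration of letting a profiler find the bottleneck, here is a sketch using Python's standard `cProfile` and `pstats` modules. It assumes Python 3.9 or later for `get_stats_profile`. The functions `cheap_step`, `expensive_step`, `run`, and `cumulative_times` are invented for the example.

```python
import cProfile
import pstats

def cheap_step(n):
    return n + 1

def expensive_step(n):
    # Deliberately slow: sums n squares one at a time.
    return sum(i * i for i in range(n))

def run():
    total = 0
    for _ in range(200):
        total += cheap_step(1)
        total += expensive_step(2000)
    return total

def cumulative_times(func):
    """Profile one call to func; map function names to cumulative seconds."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    profile = pstats.Stats(profiler).get_stats_profile()
    return {name: fp.cumtime for name, fp in profile.func_profiles.items()}
```

Calling `cumulative_times(run)` returns a dictionary in which `expensive_step` should dominate `cheap_step`, which says where optimization effort (or type declarations, in the passage's terms) would pay off.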

Location 3219-3220

And in any case I think good profiling would go a long way toward fixing the problem: you’d soon learn what was expensive.

Location 3227-3227

An active profiler could show graphically what’s happening in memory as a program’s running, or even make sounds that tell what’s happening.

Location 3236-3237

It might be a good idea to make the byte code an official part of the language, and to allow programmers to use inline byte code in bottlenecks. Then such optimizations would be portable too.

Location 3249-3250

The language can help here too. Good support for threads will enable all the users to share a single heap. It may also help to have persistent objects and/or language-level support for lazy loading.

Location 3255-3257

knows that people sometimes ask for things they turn out not to want. To avoid wasting his time, he waits till the third or fourth time he’s asked to do something. By then whoever’s asking him may be fairly annoyed, but at least they probably really do want whatever they’re asking for.

Location 3279-3281

“The best writing is rewriting,” wrote E. B. White. Every good writer knows this, and it’s true for software too. The most important part of design is redesign. Programming languages, especially, don’t get redesigned enough.

Location 3283-3285

The trick is to realize that there’s no real contradiction here. You want to be optimistic and skeptical about two different things. You have to be optimistic about the possibility of solving the problem, but skeptical about the value of whatever solution you’ve got so far.

Location 3300-3302

Everyone knows it’s not a good idea to have a language designed by a committee. Committees yield bad design. But I think the worst danger of committees is that they interfere with redesign. It’s so much work to introduce changes that no one wants to bother. Whatever a committee decides tends to stay that way, even if most of the members don’t like

Location 3334-3336

Design doesn’t have to be new, but it has to be good. Research doesn’t have to be good, but it has to be new. I think these two paths converge at the top: the best design surpasses its predecessors by using new ideas, and the best research solves problems that are not only new, but worth solving.

Location 3338-3340

The biggest difference is that you focus more on the user. Design begins by asking, who is this for and what do they need from it? A good architect, for example, does not begin by creating a design that he then imposes on the users, but by studying the intended users and figuring out what they need.

Location 3352-3353

The point is, you have to pick some group of users. I don’t think you can even talk about good or bad design except with reference to some intended user.

Location 3354-3355

You’re most likely to get good design if the intended users include the designer himself. When you design something for a group that doesn’t include you, it tends to be for people you consider less sophisticated than you, not more sophisticated.