Yesterday I mentioned one reason why project estimates are often wrong, but I skimmed over why software development in particular has trouble with estimation. Even with a genuine effort to predict, rather than control, the amount of time the project will take, software is genuinely kind of unpredictable. I believe this isn't just a coincidence, but something that is fundamental to the nature of software development. It's because programmers aren't programs.
We used to pay people to machine things, now we mostly get robots to do it. Partly, this is because the actions involved were easy to mechanise, but also because the decisions involved were repetitive enough to turn into a computer program. Driving has much more difficult mechanics, so the sensors and actuators have to be more complex, but the decisions of human drivers are still repetitive enough, predictable enough to be turned into a program. The service industry is more complex still, but I expect it is next.
Many fields are part creative and part uncreative. Architects and engineers design a bridge, builders build it. Building is repetitive and predictable, so it is the easiest to automate. As for engineers, increasingly advanced modeling and simulation systems reduce their work, but will never completely eliminate it. We can make machines solve problems, but we can't tell them which problems to solve. If someone thinks up a new kind of bridge, that work stops being repetitive until such time as we know how to do it reliably, repetitively, and predictably. Then it will be automated away again.
Software isn't an exception to this, quite the opposite; software is the abstraction over all of these examples. It's not that software can't be automated, it's that software is always being automated. Programmers automate away their own jobs every day, then make new jobs for themselves, then automate those away too. They ride the wave of whatever parts of the system have no recognisable pattern, but as soon as that pattern appears, it becomes part of the program too and the programmer moves on to something else.
This is why the field has moved so quickly. Programmers spent a lot of time in the early years figuring out how to make lines move on a screen, connect one computer to another, or store stuff and read it back later. But now we take all of that for granted; it's boring, repetitive, predictable. In other words, those problems have already been written into programs, and most programming work is now in using those programs to solve other problems that still don't have a predictable pattern.
Of course, that's not always true. Despite these lofty ideals, programming still has a lot of busywork. But although that's the case on a small scale, the industry as a whole tends to be on a constant mission to make itself redundant; today's programming becomes tomorrow's programs. And, while some programmers who really just write the same code over and over again may be automated out of a job, for many programmers the day their current problem is automated away will just be a regular day.
This is why programming is hard to predict: prediction requires a pattern. You can predict how long something takes to build if you've made a hundred things just like it. But anything that is easy to predict no longer needs a programmer, it just needs a program. Once it's predictable enough to give way to estimation, it's predictable enough to stop doing. Programming is whatever part is still too complex to predict. And that's why software estimation is hard.
Imagine you're an old timey rainmaker. You know that people can't control the weather, but lots of suckers think differently. How can you use this to your advantage? Well, maybe you can't control the weather, but you can predict it. So you go to towns that have had a long drought and, according to your predictions, are due for rain. You do your rain dance, it starts to rain, and you cash in. This is thought to be the method used by famed rainmaker Charles Hatfield, for whom it worked so well that he was once accused of flooding San Diego.
This similarity between prediction and control is something I've been thinking about, especially in reference to the idea that you can control the outcome you get or how long it takes but not both. I'm sure some people would disagree, saying that for well understood problems you can get a certain outcome within a certain time. But really, this is just confusing prediction and control. You can predict you will get the outcome by the time, but you can't control it. The causality only goes one way.
Project estimates are a great example of this. Why are they wrong so much, especially in software development? In theory, they're a pure exercise in science, just like predicting the weather. But, as in management consulting, people ask questions that they want particular answers to, making it less a question than a request for justification. I think estimates are often wrong because the estimators are often, implicitly or explicitly, asked to be wrong. The hope is that changing the prediction will change the outcome. But that's not how prediction works.
And you can tell whether your estimates are really attempts at control by what happens when they're wrong. Predictions can be wrong, but control can't. If your estimate was that the project would take a month, but it's looking like it will take longer, what happens? Do you change the estimate or are you expected to somehow make it happen anyway? If you have to make the estimate come true, then it's really just an order with a thin veneer of science. Control, not prediction.
Last time I did some home automation prototypes, and this week was more practical stuff. It's quite nice to mess around with code that solves an actual problem you have.
I wanted to make something that could connect to my router and tell me its bandwidth usage. The router doesn't support UPnP (the way I would normally do this) but it does use some silly SOAP-based API called HNAP. The whole thing was pretty impenetrable until I found a related library for HNAP-controlled smart plugs. At that point I managed to get it working well enough to restart the router and called it a day. Later on I discovered I could enable UPnP on the router and expunged HNAP from my memory entirely.
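For the curious, the UPnP route the router ended up supporting works roughly like this: the Internet Gateway Device spec exposes a `WANCommonInterfaceConfig:1` service with actions like `GetTotalBytesReceived`, invoked by POSTing a SOAP envelope to the device's control URL. A minimal sketch is below; note the control URL is a made-up placeholder (a real client discovers it via SSDP and the device description XML), and your router's port and path will differ.

```python
# Sketch of a UPnP (IGD) bandwidth query. The service type and action names
# come from the UPnP Internet Gateway Device spec; CONTROL_URL is a
# hypothetical example -- discover the real one via SSDP in practice.

SERVICE = "urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1"
CONTROL_URL = "http://192.168.1.1:49152/ctl/CmnIfCfg"  # assumption, router-specific

def soap_envelope(action: str) -> str:
    """Build the SOAP body for a zero-argument UPnP action."""
    return (
        '<?xml version="1.0"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        "<s:Body>"
        f'<u:{action} xmlns:u="{SERVICE}"/>'
        "</s:Body></s:Envelope>"
    )

def soap_headers(action: str) -> dict:
    """HTTP headers the gateway expects, including the SOAPAction header."""
    return {
        "Content-Type": 'text/xml; charset="utf-8"',
        "SOAPAction": f'"{SERVICE}#{action}"',
    }

# Sending it is then just an HTTP POST, e.g. with urllib.request:
#   req = urllib.request.Request(
#       CONTROL_URL,
#       data=soap_envelope("GetTotalBytesReceived").encode(),
#       headers=soap_headers("GetTotalBytesReceived"),
#   )
#   urllib.request.urlopen(req).read()  # XML containing the byte counter
```

Polling `GetTotalBytesReceived`/`GetTotalBytesSent` periodically and diffing the counters gives a usable bandwidth figure, which is essentially what off-the-shelf UPnP tools do.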
The famous trolley problems build up through a series of "x dies or y dies" type questions to the ultimate gotcha: if you would switch a train to hit one person instead of five, would you instead push a fat man in front of a train to stop it? The two situations are, in terms of pure utility, equivalent. Yet people are far happier to pull the switch than push the fat man. There are lots of theories as to why, but here's mine: it's about protagonism.
Video games often talk about player and non-player characters (NPCs); similarly, films have main characters vs extras, and so on. It's a common trope, in part because I believe this is how we see the world. We look at the world and see an inert background layer, over which are a quite small number of foreground objects, usually chosen by how much they move and/or are likely to harm us. Our object recognition is so powerful it informs the entire way we see the world; even people are categorised as foreground or background.
So what about trolley problems? Well, the whole point of background characters is that they don't have agency, and thus no moral responsibility. When you're the poor schmuck stuck in the signal box, having to choose how to make the best of a bad situation, your part is played by Jonah Hill. When you step up to shove the man off the bridge, you're Matt Damon. And in its most extreme form, the doctor who secretly kills a healthy man to save the lives of five sick ones, you get to be a deeply troubled Leonardo DiCaprio. The more your actions reflect those of a protagonist, the more the situation is dictated by your actions rather than by external circumstance, the more harshly you are judged.
The problem with this is, well, it doesn't really make much sense. We clearly judge protagonists more because they are easier to judge, easier to think about, and tend to catch our attention. But the world isn't really divided into foreground and background characters, and trolley problems are, in some sense, the least useful moral dilemmas to consider. What about all the millions of lives not saved because people have other things to do? More people have been killed by indifference than by the actions of all the world's murderers. But because inaction is background stuff, we don't get upset about it.
Worse still, the net effect of all this is a kind of anti-protagonism. There is far more moral risk for actors than non-actors, which leads to a strong motivation not to act. It means worse outcomes for moral actors, but also better outcomes for amoral actors, who don't care about the moral risk and also benefit from having fewer moral actors to oppose them. Worst of all, in bystander situations, or really any situation where inaction is pathological, we all suffer due to anti-protagonism.
All of which isn't to say that we shouldn't moralise about protagonists; they take actions, actions have moral consequences. But if we do so in the absence of moralising about inaction, we build a system where only one kind of person has to worry about moral consequences, and why would you want that to be you if you could avoid it?
There's an interesting pattern I've noticed, especially among people who receive a lot of email. They say "I try to read all the emails I get, but I don't get time to respond to all of them individually". It seems like a relatively natural response to a high volume of email. After all, you still want the nice experience of reading fan mail, but it's just infeasible to respond to everyone.
The problem with email is that we use it for two opposite things: receiving information and requests for information. A lot of emails we get require no concrete action at all, and those that do are often things like meeting invites that only require actions outside of the system. But the other kind of email is conversational, and actually expects a reply. In fact, often these emails have little content other than a request for you to send an email back. These are two totally different patterns, and mixing them up is hazardous.
Unfortunately, the fact that we use the same system for both of them often leads to mixups. We reply to emails that don't need a reply (I've previously worked with people who would always reply "thanks" to any email, and I really wanted to see what would happen if two of them ended up replying to each other). We sometimes don't reply to emails that were intended to be replied to, or miss them entirely because of all the other email. And worst of all, we turn emails that could have been purely informational into ones with requests in them just because we feel like we should.
To untangle all of this, I'd suggest something pulled from the big corp playbook: have a "noreply" address. Any email that goes to that address will never be replied to, but is much more likely to get read. The other address (your "reply" address) is for stuff that you will definitely reply to if you have read it. Because of the extra load inherent in replying to emails, those ones will get read much more slowly, if at all. But the nice thing is that if you haven't received a reply, you know it's because the email hasn't been read yet, not because it's been read and discarded.
I don't currently get enough email for this to be seriously beneficial, so I've no way of testing it, but even so it's interesting to think about ways to manage the problems of overloading email with so many different kinds of messages.