Gish

While thinking about Merkle versions I realised that there's no easy and commonly accepted way to hash a directory. I've actually had this problem before and I ended up doing some awful thing with making a tarball and then hashing that, but then it turned out that tar files have all sorts of arbitrary timestamps and different headers on different platforms, which made the whole thing a nightmare.

Since I suggested git tree hashing would be a good choice, I thought I'd put my money where my mouth is. It turns out that git doesn't expose its directory tree hashing directly, so you have to actually put the directory into a dummy git store to make it work. That all seemed too hard for most people to use, so I made Gish, which is a reimplementation of git's tree hashing in Node.js.
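For the curious, here's a rough sketch of how git's tree hashing works. It's in Python rather than Node, and it's not Gish's actual code. The function names are made up for illustration. It assumes no .gitignore handling, no line-ending conversion, and that skipping the .git folder and empty directories is enough to match what git would record.

```python
import hashlib
import os
import stat


def git_object_hash(kind: str, body: bytes) -> bytes:
    # Git hashes every object as "<kind> <size>\0<body>" with SHA-1.
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def hash_blob(data: bytes) -> bytes:
    return git_object_hash("blob", data)


def _tree_body(path: str) -> bytes:
    entries = []
    for entry in os.scandir(path):
        if entry.name == ".git":
            continue
        name = os.fsencode(entry.name)
        if entry.is_symlink():
            # A symlink is stored as a blob containing its target path.
            target = os.fsencode(os.readlink(entry.path))
            entries.append((name, b"120000", hash_blob(target)))
        elif entry.is_dir():
            body = _tree_body(entry.path)
            if body:  # git does not record empty directories
                digest = git_object_hash("tree", body)
                # Directories sort as if their name ended in "/".
                entries.append((name + b"/", b"40000", digest))
        else:
            executable = entry.stat().st_mode & stat.S_IXUSR
            mode = b"100755" if executable else b"100644"
            with open(entry.path, "rb") as f:
                entries.append((name, mode, hash_blob(f.read())))
    entries.sort(key=lambda e: e[0])
    # Each entry is "<mode> <name>\0" followed by the raw 20-byte hash.
    return b"".join(
        mode + b" " + key.rstrip(b"/") + b"\0" + digest
        for key, mode, digest in entries
    )


def hash_tree(path: str) -> str:
    """Hex git tree hash of the directory at `path` (assumptions above)."""
    return git_object_hash("tree", _tree_body(path)).hex()
```

Calling hash_tree(".") on a clean checkout with no ignored or untracked files should give the same value as git's tree hash for that commit, though I'd treat any mismatch as a hint that one of the assumptions above doesn't hold for your repo.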

It ended up being one of those "this should only take an hour oh god where did my afternoon go"-type projects, but I'm happy with it all the same. Hopefully it proves useful to someone and, even if not, I know a whole lot more about git trees than I used to.

Merkle versions

I've become a big fan of semantic versioning since its introduction. The central idea is that versions should be well-defined and based on the public API of the project, rather than arbitrary feelings about whether a certain change is major or not. It also recognises the increasingly prominent role of automated systems (dependency management, build systems, CI/testing etc) in software, and that they rely much more than puny humans do on meaningful semantic distinctions between software versions.

But one thing that can be troublesome is being able to depend on the exact contents of a package. Although it's considered bad form and forbidden by some package managers, an author could change the contents of a package without changing its version. Worse still, it's possible that the source you fetch the package from may have been compromised in some way. What would be nice is to have some way of specifying the exact data you expect to go along with that version.

My proposed solution is to include a hash of that data in the version itself. So instead of 1.2.3 we can have 1.2.3+abcdef123456. That hash would need to be a Merkle tree of some kind, so as to recursively verify the entire directory tree. I couldn't find any particular standard for hashing directories, but I suggest git's trees as being one in fairly widespread use. You can find out the git tree hash of a given commit with git show -s --format=%T <commit>.

Two interesting things about this idea: firstly, the semver spec already allows a.b.c+hash as a valid version string, so no spec changes are required. Secondly, because the hash can be deterministically calculated from the data itself, you don't actually need package authors to use it for it to be useful! You could simply update your package installer or build system to check your specified Merkle version against the file contents directly, whether or not it appears in the package's actual version number.
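Here's a minimal sketch of what that check might look like on the installer side, in Python. The function names are hypothetical, and it assumes the part after the + is a plain hex prefix (7 to 40 characters) of the git tree hash, which is my own convention rather than anything in the semver spec.

```python
import re

# Assumption: the build metadata is a bare hex prefix of the tree hash.
_HEX_PREFIX = re.compile(r"[0-9a-f]{7,40}")


def split_merkle_version(version: str):
    """Split 'a.b.c+hash' into (core, hash); hash is None if absent or not hex."""
    core, sep, build = version.partition("+")
    build = build.lower()
    if not sep or not _HEX_PREFIX.fullmatch(build):
        return core, None
    return core, build


def matches_tree(version: str, tree_hash: str) -> bool:
    """True if the version carries a hash and it is a prefix of tree_hash."""
    _, expected = split_merkle_version(version)
    return expected is not None and tree_hash.lower().startswith(expected)
```

The tree_hash argument would come from hashing the unpacked package contents, so the check works even if the author never put a hash in their version number. A version with no hash fails the check here; whether that should be an error or just a warning is a policy choice for the installer.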

It's funny, I never thought of versioning as something that would see much innovation, but I guess on reflection it's just another kind of names vs ids situation. I wonder if there will be a new place for human-centered numbering once it's been evicted from our cold, emotionless version strings.

Imaginary

Certain colours – magenta, for example – are not real in the physical sense. That is, there is no magenta wavelength. In fact, everything on the colour wheel between red and violet, which is all of the purples, only exists in our heads. There's every reason to think that if aliens appeared and we showed them purple, they would say "that's just red and blue!" and laugh at us.

Purple exists because we have our own mental colour system, which is an imperfect mapping of the physical colour system. And this doesn't just mean that we see colours wrong sometimes, or that there are certain colours we can't see, but that there are also colours we can see that never existed at all: imaginary colours. But all of our mappings have this same property; there are certain characteristic edge cases that can lead to imaginary results.

There are a lot of theories for why celebrities often seem to suffer from depression, addiction and public meltdowns. One possibly too easy answer is that we would all act out if we could, but regular people don't have the resources. I'd like to suggest an alternative: empathy. When we see people doing things, we use our empathic system to recreate that feeling in our own minds. But much like colour vision, empathy imperfectly maps the external to the internal.

We sometimes misinterpret feelings, and sometimes feel nothing in a situation where we should have empathy. Is it possible there could be certain imaginary feelings that do not exist except when we feel them second-hand in someone else? I believe so, and I believe one such feeling is fame, or success. The feeling of "now I've made it; I'm here; I did it; I'm great now". We feel this feeling in others, but I don't believe we feel it in ourselves.

So what could be more destabilising than being driven by fame? You see celebrities and successful people and long to feel like they feel. But, of course, you don't know how they feel. And some day, by luck or hard work, you end up like them – and the feeling's not there. What do you do next? Where do you turn if the thing you've been looking for turns out to be an illusion?

SDR

I've been messing around with RTL-SDR lately, which is what led to the ATC feed you see above. I'm pretty impressed with how much you can get done with nothing but a $20 TV tuner and some software. As well as air traffic, I've had some fun moments being reminded that there's still a non-internet radio service, reading pager messages, and listening to the hilarious hijinks of taxi dispatchers.

There's some super serious signal processing stuff you can do using gnuradio, up to and including communicating with recently-resurrected space probes. But most of the software available seems geared for that kind of heavy duty signal processing, with not much in the way of resources for the casual spectrum-surfing enthusiast. The software above is CubicSDR, which is great, but currently limited to analogue FM/AM signals.

It occurs to me that this would be a great area to inflict some hybrid web development on. You could have a nice modular backend in a fast language like C or Go to do the signal processing, and feed that into a JS+HTML frontend. The modularity would make it easy to add new decoding components for things like digital radio, TV and so on, and the HTML frontend would make it easy to create and iterate on different ways to visualise the signals.

Plus, being web-compatible would give you a lot of cool internet things that are currently pretty difficult. For example, an integrated "what is this signal and how do I decode it" database, or a Google Map of received location data. The last piece of the picture is that a sufficiently advanced web UI would solve the cross-platform division that's currently making my life more difficult than it needs to be.

I'm really excited about the potential of SDR. The software is currently just a little bit too awkward to be suitable for general use, but it's so close! Most of the individual components are there; it's just missing a bit of glue, sanding and polish.

The outlier problem

I was really saddened when I learned about Steve Jobs's death, not least because of the circumstances leading up to it. Jobs had pancreatic cancer, normally an instant death sentence, but in his case he had an exceptionally rare operable form. However, Jobs elected not to have surgery, hoping he could cure the cancer with diet and meditation. Unfortunately, he could not, and by the time he returned to the surgical option it was too late.

But the real tragedy isn't just that Jobs died from something that might have been prevented, it's that he died from the very thing that brought him success in the first place: hubris. Jobs had made a habit throughout his career of ignoring people who told him things were impossible, and that's not a habit that normally works out very well. For him, improbably, it worked – very well, in fact – until one day it didn't work any more. This is the essence of what I call the outlier problem.

We often celebrate outliers, at least when they outlie in positive ways. Elite athletes, gifted thinkers, people of genetically improbable beauty. The view from here, huddled in the fat end of the bell curve, gazing up at the truly exceptional, makes them seem like gods. But it's worth remembering that we are clustered around this mean for a reason: it's a good mean. This mean has carried us through generations of starvation, war, exile and death, and we're still here.

It's important not to forget that an exceptional quality is a mutation, no different than webbed toes, double joints, or light skin. Sometimes being an outlier lets you get one up on the people around you and start a successful computer empire. Sometimes it lets you remake the music industry, the phone industry, and the software industry one after the other. And sometimes it means you die from treatable cancer.

I remember Steve Jobs, not as a genius or an idiot, but as a specialist: perfectly adapted to one environment and tragically maladapted to another.