Sam Gentle.com

Names vs IDs

There are a lot of things out there, and to distinguish them from each other it's necessary to have some kind of reference that identifies them. I'd like to make an argument that there are really only two ways to do that: Names and IDs. I believe they are distinctly different, even opposite, and that attempts to mix them in forms like usernames or logins result in name-like IDs that are inelegant, ineffective and user-hostile.

An ID is something that uniquely identifies a resource. It is designed primarily for the use of machines and consequently it is not necessary that it be human-readable or meaningful in any way. A name, by contrast, is a representation of how humans identify things. A thing can have more than one name, a name can refer to different things depending on context, and the right name for something depends on the person.

I first encountered this dichotomy ages back on Freenet, where there were two main kinds of identifiers: Content Hash Keys (CHKs) and Keyword Signed Keys (KSKs). The former are defined by their content, and are thus a perfect kind of ID: that content will always have that ID, and that ID can only ever be assigned to that content. The latter, on the other hand, use keys derived from a simple word like "gpl.txt". That is more like a name, but unfortunately it still has some ID-like semantics. A KSK was expected to map to exactly one resource, even though anyone could define one. The predictable thing happened and someone eventually replaced "gpl.txt" with goatse.
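To make the difference concrete, here is a minimal sketch in Python. It is not Freenet's actual key derivation: the SHA-256 hashing, the in-memory dictionaries and every function name below are assumptions chosen purely for illustration. The point is only that a content-derived key can never point at anything else, while a keyword-derived key belongs to whoever wrote to it last.

```python
import hashlib

# Hypothetical in-memory stores standing in for the network.
content_store = {}
keyword_store = {}

def content_key(content: bytes) -> str:
    """ID derived purely from the content (illustrative, not Freenet's scheme)."""
    return hashlib.sha256(content).hexdigest()

def keyword_key(keyword: str) -> str:
    """Key derived from a human-chosen word; anyone can compute it."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()

def put_by_content(content: bytes) -> str:
    key = content_key(content)
    content_store[key] = content  # same content always lands on the same key
    return key

def put_by_keyword(keyword: str, content: bytes) -> str:
    key = keyword_key(keyword)
    keyword_store[key] = content  # last writer wins, whoever they are
    return key

licence = b"GNU GENERAL PUBLIC LICENSE ..."
prank = b"something else entirely"

chk = put_by_content(licence)
ksk = put_by_keyword("gpl.txt", licence)
put_by_keyword("gpl.txt", prank)  # someone else claims the same word

print(content_store[chk] == licence)  # the content key still means the licence
print(keyword_store[ksk] == prank)    # the keyword key now means something else
```

Nothing stops the second `put_by_keyword` call, which is exactly the weakness: the key looks like a name, but the system treats it as if it were an ID.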

Web addresses are another example of names vs IDs. DNS maps human-meaningful domain names to machine-meaningful IP addresses. An IP address is fairly ID-like, though not perfectly unique (the same site can usually be reached through multiple IPs, though there are interesting ideas to change this). However, what really lets down the internet is the structure of domain names.

Domain names, much like Freenet's KSKs, are an attempt to define a human-readable but still unique name-like ID. The result is something that doesn't function well as either. To prevent the GPL-to-goatse problem, we maintain at great expense a global database of domain-to-IP mappings and treat them as purchasable property. The result? Valve, the multi-billion-dollar videogame company and creators of Steam, the biggest online game store, own neither valve.com nor steam.com. In fact, at the time of writing, neither site has anything on it. What a total failure at representing name-like semantics.

On the other hand, Googling "Steam" or "Valve" will give you the right result. Not only that, but if you're a plumber and you spend all day searching for plumber things, you'll get personalised results that are more likely to suit your plumbery interests. The result is that Google is a better name system than DNS, and user behaviour reflects it. It's very common now to type the name of a website into Google to find it, with sometimes hilarious results.

I believe this is part of the reason for the runaway success of Google. Search, as it's usually defined, is a process of finding information. Queries like "best Beatles album" or "Kim Kardashian's baby" are examples of finding information. But we also use search as a lookup, the same way you would have once looked up a name in a phone book, or a book in a library index. What set Google apart was that it was fast enough, and simple enough, to function not just as a search engine, but as a universal name lookup service.

Other systems can learn a lot from search engines in their name handling. For example, on Facebook you don't really use usernames to refer to people. Instead, you have a user ID that's just a big meaningless string of numbers, and your real-world name. When you want to look someone up, you either have a link to their profile (using their ID) or, more commonly, you just type their name in the search box and they appear. Those names aren't unique, and they're not universal - they change depending on the context and the person doing the searching.

But the thing you have to give up with names is a sense of ownership. If your name is Chris and you join a group with another Chris, there's no trademark dispute resolution process, you just both get called Chris. Or maybe one of you, whoever's better known, gets to be Chris and the other becomes "Other Chris". But these rules aren't designed to protect the primacy of your name as a piece of intellectual property, they're fluid and based around whoever is associated most strongly with the name in a given situation.
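As a rough illustration of that kind of lookup, here is a small Python sketch. It is not how Facebook or any real system works: the `Person` and `Directory` classes, the numeric IDs and the `familiarity` scores are all hypothetical. IDs are opaque and unique, names are shared freely, and the answer to "which Chris?" depends on who is asking.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    id: int      # opaque, unique, machine-facing
    names: set   # human-facing, neither unique nor owned

@dataclass
class Directory:
    people: dict = field(default_factory=dict)
    # familiarity[(viewer_id, person_id)] -> how strongly the viewer
    # is associated with that person (a made-up score for this sketch)
    familiarity: dict = field(default_factory=dict)

    def add(self, person: Person) -> None:
        self.people[person.id] = person

    def lookup(self, name: str, viewer_id: int) -> list:
        """Return everyone with this name, most familiar to the viewer first."""
        wanted = name.lower()
        matches = [
            p for p in self.people.values()
            if wanted in {n.lower() for n in p.names}
        ]
        return sorted(
            matches,
            key=lambda p: self.familiarity.get((viewer_id, p.id), 0),
            reverse=True,
        )

directory = Directory()
directory.add(Person(1001, {"Chris", "Chris Smith"}))
directory.add(Person(1002, {"Chris", "Other Chris"}))

# Viewer 1 mostly knows the first Chris; viewer 2 only knows the second.
directory.familiarity[(1, 1001)] = 5
directory.familiarity[(1, 1002)] = 1
directory.familiarity[(2, 1002)] = 7

print([p.id for p in directory.lookup("Chris", viewer_id=1)])
print([p.id for p in directory.lookup("chris", viewer_id=2)])
```

Neither Chris owns the name. The same query simply returns a different ordering for each viewer, and the ID is what you hold on to once you've picked the right one.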

Most of our name-like IDs, things like usernames and domain names, are compromises because of weaknesses in computers or human-computer interfaces. In many cases, the computing power and system sophistication just weren't there in the early days of software to allow for handling names properly. Usernames date back to the Unix logins of the 70s, and DNS to the 80s. Back then it would have been not only computationally difficult to do proper name searching, but difficult to build UI for doing name lookups that would be responsive enough. And if your only method for exchanging IDs in the real world is writing them down or saying them out loud, it's important that they be memorable.

However, those restrictions are way out of date now, and we have more than enough resources to revisit those compromises. Modern multi-user desktops let you select a user from a list rather than type a login. Modern website lookups are mostly done through Google. And I think other name-like IDs will also lose their relevance as we build new systems that supplant the old. The next big frontier is website logins, which a lot of different companies are trying to own.

My hope is that once this particular internet turf war is over and we leave behind our current balkanised mess for a universal notion of identity, we can take on the big dog of ugly name-like IDs: email. Can you imagine if, instead of messaging an arbitrary series of characters, you just message a person? What a triumph of what over how!