Sam Gentle.com

Are categories useful?

I remember reading some time ago about the Netflix Prize, a cool million dollars available to anyone who considerably improved on Netflix's movie recommendation algorithm at the time. Of course, the prize led to all sorts of interesting techniques, but one thing that came out of it was that none of the serious contenders, nor the original algorithm (i.e. the actual Netflix recommendation engine), used genres, actors, release years or anything like that. They all just relied on raw statistics, for which the category information was a very poor approximation.

So I wonder, if it's true for Netflix, is it true for everything? The DSM-5, effectively the psychiatry bible, caused a bit of controversy, at least partially because of its rearrangement of diagnostic categories. What was once Asperger's is now low-severity autism, and many other categories were split further or otherwise changed. However, the validity of a particular treatment for particular symptoms hasn't changed (or, if it has, not because the words in the book are different now).

Medical diagnostics seems to mostly be a process of naming the disease, and then finding solutions that relate to that name. However, that process can take a long time and doesn't always work. Maybe it would be better if we got rid of the names, and used some kind of predictive statistical model instead. You'd just put in as much information as you can and be told what interventions are most likely to help. The medical landscape would certainly look pretty interesting, but I suspect not in a way that doctors or patients would find reassuring, even if it did result in better outcomes.

Ultimately, that reassurance seems like the point of categories. They're not good for prediction compared with other methods, and often they're plagued by disagreements over whether a particular edge case fits the category or not. However, the alternative would mean putting our faith in pure statistics, and I'm not sure people are ready for that.

Can you imagine a world where we don't categorise things? Where you don't need to determine if something is a chair or not, just whether it's likely you can sit on it? You wouldn't be considered a cat person, just someone statistically likely to be interested in a discussion about feline pet food. Maybe we could all get used to predicting outcomes, rather than needing to understand the internal system that leads to those outcomes. It sure would make life a lot simpler.

But I doubt that's going to happen any time soon.