Noise dictionary

I had an interesting idea today while watching a movie. It's notoriously difficult to compress noise because it has no underlying structure to exploit. Unfortunately, certain real-world sounds resemble random noise, like percussion, applause, rain, and even guitar strings. The more they sound like random noise, the harder they are to predict, and the harder it is to compress that audio effectively. You can sometimes hear this in badly-encoded movie files, where running water sounds garbled.

The essential problem is that you can't compress randomness. So what if we make it non-random? It's quite common to create data that looks random but follows a predictable pattern if you know the initial seed value; this is how pseudorandom number generators work. If we created a standard for predictable noise seeds – a kind of noise dictionary – sound effect artists could create sounds using noise sources that sound exactly the same as what they would use otherwise, but are far more predictable. Creators of audio compression formats would be able to use that same dictionary to compress the noise more effectively.

That wouldn't just mean smaller files. It would also mean higher-fidelity reproduction of noise-like sounds, a current blind spot for audio codecs.