I gave an oral exam in my graduate theory seminar yesterday, which included a very simple question (that most of the class, shockingly, got wrong) asking students to figure out how many bits were required to represent a particular piece of information. At lunch today, I asked my wife (the law student) what the answer was, and she got it immediately, which got us talking about encoding information.
How is it that when you “zip” a file, it gets smaller? Assuming that you are smooshing something in such a way that you get back the exact original (lossless compression), you look for regularities, and represent them in some way that eats up less space. If, for example, I have ten spaces in a row, it takes up less room if I store “ten spaces” than if I try to store “space space space space space space space space space space.” In other words, although there may be an absolute minimum needed to transmit a certain amount of information, there are ways to compress regularities.
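The "ten spaces" trick is usually called run-length encoding. Here is a minimal sketch in Python; the function names are made up for illustration, and real compressors such as zip use considerably more elaborate schemes.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the exact original string."""
    return "".join(ch * count for ch, count in pairs)


message = "a" + " " * 10 + "b"
encoded = rle_encode(message)
print(encoded)  # [('a', 1), (' ', 10), ('b', 1)]

# Lossless: decoding gives back exactly what we started with.
assert rle_decode(encoded) == message
```

Note that text with no repeated characters gets longer under this scheme, since every character now carries a count. Compression only pays off when the regularities are actually there.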
Those regularities may be inherent to the content of the message being sent. The “space” example is a good one. Spaces are encoded as a particular binary pattern, and if this pattern repeats ten times in a row, there is a way to represent this. The same can be said of blocks of color in an image, for example. Or, if you notice that the word “the” shows up a lot in English text — even if you have no idea of what “the” means — you might choose to give it its own code. This can be worked up to several levels.
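Giving a frequent word like "the" its own code can be sketched in a few lines of Python. This is only a toy, assuming words are separated by single spaces and using small integers as the short codes; the helper names are hypothetical.

```python
from collections import Counter


def build_codebook(text, size=1):
    """Assign a small integer code to each of the `size` most frequent words."""
    counts = Counter(text.split())
    common = [word for word, _ in counts.most_common(size)]
    return {word: index for index, word in enumerate(common)}


def encode(text, codebook):
    """Replace words found in the codebook with their integer code."""
    return [codebook.get(word, word) for word in text.split()]


def decode(tokens, codebook):
    """Turn integer codes back into words and rejoin with single spaces."""
    reverse = {index: word for word, index in codebook.items()}
    return " ".join(reverse[t] if isinstance(t, int) else t for t in tokens)


text = "the cat saw the dog and the bird"
codebook = build_codebook(text)
tokens = encode(text, codebook)
print(codebook)  # {'the': 0}
print(tokens)    # [0, 'cat', 'saw', 0, 'dog', 'and', 0, 'bird']
assert decode(tokens, codebook) == text
```

The encoder never needs to know what "the" means, only that it shows up often. The catch is that the receiver needs the same codebook to decode the message.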
But this hints at the second type of compression: compression that relies upon a history of communication. Rather than repeating everything you need to know, I can rely upon earlier information sharing, and say “what I said last week” or “let’s go to plan B.” Or, famously, I can just decide to hang up a single lantern or two, to show whether an attack is by land or sea. In this case, the world of possible messages is pretty much contained (no chance of an air attack), and so there really is just 1 bit of information (or, arguably, 1.58 bits) needed to transmit the message. One could imagine that after numbering all of the words in English, as well as some short phrases, it would be possible to compress English even further, due to redundancies and pattern regularities. On the assumption that we rarely say much that is new, we could even do word pairs or word triples, encoded by their probability of occurring, and easily store these on our large hard drives to reduce the size of communicated texts. This might begin to make sense if you are talking about storing, for example, the Library of Congress. Citation of earlier work is, in some strange sense, compression. So is a hyperlink. So are words.
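To put numbers on the lantern example: telling apart N equally likely messages takes log2(N) bits. The sketch below assumes the "1.58 bits" figure comes from counting "no lantern at all" as a third possible signal, which is one reading of the example rather than anything established here. The function names are hypothetical.

```python
import math


def bits_of_information(num_messages):
    """Bits needed to pick one of num_messages equally likely messages."""
    return math.log2(num_messages)


def whole_bits(num_messages):
    """Smallest whole number of bits that can label every message."""
    return (num_messages - 1).bit_length()


# Two possible messages: "by land" or "by sea".
print(bits_of_information(2))            # 1.0

# Three possible messages, if "no lantern" also counts (an assumption).
print(round(bits_of_information(3), 2))  # 1.58
print(whole_bits(3))                     # 2
```

A fractional number of bits looks odd for a single message, but it makes sense on average: five three-way signals give 243 combinations, which fit in 8 bits, or 1.6 bits per signal.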
When you remove internal regularities, to the greatest degree possible, the resulting output should contain no regularities at all. It should be random. After all, if there is a regularity, it is redundant. Now, there are very good reasons to be redundant. Redundancy allows you better chances to overcome a noisy channel. Although “F” is easier and quicker to say over a noisy radio than “foxtrot,” it is also much easier to confuse “F” with “S”, or even potentially “M”, “N”, or “X”. Computer communications often throw in a bit or two to make sure that everything “adds up” at the other end. Heck, even DNA may have error correcting codes. So, there may be an internal regularity that is added specifically to overcome noise.
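The "bit or two to make sure that everything adds up" is, in its simplest form, a parity bit. Here is a minimal sketch in Python, assuming even parity and a message held as a list of 0s and 1s; the function names are made up for illustration.

```python
def add_parity(bits):
    """Append one extra bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def looks_intact(received):
    """True if the received bits still contain an even number of 1s."""
    return sum(received) % 2 == 0


sent = add_parity([1, 0, 1, 1])
print(sent)                    # [1, 0, 1, 1, 1]
print(looks_intact(sent))      # True

# Simulate noise on the channel by flipping one bit.
garbled = sent.copy()
garbled[2] ^= 1
print(looks_intact(garbled))   # False
```

One parity bit only detects an odd number of flipped bits and cannot say which bit went wrong. Codes that actually correct errors add more redundancy than this.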
Which brings us to SETI. Now, let me begin by saying that I have no idea what SETI really does, I have zero background in signal processing, and I am in this area (as in many) an ignoramus. That’s why I’m blogging about it! But I think that many people casually assume that SETI is aiming to “eavesdrop” on alien TV programs, Kang and Kodos style. SETI is pretty clear that’s not really what they are about. They are, instead, looking for an alien transmission that is intended for us, or other aliens like us.
So do we (or they) create a message that is as non-random as possible? This would be a message that says little more than “we exist as an intelligent being.” What sort of a message might that be? Well, a really obvious one, because it is redundant, is something like “space space space space space” or maybe a binary equivalent of “11111111111”. Depending on what pattern you assign to that 1, it may be better to do something like “1010101010101010”. But the trick is, we aren’t sure what pattern they might generate. It could be just about anything. So we are forced to look for any redundant message.
Only, redundancy is in the eye of the beholder. Is the rotation of a star (on-off-on-off) an intelligent sort of thing? Well, no. It’s not complex enough to be thought of as intelligent. So what will aliens think if we just shoot off a rotating laser beacon, a flashing “Eat at Earth” sign? No, we want something both complex and redundant. Like Mozart. Or fancypants math.
Mozart has a lot of redundancy, or at least his music does. It tends to restrict itself to a handful of frequencies, and return to patterns of those frequencies regularly. It keeps to a particular period, or some multiple of that period. So maybe Mozart, in a raw and uncompressed form, is a good test for intelligent life. Ah, nice work: no bird-brained alien can come up with that melody!
But what if that level of redundancy just doesn’t resonate with an alien intelligence? What if it is so regularized that they assume that it is a natural phenomenon? The idea, of course, is not to come up with a “supernatural” signal. No such thing. We are natural. We’re just smarter than whales. And we want to find other creatures in the universe that are smarter than whales, but hopefully not much smarter than us, because otherwise they will think us dreadfully boring creatures that might be fun to eat. You know, like whales. But the trick is, we don’t really know what we mean by “intelligent.”
I have a feeling one thing we mean by intelligent is able to converse with us. That is, if we suddenly got a signal from space that, when placed on a 1000 x 1000 grid, was a very clear representation of a circle, well, we would be set. Whales don’t digitize circles. That’s a “higher order thinking” sort of thing; a pure math thing. It’s also something we would expect to be extremely unlikely to occur naturally. The trick is, would people 150 years ago have realized what to do with that information? And if we can’t even prove our intelligence to ourselves, six generations removed, then we have some real problems demonstrating our “intelligence” to creatures from another world. We have a shot at recognizing socialized humans, maybe, but why are we so convinced that other intelligent creatures will think like we do, when our thought has ontogenetically and phylogenetically been shaped by a very particular environment? And what if they are mathematically illiterate? Are they no longer worth talking to?
The real proof of intelligence remains the Turing test. I have a feeling that a one-way Turing test is what these folks have in mind. When we play them Mozart, they are intelligent if they recognize it as intelligent. We are on, to some extent, the same wavelength. There are computer programs that generate symphonies — symphonies that may well register as “intelligent” to many listeners — but in any event, these are far more complex (perhaps) than the simple calls of animals or naturally occurring songs. We made the symphony, even if we employed complex tools to create it.
What we really need, to determine whether a message is intelligent, is to see both an input and an output; a processing that suggests learning. No single message, no matter how complex, will self-encode enough information to be meaningful on its own. Instead, we need the back and forth of conversation. And I am guessing that by the time we can converse, we will no longer be in a position of guessing whether we have found alien life. We’ll know it when we see it.