Archive for September, 2005

Binary Encoding

Sunday, September 25th, 2005

On Wednesday, I prematurely aborted the discussion of coding. I’ve gotten a couple of emails on this now, the most recent one from Richard.

To remind you, we had a four-symbol code, consisting of H, A, L, and _. If these symbols were equally distributed, the number of bits required per symbol would be 2. We know this because the probability of each character is .25. So the number of bits required is

H = - ( .25 log2 .25 +
           .25 log2 .25 +
           .25 log2 .25 +
           .25 log2 .25 )

or 2.
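If you want to check that arithmetic yourself, here is a minimal sketch in Python. The function name `entropy_bits` is just an illustrative label of mine, not anything we used in class.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits: -sum(p * log2(p)), skipping zero-probability symbols."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Four equally likely symbols: H, A, L, and _
uniform = [0.25, 0.25, 0.25, 0.25]
print(entropy_bits(uniform))  # 2.0
```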

It’s pretty easy to figure out how to code it into 2 bits:

H - 00
A - 01
L - 10
_ - 11

But it is not true that, for example, E and Z are equally likely to occur in an English text. E appears much more frequently. So, let’s say we observed a pattern that looked like

HAHHHLA_HHAHH_HALH_AHHH_HAHLHAHALH_HALAH

In practice, of course, you would want to have a lot more to go on, but let’s say this was enough to assume it was representative of the language. In this case, of 40 total characters, 20 are H, 10 are A, 5 are L, and 5 are _. Now, our probabilities are different, and so the number of bits required is also different:

H = - ( .5 log2 .5 +
           .25 log2 .25 +
           .125 log2 .125 +
           .125 log2 .125 )

Which ends up as 1.75 bits. How do you fit four symbols into 1.75 bits? Well, the trick is that a message fits in–on average–1.75 bits per character. That means that at least one character needs to fit into a single bit. In our case, the bit we would choose is clear: the H. H appears far more frequently than the other characters, and so it can be compressed more.
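The counting and the 1.75 figure can be reproduced with a short sketch like the one below. It assumes the 40-character sample above really is representative, and the variable names are my own.

```python
import math
from collections import Counter

sample = "HAHHHLA_HHAHH_HALH_AHHH_HAHLHAHALH_HALAH"

counts = Counter(sample)   # how often each symbol appears
total = len(sample)        # 40 characters in all
probabilities = {symbol: n / total for symbol, n in counts.items()}

# Entropy in bits per character, using the observed frequencies as probabilities
entropy = -sum(p * math.log2(p) for p in probabilities.values())

for symbol, n in counts.items():
    print(symbol, n, probabilities[symbol])
print(entropy)  # 1.75
```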

If I recall, a first attempt looked like this:

H 0
A 01
L 001
_ 0001

The problem here is that if H is 0, no other code can begin with 0: the code for H can’t be the start of the code for A (or for L or _). Otherwise, try to decode this:

0010001

It could be decoded in a few ways:

HAHHA
LHHA
etc.

And that is no good at all.
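To see just how ambiguous that first attempt is, a small brute-force sketch can list every way of splitting the bit string into codewords. The helper `all_decodings` is a hypothetical name of mine, and it simply tries every codeword at every position.

```python
def all_decodings(bits, code):
    """Return every symbol sequence whose codewords concatenate to `bits`."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # Use this codeword, then decode whatever is left in every possible way
            for rest in all_decodings(bits[len(word):], code):
                results.append(symbol + rest)
    return results

ambiguous_code = {"H": "0", "A": "01", "L": "001", "_": "0001"}
readings = all_decodings("0010001", ambiguous_code)
print(len(readings))  # 6
print(readings)
```

Six different messages from seven bits: the receiver has no way of knowing which one was meant.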

Consider this instead:

H 0
A 10
L 110
_ 111

So to encode HAH_LAH you would end up with

0100111110100
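Encoding with this table is just a lookup and a join. The sketch below (the names `prefix_code` and `encode` are mine) also runs the 40-character sample from earlier through the code to check the average length.

```python
prefix_code = {"H": "0", "A": "10", "L": "110", "_": "111"}

def encode(message, code):
    """Concatenate the codeword for each symbol in the message."""
    return "".join(code[symbol] for symbol in message)

print(encode("HAH_LAH", prefix_code))  # 0100111110100

# The sample from earlier: 20 H, 10 A, 5 L, 5 _
sample = "HAHHHLA_HHAHH_HALH_AHHH_HAHLHAHALH_HALAH"
encoded = encode(sample, prefix_code)
print(len(encoded) / len(sample))  # 1.75 bits per character on average
```

For this particular sample, the average works out to exactly the 1.75 bits per character that the entropy calculation gave.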

You will see that when you chain these together (and you know where to start), you have only a single way of decoding.

Consider that chain.

We start with a 0 which can only be an H. It can’t be anything else, because H is the only character to start with a 0.

So we have H.

The second bit is a 1. Actually, this could be A, L, or _, so we’ll keep going.

The third bit is a 0, meaning we are up to 10. There is only one character that starts 10, so we know it is an A. We have HA, so far.

The fourth bit is a 0. Again, only an H starts with a 0, so we now have HAH.

The fifth bit is a 1. There are still three possible characters that start with a 1 bit: A, L or _. So we need to look at the next one.

The sixth bit is a 1. There are two possible characters that start with 11: L and _. So we need to go to yet another bit to figure it out.

The seventh bit is 1. So we have 111 and _ is the only possibility.

You get the idea.
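The bit-by-bit reading above can be sketched as a loop that collects bits until they match a codeword. This is one simple way to do it, not the only one, and it only works because no codeword is the start of another.

```python
prefix_code = {"H": "0", "A": "10", "L": "110", "_": "111"}

def decode(bits, code):
    """Read one bit at a time; emit a symbol as soon as the buffer matches a codeword."""
    by_codeword = {word: symbol for symbol, word in code.items()}
    message, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in by_codeword:
            message.append(by_codeword[buffer])
            buffer = ""  # start fresh for the next character
    if buffer:
        raise ValueError(f"leftover bits that match no codeword: {buffer!r}")
    return "".join(message)

print(decode("0100111110100", prefix_code))  # HAH_LAH
```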

Another way to think about the code is as a binary tree:

  0 /\ 1
    /  \
 H  0 /\ 1
       /  \
     A  0 /\ 1
           /  \
         L     _
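One way to play with the tree is to write it as nested pairs and read the codewords back off it. In this sketch (my own representation, chosen for brevity) each fork is a `(zero_branch, one_branch)` pair and each leaf is a symbol.

```python
# Each internal node is a (zero_branch, one_branch) pair; each leaf is a symbol.
tree = ("H", ("A", ("L", "_")))

def codewords(node, path=""):
    """Walk the tree, recording the 0/1 path that leads to each leaf."""
    if isinstance(node, str):
        return {node: path}
    zero_branch, one_branch = node
    table = codewords(zero_branch, path + "0")
    table.update(codewords(one_branch, path + "1"))
    return table

print(codewords(tree))  # {'H': '0', 'A': '10', 'L': '110', '_': '111'}
```

Because symbols sit only at the leaves, no codeword can be the beginning of another one.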

This idea is most directly important to schemes for efficiently encoding data; but it turns out that encoding data, as we have been reading, may have importance beyond digital computing machines and communication systems. It may be the stuff that the universe and life are made of.

The next step, which we did not get to, is to look at larger patterns. If we can save a quarter-bit by looking at single letters, what if we go to pairs? Q is followed by U far more often than by any other character. Therefore, the amount of information required to encode the pair is reduced. We can find similar regularities in bunches of 3, 4, 5, and 5000 characters (though we have diminishing returns for encoding).
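As a rough illustration of the pairs idea, the sketch below counts adjacent pairs in our toy sample and compares the bits needed for a character on its own with the bits needed once the previous character is known. The sample is far too small to say anything about a real language, and the function and variable names are my own.

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Shannon entropy, in bits, of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

sample = "HAHHHLA_HHAHH_HALH_AHHH_HAHLHAHALH_HALAH"
pairs = list(zip(sample, sample[1:]))  # overlapping pairs of adjacent characters

pair_entropy = entropy_bits(Counter(pairs))                         # bits per pair
first_entropy = entropy_bits(Counter(first for first, second in pairs))
second_entropy = entropy_bits(Counter(second for first, second in pairs))

# Bits needed for a character once the character before it is known
conditional_entropy = pair_entropy - first_entropy

print("character alone:        ", second_entropy)
print("given the previous one: ", conditional_entropy)
```

Knowing the previous character can never make the next one harder to predict on average, so the second number is never larger than the first.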

Remember, Morse’s original idea was to encode whole messages to a particular number. This would be far more efficient than Morse code, and was a coding scheme used by the ancient Greeks (among others). He then thought about a dictionary, with a number for each word in the dictionary. This too would be a more efficient coding, though it would require more work on either side to encode/decode.

Finally, I was ambitious and hoped that we would get to talk a bit about finite state machines. The tree graph above shows how it would be easy to create a simple finite state machine to decode the string of 1s and 0s and form a message. Unfortunately, some portion of the class required time to grok. I do encourage you to work through some of these issues on your own, and for those who are reading from the Com Theory class, we’ll go beyond this a bit to talk about evolutionary models of finite state machines.
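For those working through it on their own, here is a minimal sketch of such a machine for the H/A/L/_ code. The state names are hypothetical labels for the forks in the tree, and strictly speaking this is a finite state machine that also emits output as it goes.

```python
# Transition table: (state, bit) -> (next_state, emitted symbol or None).
# The state names are made-up labels for the tree's internal nodes.
TRANSITIONS = {
    ("start", "0"): ("start", "H"),
    ("start", "1"): ("seen_1", None),
    ("seen_1", "0"): ("start", "A"),
    ("seen_1", "1"): ("seen_11", None),
    ("seen_11", "0"): ("start", "L"),
    ("seen_11", "1"): ("start", "_"),
}

def run_machine(bits):
    """Feed bits through the machine, collecting each symbol it emits."""
    state, output = "start", []
    for bit in bits:
        state, symbol = TRANSITIONS[(state, bit)]
        if symbol is not None:
            output.append(symbol)
    if state != "start":
        raise ValueError("input ended in the middle of a codeword")
    return "".join(output)

print(run_machine("0100111110100"))  # HAH_LAH
```

Three states are enough here: one for each fork in the tree.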

Job blogging

Saturday, September 24th, 2005

I just erased a long entry about why I was leaving UB, and what I am looking for in a new job. It took a while to write, and it was probably good for me to write it out. But it was also probably too honest to do me any good. Transparency has its limits, and blogging about the reasons you are leaving your current position is probably among those limits. Besides, I wouldn’t want a future employer to think I was willing to air the department’s dirty laundry. Like most academics, I am convinced that the petty politics of the university would make an excellent novel. And like most, I’m smart enough not to write that novel.

Basically, I want to work in a place full of people who remind me of who I want to be: people who are creative, who challenge the status quo, and who care passionately about the world they live in and the work they do. I think it is fair to say that while I respect my colleagues here at UB, I don’t feel like they get excited about the kinds of things I get excited about. I know that there are people out there who do — I’ve been lucky enough to meet many of them, in part through this blog. All I’m looking for now is a place where they congregate, and a home for me to do exciting work. I am looking for fertile soil to land on. So far, I haven’t seen a lot of promising landing spots.

Contrary to what I had hoped, I will have to be circumspect in my blogging of the process. My new blogging rule has become “blog about what you are doing, not what you are going to do.” So what I am doing right now is trying to figure out what sort of place would let me grow and find my contributions appreciated. Oh, that and seeking out some letters of recommendation for a couple of academic postings that look promising. Once I know more for sure, I’ll be sure to share it with you all.

In the meantime, if you have ideas, let me know. Generally, I am looking for something that is in New York or within an hour or two of commute time (or largely telecommutable). If it is something really cool, I might be willing to break that barrier. It need not be an academic posting; I am certainly interested in the possibility of working in a more entrepreneurial atmosphere on either side of the ivory curtain. It needs to be something that represents a shifting challenge, and that engages me in some form of problem solving, because, frankly, boredom is my greatest enemy. And it needs to be with people who are passionate about changing the world, and able to do so. Is that too much to ask?

Blogs/Wiki Workshop

Friday, September 23rd, 2005

Not a lot of people are local readers, but if you happen to be one, I’m doing a workshop for the Educational Technology Center at the University at Buffalo on Weblogs and Wikis this coming Wednesday (9/28) from 10am to noon. Register (for free!) at the ETC site. I’ll post related materials as soon as… um… I make them.

Generalized Video

Friday, September 23rd, 2005

As I got on my short Jet Blue flight home Wednesday night, nearly every monitor was re-(re-re-) playing the landing of the stuck-gear jet at LAX. I had been in class, and not following the event. I mentioned to the flight-crew that having a “crash landing” Jet Blue jet on the displays was probably not the best idea if they had any nervous fliers, but they (the crew) were all too glued to the set, and those who weren’t were busily calling people to let them know that they were not doing the cross-country run.

Photo of Jet Blue landing; via Gawker

(I guess flight crews opt for a variety of runs, but the cross-country one is better for them because they only get paid for time in the air. The short Buffalo-NYC run — which they might make several times a day — basically pays half as much per hour because of the time spent on the ground in each city. However it works out, crews on our short flight might be going to Miami or LA on another day, and they wanted to let everyone know that they hadn’t this time.)

What I hadn’t thought about as much, until noted by Earth Wide Moth, was what one passenger called the surreal experience of watching the landing unfold via the seat-back monitors from inside the plane itself. There is something slightly uncanny about that experience I suppose, but also something fairly emblematic. It’s not unique: there is a standard cliche that shows up as a comedic moment when people are watching the news at home and seeing a house surrounded, only to slowly discover that it is their own. But it seems that situation is creeping outward.

The other extreme might be what appears in “Strange Days.” In the film, “users” (and the relationship to drug users is played out) of a device called a SQUID (Super-conducting QUantum Interference Device - basically a bunch of electrodes placed on the head that allow direct access to the sensory portions of the brain) trade experiences that have been recorded by others. The character played by Ralph Fiennes discovers a trade in snuff recordings. At one point, a character is forced to watch her own death, from the eyes of her killer, and this is recorded for later viewing.

Camera phones are already near ubiquitous, and those capable of streaming video are fairly widespread. I’m waiting for the other shoe to drop. The “surreality” of the Jet Blue experience was due to the simple fact that a video screen with network access was in front of passengers. But we are already approaching a time when that is the norm rather than the exception. What will it mean when mirrors are everywhere? When everything we do will be judged by how it is captured and viewed by ourselves-as-others?

I am reminded of someone who, when it rained on her outdoor wedding, had the guests come back a second day to capture the wedding as she had envisioned it. They went through an empty ceremony, as a simulation of the way the “real” wedding should have happened, while the videographer recorded their perfect wedding.

Of course, maybe we always see ourselves through the eyes of, to use Mead’s phrase, the “generalized other.” But by making that view transparent, does it mean we can more easily step into the shoes of the other? Or does it mean a new era of narcissism, when we no longer need to empathize to understand what others see, we need only turn on our TV?

Avast, Waisters!

Monday, September 19th, 2005

Ahoy, me buckos! I be a mite late on the orders for the weeks to come. Arrrrrr… Here be a few tales, to be followed by more come high water, or to be sure I’ll be feeling rope’s end from the lot of ye’. (Aye, it is International Talk Like a Pirate Day.)

September 28 – Structural functionalism and its critique
* Lasswell, H. D. (1953). The structure and function of communication in society. In L. Bryson (Ed.), The communication of ideas. New York: Harper & Co.
* Mills, All, but pay special attention to c. 1-4, and read the appendix.
6:00pm: Axel Bruns on “prosuming.”

October 5 – Direct Effects
* Lowery & Defleur, c. 1, 2, 3, 4, 7, 8
* Cantril, H. with Gaudet, H., & Herzog, H. (1940). The invasion from mars: A study in the psychology of panic. Princeton: Princeton University Press. Selection. (Optional: listen to broadcast.)
* Ellul, J. (1965). Propaganda. New York: Vintage. Selection.

October 12 – Limited Effects
* McCombs, M., & Shaw, D. L. (1972). The agenda-setting function of mass media. Public Opinion Quarterly, 36, 176-187.
* Gerbner, G., & Gross, L. (1976). Living with television: The violence profile. Journal of Communication, 26, 172-199.
* Iyengar, S., & Kinder, D. (1987). The priming effect. News that matters: Television and American opinion. Chicago: University of Chicago Press.
* Katz, E., Blumler, J., & Gurevitch, M. (1974). Uses of mass communication by the individual. In W.P. Davison, & F.T.C. Yu (Eds.), Mass communication research: Major issues and future directions (pp. 11-35). New York: Praeger.
* Lowery & Defleur, c. 5, 11-14

October 19 – Conceptualizing Information
* Campbell, all.

Next couple weeks

Monday, September 19th, 2005

I’ll have a few more weeks up later in the week. This coming week provides an example of why I wanted some flexibility. While we will talk about Baeyer’s definition of information and its place in science and society this week, we are going to put off the idea of information more common to information sciences for a week. Why? Because the esteemed Dr. Axel Bruns is giving a presentation next week on “prosuming” that — while it comes at the wrong time of the semester — is germane to the class. Also, an online conference starts today that discusses the impact of the new “Real ID” legislation. Again, we are not talking about social issues as much until the second half of the course, but this is worth taking the time out now to discuss.

September 21 – Information as a concept
* Baeyer, chapters 1-18.

September 28 – Special Week: REAL ID conference and Axel Bruns presentation
* Follow (or engage in) the discussion at an online conference on Real ID, from September 19 through 23
* NPR on Real ID
* UPI Article on Real ID
* CNET FAQ on Real ID

You are encouraged to attend a presentation by Axel Bruns: 6pm, CFA, DMS 235. We will begin class after his presentation, most likely at about 7:30, and discuss issues raised both in his talk and in the REAL ID conference.

* Abstract for the Bruns talk
* Boston Globe, “Are you a prosumer?”
* Wikipedia: Prosumer

Be sure to post your opinions and reactions to the items above.

October 6 – Information as Thing
* Bates, M. J. (1999). The invisible substrate of information science. Journal of the American Society for Information Science, 50(12), 1043-1050.
* Buckland, M. (1991). Information as thing. Journal of the American Society for Information Science, 42(5), 351-360.
* Kidd, A. (1994). The marks are on the knowledge worker. Proceedings of the SIGCHI conference on Human factors in computing systems: celebrating interdependence. Boston.

[UPDATE: fixed date.]

It’s too damn hot.

Tuesday, September 13th, 2005

Weather.com says it is only 85 degrees out, but whether because I am tired or because of the heat, I can’t seem to think. When we moved into our place, we didn’t put in air conditioners (the building is c. 1928, so no central air) because, (a) it’s September (!); (b) having paid to move, and for our broker’s upcoming birth, and for staying in the city to find a place, we are not only hugely in debt (nothing new there), but also have choked our cash flow; and (c) in order to put in a window air conditioner, we would have to get an approved, bonded installer, with special insurance, yada-yada, rather than just dropping our existing cheapo window units in. And so, no work for me. This, by the way, is why I initially chose working in Buffalo over trying for a position in Singapore: my brain shuts down when not within the 65-71 degree range.