Thursday, September 6, 2007

Compression and Focus

I just happened on Greg Chaitin's 2006 Scientific American paper, The Limits of Reason, which the author was kind enough to publish to his website. I first encountered Chaitin in another SciAm article while in graduate school in the early 90s and I found the nascent topic of algorithmic probability so beautiful and profound that I dragged my boss to see Chaitin talk several hundred miles away at University of New Mexico. He kept dramatically grabbing his bald pate during the talk and gradually coated his head with colors from the whiteboard markers he was using. It was immensely distracting but I still got the gist of his arguments.

The odd thing is that I somewhat discovered the same idea of algorithmic complexity as an undergrad, but only in a very limited and intuitive form. I was in a Philosophy of Language class and was worrying over parsing and halting problems, partly informed by all the Frege and Wittgenstein we had been discussing, and partly driven by Godel, Escher, Bach..., the popular text on AI at the time. For my final report, I speculated that an evolutionary algorithm might be able to partially solve the halting problem by guessing when to halt from the productions of the machine and past experiences with other machines. There would be a probability of success at halting, at the very least, I speculated, and the evolutionary algorithm was a general solution.

Anyway, I ended up having to drop by my prof's office to discuss the paper and I recall descending into a discussion about the teleological language we use to discuss evolutionary processes. His face twisted up when I mentioned GEB and we ended with me querying him over whether he didn't like the book because it was popular. I got an A despite myself, I think.

Then, years later, I encounter Chaitin-Kolmogorov complexity and am blown away by the ideas. I trace back into the application to inference and machine learning and discover Ray Solomonoff's work on the topic, published in a series of technical reports at a small research firm called Zator.

Continuing this journey, I encounter Eliot Sober's philosophical treatments that include discussion of Occam's Razor, AIC, BIC and other ways of negotiating inferencing, which leads me to Minimum Description Length by Jorma Rissanen. Soon thereafter I publish a paper on applications of MDL to semantic analysis, showing how compact coding of data streams into trees has similar properties to methods like Latent Semantic Analysis and provides a general way to explain human grammar learning. My boss at the time coins "compression is truth" paralleling Chaitin's "compression in comprehension."

And here, now, Chaitin again adding new spices to this melange as he continues on with his life's work. I envy his focus.

No comments: