Sunday, July 29, 2007

Semantics and Sonny Bono


"Bono and the tree became one"

That sentence has been an object of scrutiny for me over the past several weeks. It is short enough and the meaning seems fairly easy to digest: Sonny Bono died in a skiing accident. It might have shown up in a blog back when the event transpired, or in casual conversation around the same time.

So what is so fascinating about it? It is the range of semantic tools that are needed to resolve Bono to Sonny Bono and not to U2's Bono or any of the thousands of other Bonos that likely exist. First, we need background knowledge that Sonny Bono died in a skiing accident. Next we need either the specific knowledge that a tree was involved or the inference that skiing accidents sometimes involve trees. Finally, we need a choice preference that rates notable people as more likely to be the object of the discussion than everyday folk.

We could still be wrong, of course. The statement might be about Frank Bono, a guy from down the street who likes to commune with nature. It might be, but for a statement in isolation the notability preference serves a de facto role as a disambiguator.

How, then, can we design technology to assign the correct referent to occurrences like Bono in the text above? We have several choices and the choices overlap to varying degrees. We could, for instance, collect together all of the contexts that contain the term Bono (with or without Sonny), label them as to their referent, and try to infer statistical models that use the term context to partition our choices. This could be as simple as using a feature vector of counts of terms that co-occur with Bono and then measuring the vector distance between a new context vector (formed from the sentence above) and the vectors for the existing assignments.
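As a rough illustration of that co-occurrence idea, here is a minimal sketch in Python. The labeled contexts, the stopword list and the function names are all made up for the example, and cosine similarity is just one reasonable choice of vector distance; a real system would train on many thousands of labeled mentions.

```python
from collections import Counter
import math

# Tiny stopword list, chosen only for this illustration.
STOPWORDS = {"a", "and", "the", "in", "on", "with", "after", "while"}

def context_vector(text, target="bono"):
    """Count the terms that co-occur with the target name in one context."""
    tokens = [t.strip('.,"\'').lower() for t in text.split()]
    return Counter(t for t in tokens
                   if t and t != target and t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical hand-labeled contexts (invented for the example).
LABELED = {
    "Sonny Bono": [
        "Sonny Bono died after hitting a tree while skiing",
        "Sonny Bono struck a tree in a skiing accident",
    ],
    "Bono (U2)": [
        "Bono sang with U2 on tour",
        "U2 singer Bono released a new album",
    ],
}

def centroid(contexts):
    """Sum the count vectors of all contexts that share a label."""
    total = Counter()
    for c in contexts:
        total.update(context_vector(c))
    return total

def disambiguate(sentence):
    """Return the label whose contexts look most like the new sentence."""
    query = context_vector(sentence)
    scores = {label: cosine(query, centroid(ctxs))
              for label, ctxs in LABELED.items()}
    return max(scores, key=scores.get)

print(disambiguate("Bono and the tree became one"))
```

With these toy contexts the only shared term is "tree", which is enough to pull the sentence toward Sonny Bono. Note that nothing here knows why trees and skiing accidents go together; it only counts words.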

We could also try to create a model that recreates our selection preferences and the skiing <-> tree relationship and does some matching combined with some inference to try to identify the correct referent. That is fairly tricky to do over the vast sea of possible names, but is easy enough for a single one, like Bono.

It turns out all of these approaches have been tried, as well as interesting hybridizations of them. For instance, one can express the notability preference as a probability weighting based on web search mentions, while adding in the distance between different concepts in a tree-based ontology, thereby exploiting human-created semantic knowledge to assist in the process. It also turns out that fairly simple statistics do pretty well over large sets of names (just choose the most likely assignment all the time), but don't really capture the kinds of semantic processing that we believe we undertake in our own "folk psychologies" as described above.
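To make the hybrid concrete, here is a small sketch in Python of one way such a combination might look. Everything in it is an assumption for illustration: the mention counts are invented (not real search statistics), the ontology is a toy seven-node tree, and multiplying the prior by 1/(1 + path distance) is just one simple way to combine the two signals, not a method taken from any particular system.

```python
# Toy tree-based ontology: each concept points to its parent (root has None).
PARENT = {
    "entity": None,
    "outdoors": "entity",
    "music": "entity",
    "skiing": "outdoors",
    "tree": "outdoors",
    "rock_band": "music",
    "concert": "music",
}

# Hypothetical candidates: (invented mention count, associated concept).
CANDIDATES = {
    "Bono (U2)": (500, "rock_band"),
    "Sonny Bono": (400, "skiing"),
    "Frank Bono": (1, "outdoors"),
}

def ancestors(node):
    """Return the chain [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain

def path_distance(a, b):
    """Number of edges between two concepts via their lowest common ancestor."""
    depth_from_a = {n: i for i, n in enumerate(ancestors(a))}
    for j, n in enumerate(ancestors(b)):
        if n in depth_from_a:
            return depth_from_a[n] + j
    raise ValueError("concepts do not share a root")

def prior(name):
    """Notability prior: this candidate's share of all mentions."""
    total = sum(count for count, _ in CANDIDATES.values())
    return CANDIDATES[name][0] / total

def score(name, context_concept):
    """Prior weighted by closeness of the candidate's concept to the context."""
    _, concept = CANDIDATES[name]
    return prior(name) / (1 + path_distance(context_concept, concept))

def resolve(context_concept):
    return max(CANDIDATES, key=lambda n: score(n, context_concept))

most_notable = max(CANDIDATES, key=prior)
print(most_notable)      # the prior-only baseline
print(resolve("tree"))   # prior combined with ontology distance
```

With these made-up numbers the prior alone picks U2's Bono, while the ontology term tips the choice to Sonny Bono because "tree" sits two edges from "skiing" and four from "rock_band". Frank Bono stays a long shot despite his closeness to the outdoors, which is the notability preference at work.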

Still, I see the limited success of knowledge resources as an opportunity rather than a source of discouragement. We definitely have not exhausted the well.

Saturday, July 21, 2007

Fiber Optics and Amateur Access

An elderly woman in Sweden recently had a 40Gb/s fiber optic pipe installed in her home. She hardly uses the web but can now download a feature-length film in 2 seconds. Over lunch today I was lamenting the coming death of satellite and cable TV once we all have fiber to the house with those kinds of bandwidths. It came up because my neighbor dropped by for a drink the other night and ended up staying until 1 AM, sucking down my gin and complaining about the 9th Circuit Court of Appeals. He called home at some point and apparently interrupted his wife's enjoyment of The Closer while also missing dinner.

It was the fact that she missed her TV show that struck me. I don't have that problem. I didn't watch much TV beyond the news, Frontline or some random late-night sitcom until recently when we upgraded everything. Now I have a DVR and actually watch some programs (including The Closer) but only because I can comfortably time-shift and pause TV as I see fit, and all in HD where available.

But what happens with full on-demand TV? The satellites will not be de-orbited for some time as they will continue to serve remote areas, but eventually they will go away. Even the notion of networks and channels would dry up over time. Channels are a delivery mechanism for content; in an on-demand universe they are useful only as branding labels. Studios can equally well disintermediate their content and swing deals directly with advertising clearinghouses. This is already happening somewhat in the online video space, but the bandwidth and quality issues remain a stumbling block until that fiber optic pipe arrives.

Now, suddenly, without the channels to filter content choices down to a few hundred options (sheepishly, I have a few hundred channels; George Chadwick's Aphrodite is playing via SIRIUS Symphony Hall through Dish Network right now, blurring the lines between media), we will instead start using other mechanisms to make content choices. There will be individual critic lists, popularity recommendation engines and, most importantly, content cross-advertising to try to attract eyeballs. The amateur will mix with the pro as the technological and artistic means for producing amazing content become increasingly inexpensive.

Friday, July 13, 2007

Framing and Dissonance

Finally, and with little fanfare, I closed on my final report for my most recent grant this evening. The champagne rests in the fridge for the moment. The last two weeks have been, well, consuming. Now I rush headlong toward a second phase.

While writing and experimenting, I was occasionally drawn into blog and editorial discussions, some of which were mildly amusing. I even learned some new things, though not directly from the blogs, I'm afraid. Specifically, the topic of semantic "framing" came up during a cross-Wikipedia excursion in pursuit of a recollection about Newspeak driven by Christopher Hitchens' discussion of cognitive tyranny in a variety of forums. As a biographer of Jefferson and Orwell, Hitchens is uniquely qualified to address the problem of tyranny and fascism.

Semantic framing is the use of distinctive metaphorical terminology that is designed to draw a clarifying distinction from alternatives. It is the opposite of nuance in a way, and relies on positioning issues as risky (when opposed) versus beneficial (when in agreement). Interestingly, framing effects on economic decision making appear to be less effective on some people than others, with the distinguishing mental characteristic related to emotionalism (exposed as increased amygdala activity in fMRI scans).

But the question that occurred to me was whether resistance to framing is an innate property (one that, when we have it, drives us towards more analytical tasks and higher education levels; yes, this rests on my own supposition that higher education levels correspond to greater cognitive moderation) or whether it is itself a learned response, a habit of moderating one's emotional reaction to arguments and information that corresponds to the "liberal" aspects of higher education.