Friday, May 23, 2008

Assault and Ambiguity

It began with a chance encounter. I was walking through a room with a TV on and a news caption was running during a commercial break. The newscaster intoned with provocative seriousness how a woman had followed a man who had sexually assaulted her weeks earlier to his home and he had been arrested. The location was my town and it rang a bell.

Three weeks back my wife and I went to a doctor’s appointment in Lafayette and, as we returned and entered our neighborhood, we were surprised to see a half-dozen police cars in our otherwise perfectly tame patchwork of planned homes and parks. During a walk two hours later there was still a squad car on one side street. A scan of the police blotter turned up the cause: a woman walking with her toddler had been assaulted and groped by a man who ran away when struck in the groin with a sippy cup.

Weeks passed until I overheard the news of the arrest. Good for her! The next phase of datamining impressed me with the thoroughness of the picture. I was able to use the television station archives combined with Google to find the mug shots, the original sketch of the suspect done by police sketch artists, the suspect’s arrest status in the county courts system, the location of the alleged perpetrator’s house, the suspect’s father’s name and place of business, the suspect’s mother’s name and place of business, a previous citation of the suspect for a moving violation (infraction) in a neighboring city, the county records concerning the amount and type of mortgage held on the suspect’s home, and a satellite view of the home as well.

Amusingly, the reporter in the news piece also drove by our house and coincidentally filmed our various vehicles. I could likely have read the license plates from the footage if I had wanted to.

Overall, I had managed to scour out all the corners of ambiguity concerning when, where, who and how, leaving only the strange question of why in fuzziness. Why was this 24-year-old still living at home, jogging at midday and preying on middle-aged women? Why was he living in this neighborhood where even a Megan’s Law offender is fairly hard to find?

But strangely, it was the suspect’s last name that was the key to developing the search picture because the last name was so unusual. Had he been “Jim Smith” or “Joe Sanchez” or “Mike White” it would have been virtually impossible to make as much headway in extinguishing ambiguity. George Miller is quoted as saying something like “There is only one problem in Artificial Intelligence: words have more than one meaning.” (And I can’t resolve the ambiguity of the source of that quote because George Miller is too ambiguous.) This problem is amplified for searching across identities of places and people, or when special identifiers are introduced as placeholders in a single document (this happens quite often in technical literature where an acronym is used locally as a technical shorthand but is ambiguous outside that document or domain). Moving to the level of folksonomies for, say, labeling pictures on the web, we see the problem exacerbated by the natural telegraphic shorthand that any labeling scheme suggests to the user purely by dint of the size of the entry fields.

Clever approaches to trying to apply context to help address these limitations start with statistical co-occurrence-based disambiguation and linkage analysis, and then run all the way through to using complex ontologies to try to infer the best relabeling of the ambiguous entity or concept as a canonical identifier. None of these methods can hope to achieve any level of perfection but a basket of them can enhance the process of information discovery and disambiguation.
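To make the co-occurrence idea concrete, here is a minimal sketch in Python. It assumes each candidate identity comes with a small profile of words that tend to appear near it; the profiles, weights and function names below are hypothetical and hand-built for illustration, not taken from any real system. The mention is assigned to whichever candidate shares the most weighted words with the surrounding text, and it stays unresolved when nothing overlaps.

```python
import re
from collections import Counter

# Hypothetical sense profiles: words assumed to co-occur with each candidate.
# In practice these would be learned from a corpus rather than written by hand.
SENSE_PROFILES = {
    "George Miller (psychologist)": Counter(
        {"psychology": 3, "memory": 3, "language": 2, "cognitive": 2}
    ),
    "George Miller (film director)": Counter(
        {"film": 3, "director": 3, "movie": 2, "cinema": 2}
    ),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def disambiguate(context, profiles=SENSE_PROFILES):
    """Score each candidate by weighted word overlap with the context.

    Returns (best_candidate_or_None, scores). None means the context
    shares no words with any profile, so the mention stays ambiguous.
    """
    words = Counter(tokenize(context))
    scores = {
        sense: sum(weight * words[word] for word, weight in profile.items())
        for sense, profile in profiles.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


sense, scores = disambiguate(
    "George Miller studied short-term memory and language"
)
```

A common surname with no distinctive context words is exactly the case where this returns None, which is the “Jim Smith” problem in miniature.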

Thursday, May 8, 2008

Race and Empathy

But back to ethics. If there is any validity to the idea that modern values have been shaped by increased exposure to images and video of other people, thus removing the barrier of unfamiliarity that allows us to treat others instrumentally, then we should see the vestiges of that unfamiliarity quotient in various social problem areas. Indeed, we do:

American Journal of Psychiatry 165: 560-561, May 2008

Recognizing Each Other and the Effects of Racial Differences

Carl C. Bell, M.D.

In this issue of the Journal, Pinkham et al. report findings on own-race and other-race effects in their research on facial recognition in schizophrenia. They highlight the troubling finding in previous studies that African Americans with schizophrenia had greater impairment in recognizing and remembering emotions in faces, compared with Caucasian schizophrenia patients. The authors point out that these previous studies showed only Caucasian faces as test stimuli—a major procedural flaw that no one had anticipated. Pinkham et al. designed a better study in which test subjects—both Caucasians and African Americans with schizophrenia as well as comparison subjects of both races—were shown both Caucasian and African American faces as test stimuli. The authors found that the capacity to recognize and remember emotions in faces is no different between Caucasians and African Americans, whether they have schizophrenia or not. But memory for the faces and discrimination of emotions in them are higher when the study subject is of the same race as the person expressing the facial emotion.
Some interesting follow-on questions for investigation: Would exposure to multi-ethnic faces early enough reduce the emotion discrimination deficit? Is there a learning window? Are there differences between subgroups (more integrated communities show lower deficits)? What is the impact on eyewitness subject identification across racial lines?

Saturday, April 26, 2008

Ofamind and Twine

I was flattered to learn about twine.com the other day. Twine is the long-simmering product of Radar Networks and is remarkably similar to Ofamind in many ways. A bit of digging shows that they used Wikipedia as a basis for their tagging engine. Some reviews have been positive while others less so.

Still, I consider it a vindication of sorts, though I think they will have a hard road business-wise if they believe twine.com has a massive audience appeal. I will just mention that I interviewed with a startup called Backflip in 2000 that went nowhere as well. The business proposition for these kinds of technologies is when they provide real business value to specific knowledge worker communities--when they serve vertical domains to address specific problems.

A minor footnote: I chatted with Radar Networks' Nova Spivack several times about collaboration several years back and suspect that there was a bit of, ehhhh, influence of Ofamind on Twine's development (based also on watching my weblogs). I'm just amused that I beat them to the plate with one consultant and three other ongoing engagements simultaneously. Are there diminishing returns beyond a few agile minds in what Bessemer VCs refer to as our new capital-efficient web marketplace?

Thursday, April 24, 2008

Folksontamasticons and Ambiguity

Folks might not be all bad, though. For instance, in my Ofamind technology, this blog and social bookmarking sites like del.icio.us, the tags that are attached to documents serve to help people find and retrieve information. Tagging is a counterpoint to the idea of structured ontologies and metadata because it builds from the ground up rather than from the top down. The term coined for these tagging schemes is “folksonomy.”

But are folksonomies useful and consistent? Some studies suggest they are useful under some circumstances. For instance, querying across the titles and descriptions using tag keywords on del.icio.us bookmarks results in precision and recall of only about 50%. In other words, the tags are not also in the texts around 50% of the time, and so provide an additional channel of information for retrieval. People appear to think differently about tags than they do about titles and descriptions.
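A rough way to check this kind of claim on your own bookmarks is to count how many tags never appear in the title or description they are attached to. The sketch below is only an illustration: the bookmark records are made up, the field names are assumptions rather than the del.icio.us data format, and this is a simpler measure than the precision and recall figures the studies report.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def tag_novelty(bookmarks):
    """Fraction of tags that do not occur in the title or description.

    Each bookmark is assumed to be a dict with 'title', 'description'
    and 'tags' keys (a hypothetical layout for this example).
    """
    total = 0
    novel = 0
    for bookmark in bookmarks:
        text_words = tokenize(bookmark["title"] + " " + bookmark["description"])
        for tag in bookmark["tags"]:
            total += 1
            if tag.lower() not in text_words:
                novel += 1
    return novel / total if total else 0.0


# Made-up sample data, not real bookmarks.
sample_bookmarks = [
    {
        "title": "Python tutorial",
        "description": "Learn python basics",
        "tags": ["python", "programming"],
    },
    {
        "title": "Camp sites in Colorado",
        "description": "Outdoor guide",
        "tags": ["camp", "travel"],
    },
]

novelty = tag_novelty(sample_bookmarks)
```

A value near 0.5 would match the pattern described above: about half of the tags carry information that the surrounding text does not.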

In terms of consistency, however, a very large number of tags are used only once or are used in differing and inconsistent ways that indicate ambiguity over multiple user subcommunities. Examples might be “architecture” used to refer to computer architecture and building design, or “camp” referring to drama or outdoor recreation.

A couple of interesting questions emerge about how to refine the power of folksonomies.

For instance, can the title and description (or full blog content) be used to automatically suggest tags that are based on other tagging schemes? The Ofamind system partially does this by automatically categorizing web content among your “views” or collections as you surf. It does a fair job, too, for a great deal of content. This can be seen as a personalized metadata tagging filter, since the view association to content is essentially a categorical tag.
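As a toy illustration of the tag suggestion question, the sketch below counts which words have appeared alongside which tags in already-tagged content, then proposes the tags whose words show up in a new title or description. This is a simple co-occurrence counter with hypothetical names and made-up training examples; it is not how Ofamind categorizes content.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def train(tagged_docs):
    """Count, for every word, how often it appears in documents with each tag."""
    word_tag = defaultdict(Counter)
    for text, tags in tagged_docs:
        for word in tokenize(text):
            for tag in tags:
                word_tag[word][tag] += 1
    return word_tag


def suggest_tags(text, word_tag, top_n=3):
    """Suggest up to top_n tags whose training words overlap with the text."""
    scores = Counter()
    for word in tokenize(text):
        # Words never seen in training contribute nothing.
        scores.update(word_tag.get(word, {}))
    return [tag for tag, _ in scores.most_common(top_n)]


# Made-up training examples with hypothetical tags.
tagged_docs = [
    ("Designing CPU cache hierarchies", ["computer-architecture"]),
    ("Gothic cathedral building design", ["building-design"]),
    ("Pipeline stages in a modern CPU", ["computer-architecture"]),
]

model = train(tagged_docs)
suggestions = suggest_tags("How a CPU pipeline works", model, top_n=1)
```

A real system would at least drop stop words and weight rare words more heavily, but even this crude version shows how the text around a bookmark can point toward a tag from an existing scheme.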

Similarly, business taxonomies, controlled vocabularies, full ontologies and other mechanisms could be used at authoring time to try to suggest or overlay more consistent tags onto web content, enhancing searchability and even supporting reasoning about content. For Ofamind, a subproblem that we are currently working on is how to disambiguate extracted people, places and organizations in order to produce high-quality metadata using a combination of human tagging and automatic methods.

Then the folksonomy becomes more of a folksontamasticon, combining folksonomy, ontology and onomasticon in a rare new tag.

Thursday, April 3, 2008

Folk and Psychology

Is the reach of reason limited by our nature? That question came to me during a rather disagreeable conversation recently. The woman I was conversing with spoke almost entirely in vignettes built around folk sociology and psychology.

“Girls always grow up to marry someone like their father.” “You just don’t understand what it’s like to grow up as a redhead.” “My father hates me because I had medical problems as a child.”

Now, I try to be sympathetic to even bad self-analysis insofar as I only have to encounter it in small doses. What good can come from challenging people under those circumstances? But I increasingly wondered as the conversation continued whether there are limits to our natural capacity to overcome the patterns of folk psychology we use to attach meaning and explanation to our lives. I tried lightly challenging one of the statements and suggested that there was little or no evidence to support a given claim. Her response was that she had just given me evidence: a single example. There was some hemming and hawing about how she acknowledged that that wasn’t scientific evidence, but it was enough for her.

Isabel Allende was recently interviewed and she confessed that her entire writing method and inspiration emerged from trying to construct narratives and folk psychologies to explain her characters. Everything had an explanation and there was little difference for her between magical, religious and everyday occurrences—they all had elaborate explanatory narratives that involved mystical forces, and frameworks for punishment, reward and retribution. Magical Realist at core, but also reflecting the need for fiction to tie together into a structural form that is without the weak sense of doubt that pervades our everyday lives; people are complex and do things for complex and sometimes unexplainable reasons best regarded as tendencies.

But is there any sense in which people can change their cognitive styles? Somewhat, I think, but there are also other factors like dopamine and its relationship to magical thinking that are likely more resistant to active attempts at change. Still, the goal of liberal education has always had at its core the notion of refining the mind to enhance our ability to think and process information. It’s the best tool we have.

Tuesday, March 25, 2008

Flaws and Adams

Drawn in as I am by HBO's John Adams miniseries, and by the portrayal of Adams as a flawed but principled man, I was equally impressed by the quotes that put him in context as a man of his times conflicted with the biases of his era (at The New Republic):
In his summary defense in the Boston Massacre trial, he claimed that the British soldiers had every reason to be afraid of the crowd, "a motley rabble of saucy boys, Negroes and mulattoes, Irish teagues [pigs], and outlandish Jack Tars."
And here:
In fact, one of the causes of the revolution was the Quebec Act, which gave religious protections to Catholics in Canada. This infuriated the colonists. "Does not your blood run cold to think that an English Parliament should pass an Act for the establishment of arbitrary power and Popery in such an extensive country?" wrote Alexander Hamilton. "Your lives, your property, your religion are all at stake." Sam Adams told a group of Mohawk Indians that the law would mean that "some of your children may be induced instead of worshipping the only true God, to pay his dues to images made with their own hands." Fortunately, George Washington realized that it would undermine the colonists' efforts to win support from Canada and France if they were perceived as being anti-Catholic, so he banned the "monstrous" practice of burning effigies of the pope on "Pope Day."

Thursday, March 6, 2008

Social Cohesion and Freedom

Inchoate is perhaps the best word to describe my sustained interest in the notion that somehow we can characterize the complexity of interactions using a standardized grammar or toolkit. Sometimes small pebbles of coherence emerge from this interest, like work on characterizing the complexity of grammars for generating neural networks or interesting music production systems (after great effort, moderate diversity and connectedness are, not surprisingly, requirements for both of them!).

Still, I remain a student of the general theme and so am intrigued when people like David Sloan Wilson characterize the role of religion in social cohesion as providing unique evolutionary advantage at the group level (Darwin’s Cathedral: Evolution, Religion and the Nature of Society). While arguing about group dynamics in evolutionary circles is somewhat heretical, Wilson paints a picture that once again uses language like diversity, prediction, connectivity and social support.

From some cross-over of libertarian and paleo-conservative thinking comes another intriguing data point from The American Conservative. Yes, that’s Pat Buchanan’s magazine. Pat regularly brings up the notion of “balkanization” both in reference to the Balkans as an example of a geopolitical mess, and as a broader metaphor for the problem of diversity in modern societies, so it is not surprising to see his magazine latching onto Robert Putnam’s discourse on changing American civic involvement (Bowling Alone) and related research on the potential drivers for civic strife.

The TAC article by Steve Sailer is somewhat fragmented, jumping around through some sloppy generalizations about ethnic identities (Hispanics and Italians don’t build large organizations because they only trust extended families, for instance), and dipping into wag-the-dog-style political fervors driven by a common enemy. It ends with some minor discussion of how both religion and mandates can improve cooperation between people, with the latter example being the Army limiting career advancement among officers who discriminate.

Now I suspect Sailer and Buchanan consider this grist for a policy mill that aims to reduce immigration to the US (or perhaps simply be more selective about it), but in some ways it works against their more cherished cause of small, limited government because, given an uncooperative, pluralistic and diverse population, one remaining channel to achieve grander visions is through government action. Government and law become the conduits for coordination by transforming distrust in others into (perhaps grudging) acceptance of institutions. Even Sailer admits as much in noting that:

In America, you don’t need to belong to a family-based mafia for protection because the state will enforce your contracts with some degree of equality before the law.

So I suspect that we have achieved a stalemate of sorts, with the benefits of diversity (I’ll just start with my restaurant options today and leave it at that…) balanced against less social cohesion, but perhaps propped up by institutions that are trusted enough that we are not always suspicious of corrupt abuses of power. That seems like a gentle enough substitution for a civil religion to me, with a more subtle organizing physics that preserves the freedom to think outside the confines of any monolithic pattern of ideation.