Thursday, April 24, 2008

Folksontamasticons and Ambiguity

Folks might not be all bad, though. For instance, in my Ofamind technology, this blog and social bookmarking sites like del.icio.us, the tags that are attached to documents serve to help people find and retrieve information. Tagging is a counterpoint to the idea of structured ontologies and metadata because it builds from the ground up rather than from the top down. The term coined for these tagging schemes is “folksonomy.”

But are folksonomies useful and consistent? Some studies suggest they are useful under some circumstances. For instance, querying the titles and descriptions of del.icio.us bookmarks with their own tag keywords produces a match only about 50% of the time. In other words, roughly half of the tags do not appear in the surrounding text, and so provide an additional channel of information for retrieval. People appear to think differently about tags than they do about titles and descriptions.
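To make the overlap idea concrete, here is a minimal sketch of how one might measure the share of tags that do not appear in a bookmark's title or description. The bookmark records, field names and the `tag_novelty` function are all made up for illustration; this is not the method used in the studies mentioned above, and the sample data is not real del.icio.us data.

```python
import re

def tokenize(text):
    """Lowercase a text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def tag_novelty(bookmarks):
    """Fraction of tags that do NOT occur in the title or description."""
    total = 0
    novel = 0
    for bookmark in bookmarks:
        words = tokenize(bookmark["title"] + " " + bookmark["description"])
        for tag in bookmark["tags"]:
            total += 1
            if tag.lower() not in words:
                novel += 1
    return novel / total if total else 0.0

# Hypothetical sample records, invented for this example.
bookmarks = [
    {"title": "Python tutorial",
     "description": "Learn the basics of Python",
     "tags": ["python", "programming"]},
    {"title": "Gothic cathedrals",
     "description": "A photo tour of architecture in France",
     "tags": ["architecture", "travel"]},
]

print(tag_novelty(bookmarks))
```

A high value means the tags are carrying information that a plain text search over titles and descriptions would miss.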

In terms of consistency, however, a very large number of tags are used only once, or are used in differing and inconsistent ways that indicate ambiguity across multiple user subcommunities. Examples might be “architecture” used to refer to both computer architecture and building design, or “camp” referring to either drama or outdoor recreation.
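Both symptoms are easy to look for in a pile of tagged posts. The sketch below is only a rough heuristic under my own assumptions: it lists tags that occur exactly once, and it flags a tag as possibly ambiguous when two of its uses share no other tags. The function names and the sample posts are hypothetical.

```python
from collections import Counter

def singleton_tags(posts):
    """Tags that appear on exactly one post."""
    counts = Counter(tag for post in posts for tag in set(post))
    return sorted(tag for tag, n in counts.items() if n == 1)

def disjoint_contexts(posts, tag):
    """Pairs of co-tag sets for `tag` that have nothing in common.

    A non-empty result hints that the tag is used by different
    subcommunities with different meanings.
    """
    contexts = [set(post) - {tag} for post in posts if tag in post]
    return [
        (a, b)
        for i, a in enumerate(contexts)
        for b in contexts[i + 1:]
        if a and b and not (a & b)
    ]

# Hypothetical posts, each represented by its list of tags.
posts = [
    ["architecture", "cpu", "hardware"],
    ["architecture", "buildings", "design"],
    ["camp", "hiking"],
]

print(singleton_tags(posts))
print(disjoint_contexts(posts, "architecture"))
```

Disjoint co-tags are a crude signal, since two users can mean the same thing and still pick different companion tags, so this would need human review before anyone acted on it.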

A couple of interesting questions emerge about how to refine the power of folksonomies.

For instance, can the title and description (or full blog content) be used to automatically suggest tags that are based on other tagging schemes? The Ofamind system partially does this by automatically categorizing web content among your “views” or collections as you surf. It does a fair job, too, for a great deal of content. This can be seen as a personalized metadata tagging filter, since the view association to content is essentially a categorical tag.
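One simple way to picture tag suggestion is a word-overlap scorer: build a word profile for each tag from content that already carries it, then rank tags by how strongly a new title or description overlaps each profile. The sketch below assumes exactly that and nothing more. It is not how Ofamind categorizes content, and `build_profiles`, `suggest_tags` and the sample documents are names and data I made up for the example.

```python
import re
from collections import Counter, defaultdict

def words(text):
    """Split a text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def build_profiles(tagged_docs):
    """Map each tag to a count of the words seen in documents carrying it."""
    profiles = defaultdict(Counter)
    for text, tags in tagged_docs:
        for tag in tags:
            profiles[tag].update(words(text))
    return profiles

def suggest_tags(text, profiles, top_n=2):
    """Rank tags by word overlap with `text`; drop tags with no overlap."""
    tokens = set(words(text))
    scores = {
        tag: sum(profile[w] for w in tokens)
        for tag, profile in profiles.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked[:top_n] if score > 0]

# Hypothetical previously tagged content.
tagged_docs = [
    ("Pipelining and cache design in modern processors", ["hardware"]),
    ("Tent sites and hiking trails near the lake", ["outdoors"]),
]

profiles = build_profiles(tagged_docs)
print(suggest_tags("Best hiking trails for a weekend", profiles))
```

A real system would at least remove stop words and weight rare terms more heavily, but the shape of the problem is the same: existing tags act as categories, and new text is matched against them.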

Similarly, business taxonomies, controlled vocabularies, full ontologies and other mechanisms could be used at authoring time to try to suggest or overlay more consistent tags onto web content, enhancing searchability and even supporting reasoning about content. For Ofamind, a subproblem that we are currently working on is how to disambiguate extracted people, places and organizations in order to produce high-quality metadata using a combination of human tagging and automatic methods.
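As a small illustration of overlaying a controlled vocabulary at authoring time, the sketch below maps free-form tags onto canonical terms and reports which tags could not be mapped. The vocabulary, its variants and the function names are hypothetical; a real taxonomy or ontology would be far richer than a lookup table.

```python
def build_lookup(vocabulary):
    """Index canonical terms and their variants by lowercase spelling."""
    lookup = {}
    for canonical, variants in vocabulary.items():
        lookup[canonical.lower()] = canonical
        for variant in variants:
            lookup[variant.lower()] = canonical
    return lookup

def overlay(tags, lookup):
    """Split tags into canonical terms and tags with no known mapping."""
    mapped, unmapped = [], []
    for tag in tags:
        key = tag.strip().lower()
        if key in lookup:
            if lookup[key] not in mapped:
                mapped.append(lookup[key])
        else:
            unmapped.append(tag)
    return mapped, unmapped

# Hypothetical controlled vocabulary: canonical term -> accepted variants.
vocabulary = {
    "Computer architecture": ["comparch", "cpu-architecture"],
    "Building design": ["buildings", "architectural-design"],
}

lookup = build_lookup(vocabulary)
mapped, unmapped = overlay(["CompArch", "buildings", "architecture"], lookup)
print(mapped)
print(unmapped)
```

Note that the bare tag "architecture" stays unmapped here. That is deliberate: it is exactly the kind of ambiguous tag that needs a human, or some context, to resolve.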

Then the folksonomy becomes more of a folksontamasticon, combining folksonomy, ontology and onomasticon in a rare new tag.
