Wednesday, March 28, 2007

Districts and Dali


So I'm in Washington DC for the entire week, drinking in motherhood and apple pie on topics in valuation, product positioning and the supreme importance of executive teams. I played hooky today, though, and bailed on the afternoon session of the MIT Enterprise Forum.

A quick Metro ride to the Smithsonian exit and I can safely admit to having museum fatigue and seriously painful feet.

Still, the cherry blossoms are in bloom and the weather is just spectacular, so how can I argue with my fate? I've also had some excellent meals: one-eyed tuna steak on Monday, hand-made pizza with jalapenos and anchovies last night, and Thai duck rolls for lunch.

Tomorrow is another long day of meetings, lasting into the late evening, followed by a quick exit out of Reagan National on Friday.

I should be just over my jetlag as I head back to California.

I love it when business crosses paths with science. It always seems so hopeful to me.

Thursday, March 22, 2007

Sturm und Strauss

I was thinking about Jon Mendelson's post on the use of simple, divisive political messaging and the lack of nuance among some of the TV news providers. FOX News and Rush Limbaugh are the most obvious and odious, but I've seen equally stark polemics from the far left of the spectrum where the primary currency seems to be paranoid conspiracy theories and vacuous relativism. The why and how of that messaging is a fascinating topic to me, especially since the tone of those broadcasts and writings either reflects contemporary thinking or drives it.

Stephen Colbert does a brilliant job lampooning this kind of ham-fisted Manichaeanism when he asks questions like: "George W. Bush: great president or the greatest president?" Humor is a powerful force to expose the lunacy of self-fulfilling categories but mostly appeals to the politically cynical. If one only sees politics as laughably mad, then one never embraces politics and just mocks from the sidelines. Still, it adds warmth and color to the Sturm und Drang of modern American politics.

But the numbers suggest that denaturing the influence of newsertainment (just made that up) will not be easy. I constantly chat with well-meaning and nice folks who have whole quivers of talking points derived from FOX: "liberals hate America", "there is no separation of church and state", "the homosexual agenda", "culture of life", etc. Newsertainment thrives because it is mostly entertainment by eloquent and engaging personalities who package ideas as starkly as possible to make it all fun and easy to swallow.

At the same time, there may be a slightly more devilish consequence that fulfills the neoconservative Straussian notion that sometimes the body politic needs to be told noble lies for its own good. Most neocons actually deny that such elitist and Platonic notions color their agendas, but do assert a kind of triumphalism associated with freedom and democracy that is so simplistic as to effectively be an untruth. I would have loved American foreign policy to have found a way to oppose communism while not supporting dictatorial regimes back in the day, but we have the history we created and must live with it and the dark background it provides to all modern foreign policy. Wrapping foreign adventurism in triumphal declarations doesn't change the realities on the ground.

It's odd: Plato and Strauss just reminded me of Ayn Rand in their belief that categorical definitions should be shaped by hopes--celebratory--rather than mired in naturalism. Did she call it Romantic Realism? All three have a tendency towards elitism as well, although Objectivists may quibble over that in Randian apologetics. But Rand at least would have been opposed to lies and simple dichotomies, I think, and much more inclined to challenge the common man and woman to do better at understanding the depth of issues when forming their opinions. Anything else would have stunk of mental slavery.

But even if the impact of 24/7 newsertainment turns nice folks into partial zombies, it is also pervasive and invasive enough that it spreads moral empathy more rapidly than at any point in history. Abu Ghraib, murders and rapes by soldiers, torture debates, civilian casualties, the fates of dogs in Iraq are all seen, projected and packaged to appeal to our moral outrage. When we see horror, we agree that our men and women should not be doing it. We have a need to see ourselves as civilized and fair in the face of horror, and voices like Ann Coulter become increasingly marginalized as they struggle to convince us otherwise. And that aspect of newsertainment doesn't bother me so much, though I still prefer MacNeil-Lehrer.

Monday, March 19, 2007

Sudoku and Universal Selection

Anticipating the future and managing risk is an essential aspect of human endeavors and is a core activity for biological organisms. Internally, the same dilemma arises with immune systems trying to manage the constant flux of parasitic organisms that strive to use our energy for personal gain. Because the threats are constant and changing, an organism needs a universal toolkit for solving problems in both the external and internal environments.

I had never done a Sudoku puzzle until about two years ago, and only recently tried them in earnest. They are an interesting mental exercise that combines well with other activities like watching TV (an activity I also recently succumbed to after getting an HD DVR). The Sudoku strategies I have developed involve several levels of activity. First, an almost gestalt-like visual scan for same-number distributions in vertical and horizontal patterns. Next, each of these is checked with greater attention to look for possible fill-ins of missing information. Vertical and horizontal fill-ins are a next step, along with block fill-ins.

But then something interesting happens with hard or very hard Sudoku: they become unsolvable without guessing. Guessing is not strictly needed, since one could just go through every possible combination, looking for inconsistencies down the search tree of available patterns, but only computers have the ability to do that effectively. Instead, I look for positions that might yield progress and have only two disputed values (three sometimes for mega-hard puzzles), and guess. Then I try to carry through the implications of that guess and check for failures. I write the guesses in the upper right-hand corner of the boxes and circle the initial guess. I then carry forward the implied results, also writing them in the upper right of the boxes. Sometimes this goes two deep, with the need for a second round of guessing, which I write in the upper left corner. If an inconsistency emerges, I can erase the set of guesses back to the initial two and rearrange and try again. This is essentially depth-first search using a semi-random initial selection. The actual fill-ins are truly random, but the choice of location to try is based on vague ideas of coverage (how many rows, columns or blocks can I complete or make progress on by trying these values?).
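The guess-and-carry-forward procedure above is, in the end, depth-first search with backtracking. Here's a minimal sketch in Python; the grid encoding (a 9x9 list of lists with 0 for empty cells) and the cell-selection heuristic, which loosely mirrors the "only two disputed positions" idea by preferring the most constrained cell, are my own choices, not anything from the pencil-and-paper method:

```python
def candidates(grid, r, c):
    """Digits that can legally go in cell (r, c) of a 9x9 grid (0 = empty)."""
    row = set(grid[r])
    col = {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    block = {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - row - col - block

def solve(grid):
    """Depth-first search: pick the most constrained empty cell,
    pencil in a guess, recurse, and erase the guess on failure."""
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return True  # no empty cells left: solved
    # Prefer cells with the fewest candidates -- two, when possible.
    r, c = min(empties, key=lambda rc: len(candidates(grid, *rc)))
    for v in candidates(grid, r, c):
        grid[r][c] = v       # write in the guess
        if solve(grid):
            return True
        grid[r][c] = 0       # inconsistency downstream: erase and try again
    return False

# A classic example puzzle (0 marks an empty cell):
puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solve(puzzle)
```

The erase-on-failure line is exactly the pencil-and-eraser step, just without the corner notation.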

Now, the ability to complete Sudoku doesn't seem to have any particular survival value relative to primitive human survival. It doesn't help throw rocks at game or understand the lay of the land. So it must be a side-effect of a generalized ability to solve problems that utilizes search, elimination and selectionist principles. The ability to think about the implications of any move only goes about two configurations deep for me without the extended notation of writing them down. I'm sure others can do better, but that doesn't change the basic mental activities of randomness combined with search.

A variation on this theme struck me years back when reading Gary Cziko's excellent Without Miracles: Universal Selection Theory and the Second Darwinian Revolution. I actually used a similar type of word problem to argue that universal selection ideas added very little to the explanation of the problem solution because the random component could be replaced with brute force search without any change in the efficiency of the search strategy (other than the gestalt initial choice). I think that misses the point, however, in that I really do make a choice randomly about the assignment, whereas a computer search algorithm would make that choice according to whatever programming it had been given (assign the numbers to the squares lowest to highest; assign the numbers randomly; assign the numbers according to a reading of the I Ching). My programming, if you will, has me choose randomness as the preferred strategy because I see the other strategies as irrelevant to solving the problem and therefore use the easiest method of guessing. When uncertain, guess the answer! I seem to remember this strategy in differential equations: look at the form of the problem, guess the answer, work out the constant.

Wednesday, March 14, 2007

Parker and Principles

My local paper recommended a reading of Parker v. District of Columbia to get an understanding of this controversial decision that will likely head to the Supreme Court. Of course, I took their advice.

I have argued previously in an extended editorial to the same paper that American morality is not derived from religious sources but is driven by the interplay of largely secular notions of principles and laws. The former are partly indescribable and governed by changing semantics over time: freedom, liberty, fairness. All of these are mostly unrelated to religious convictions except in very strained cherry picking of religious texts. The Abrahamic religions are notorious in their historically xenophobic tribalism, for instance, though Confucianism provides a more direct effort to provide a philosophical construction of government and governance. So exegesis, omission and liberal readings become the critical analytical tools for forming justifications for morality when using ancient religious texts as a basis.

But language is imperfect, and something seemingly as simple as determining the meaning of the 2nd Amendment takes on some of the same character, though with a remarkably analytical focus. Parker v. DC is one of those cases. Here's the text:

"A well regulated Militia, being necessary to the security of a free State, the right of the people to keep and bear Arms, shall not be infringed."
In Parker v. DC, the meanings of Militia, State, and Arms are all debated, invoking common parlance and historical context. DC even claims (and is supported by the dissent) that DC is not a "State" under the terms of the Constitution and that the rights and liberties guaranteed therein don't really apply to territories or special federal districts. Another aspect of the analysis of both positions is whether the prefatory clause "A well regulated Militia" constrains the remaining language. I'm amused by both sides on this, most especially because I once argued (derived from Justice Stevens, I think), that in the 1st Amendment, the establishment clause takes precedence over the free exercise clause because it arrives earlier in the sentence.

Minutiae is minutiae no matter which side you come down on, but it is sad that the Bill of Rights is hostage to the embalmed critical analysis of highly ambiguous language in short pithy statements. I tend to support readings of the Constitution that provide for the freedom principle first and foremost, but that then support the fairness principle in that there are no additional rights extended by implication. So I agree with the majority opinion that the individual's right to own firearms was intended (or ought to have been intended in any case) by the Framers. But that right can be regulated under some circumstances, with the decision criteria built on a fairness principle (does the regulation provide for the fair access to said firearms for the historically-motivated purposes of self-defense, hunting and providing a militia?) This is identical to my position on 1st Amendment religion tests: freedom of exercise is not encroached by legislation that fairly (was not formulated to restrict a religious group from their practice) serves a criminal legal function (protecting the right of Laotians to club dogs to death for good luck does not override our right to legislate against animal cruelty). This latter position actually flies in the face of recent Supreme Court 1st Amendment decisions, but they were, in that case, operating not just on the issue of the 1st, but on additional legislation that modified the interpretation of the 1st to, in my opinion, distort the fairness principle.

But the broader question for me is whether it is possible to reduce the interpretive conundrums that have resulted in these snarled roots? Certainly it would have better served our present purposes to have had the 2nd written like:

"An individual has the right to keep and use light firearms like rifles and pistols, for hunting and to protect themselves, their homes and property from violation and tyranny. Congress and the States shall have the power to create and regulate militias, to regulate hunting, and shall have the power to limit access to, possession of and use of other dangerous weapons and devices that constitute a threat to the common good."
We can't anticipate how a reading of this will change in another 200 years, but we could be better suited by it for the next bit--say, 50 years. Anyone up for an Amendment?

Monday, March 12, 2007

Categories and Biases

Hot on the heels of the "Darwin's God" extended article (hrmmph, only available via Times Select now), New York Times scores again with a great article on neuroimaging, ethics, law and punishment (here).

I was struck by the discussion of the Implicit Association Test (IAT) being used to try to tease out how racism operates psychologically, especially when combined with startle response tests. The findings show that whites shown unfamiliar black faces have stronger fight-or-flight responses if they also show evidence of bias on the IAT. The bias disappears for familiar black faces (MLK Jr., etc.), however. These biases work the other way, too, with African Americans showing bias towards white faces. The focus of the article is on whether that racial bias may be used in the future to change the way juries are selected, but I think it points more towards an understanding of how our brains are pre-wired to form generalizations and stereotypes as placeholders in our mental models. For white kids from the suburbs who have or have had limited contact with African Americans, for instance, there is a default xenophobia and nervous response to black faces.

An anecdote from the summer of 1998 when I was in Hiroshima, Japan: I was there for almost a month working with a professor at a local university on a machine translation project. Almost every morning at dawn, I would go running through the calm streets, down along the canals, then back to the city center to my hotel. I found very quickly that in the canal area, elders would gather for Tai Chi as the sun came up and, when a 6'2" Anglo man came running around a corner, would panic, stop in their tracks and sometimes emit little shrieks of fear. I learned to stick to the main drags and avoid those exercise zones.

Faces are very significant again, because face detection and recognition play such a critical role, but the article does show that there is plasticity in the xenophobia--the categories aren't fixed, nor is the level of response to them. If Denzel Washington doesn't elicit a strong response, it means that the category formation apparatus is biological, but the categories are learned through exposure or lack of exposure. If we are hard-wired to identify and fear groups of "others", it is also possible to displace or override that categorization mechanism by not finding any groups to assign to those categories.

The acquisition of irregular verbs during language learning has some similar properties. Irregular verbs are like run, running, ran. The past tense is not formed by the common suffix rule "add -ed" in this case, making it irregular. For children, irregular verb learning has a startling "U" shaped curve. On first exposure to an irregular verb, they just rote memorize the past tense and then use it correctly for a while. Then, over time, their performance tends to degrade as they "discover" rules like "-ed" and start to overapply them. Finally, they learn the limits of productive morphological rules and become very capable in the combination of irregular forms and regular rules. Thus, a "U" shaped curve when measuring the correct use of verbs, with initial high quality, followed by a period of incorrect usages, then finally correct use once again.

If category formation for group psychology, biases and xenophobia is largely similar to the example of verb acquisition, we should see a reduction in those biases over time with exposure to positive or non-threatening exemplars. This may be the lasting legacy of school integration and busing--more so than overcoming deficits in the capabilities of minority schools--in reducing the immediate and fearful association of skin color with otherness. Similar effects apply to the social acceptance of women's roles and homosexuality, both of which have become gradually more accepted over time as exemplars have become increasingly common due to television.

Thursday, March 8, 2007

Selective Memory and Chunking

A problem that I have worked on for around 15 years involves trying to tease apart how to create optimal mental models. Now, when I say optimal I mean models that are in some sense the best at predicting the class of a signal given a learned history of signals. The best in some sense may not, however, be perfect, because there may not have been enough information to formulate a perfect model that is infallible, especially when some of the signals may have been underrepresented in the learned history.

My interest in this topic arises from my interest in evolutionary optimization. The generalized intelligence of human beings is both an incremental improvement over other animals and a remarkably new phenomenon, especially when combined with language and cultural transmission of lessons learned. I assume our model-making capabilities are highly optimized to use information effectively, although there are clear biases that influence the default ways in which our mental models grow and change. Examples of these biases arise all the time, like in our highly productive attempts to find human faces and forms in shadows. This illustrates a strong bias that likely had evolutionary significance in helping us escape threats; it's better to run away or be prepared for a fight than to rule out a few false positives on the threat radar. The worst you experience is mild embarrassment compared with death.

The problem of optimal learning of models is nevertheless interesting because it points a way towards understanding how information can be used for prediction even if biases are mixed in due to extrinsic factors like threats, and the topic I have studied involves language learning. Here's the interesting tidbit that led me to investigate this topic: if you create a sequence of letters that look like a cryptoquip but are created by a simple set of probabilistic equations, people can study them and then tell whether other, similar sequences came from the same equations or not. In other words, we can learn arbitrary patterns of symbols that are essentially meaningless. How is this interesting? Well, sound sequences for young children have only a bit more context than that, a bit of pointing and smiling and frowning and what not. So it is safe to assume that a general capability for learning patterns is likely related to the same machinery that children use during language acquisition. Moreover, it seems likely that it is related to general pattern learning processes.
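To make the setup concrete, here is a toy version of that kind of experiment in Python. A small transition table plays the role of the "simple set of probabilistic equations," generating letter sequences, and a log-likelihood score stands in for the learner's judgment of whether a new sequence came from the same source. The alphabet and probabilities are illustrative inventions, not from any actual study:

```python
import math
import random

# A toy "grammar": transition probabilities over a small alphabet.
# These particular values are made up for illustration.
TRANSITIONS = {
    "X": [("V", 0.7), ("Q", 0.3)],
    "V": [("X", 0.5), ("T", 0.5)],
    "Q": [("T", 0.8), ("X", 0.2)],
    "T": [("Q", 0.6), ("V", 0.4)],
}

def generate(length, start="X"):
    """Emit a letter sequence by walking the transition table."""
    seq = [start]
    while len(seq) < length:
        symbols, weights = zip(*TRANSITIONS[seq[-1]])
        seq.append(random.choices(symbols, weights=weights)[0])
    return "".join(seq)

def log_likelihood(seq):
    """Score how plausibly a sequence came from this grammar;
    transitions the grammar never makes get a near-zero probability."""
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        p = dict(TRANSITIONS.get(a, [])).get(b, 1e-9)
        total += math.log(p)
    return total
```

A sequence the grammar could have produced (like "XVTQ") scores far higher than one full of impossible transitions (like "XXXX"), which is roughly the same-source-or-not judgment the human subjects were making.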

In the original experiment, the question that arose was whether the people were learning based only on repetitious pairs of symbols or whether larger "chunks" of symbols figured into their reasoning. The experiment was repeated with a careful balancing of "pairwise frequencies" to demonstrate that larger chunks must be involved in the learning process. So we appear to create a mental model that involves inferring that some chunks are useful but others are not, and then comparing those chunks to new sequences to see whether the new sequence is a good match or not. But which chunks do we choose? And is there an optimal way to acquire chunks such that we predict well?

For the second question, we can see the bounds of the problem if we acquired every chunk of every length and held them all in mind to compare with the new sequence. We could then count the number of chunk matches and call it a match if we had enough. But what threshold justifies the judgment? And, besides, we don't have enough short-term memory to keep all those chunks in mind, since these are arbitrary symbol sequences like XVMTMVXQT... So maybe we look for large chunks that have similar patterns in different positions, or symmetries, and hold only those chunks in our mind? That seems more likely, but we know that it can't be merely due to pairwise symbol frequencies, since people can still perform the task when those frequencies are balanced.

My solution, which was partially derivative of another model used for phrase acquisition, is to create a "lexicon" of symbol pairs initially, then create a new level by combining pairs together (or pairs with singletons), but then throwing away combinations that do not increase the predictive capability of the system as a whole. This criterion, known variously as Minimum Description Length or Minimum Message Length, considers the expense of adding elements to the lexicon as trading off with the predictive value of the element. A combined element may be slightly useful, but it also may cost too much to continue to represent it in the lexicon, and therefore should be thrown away. Using this approach, some notion of optimal prediction can be achieved given a finite set of learning examples.
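A stripped-down sketch of that lexicon-growing idea in Python: repeatedly merge the most frequent adjacent pair into a candidate chunk, and keep the chunk only if the total description length (data cost plus lexicon cost) goes down. The cost model here, an entropy code for the data and a flat 8 bits per lexicon character, is my own simplification for illustration, not the model from the original work:

```python
import math
from collections import Counter

def description_length(tokens, lexicon):
    """Total cost in bits: each token coded by its frequency (entropy code),
    plus an assumed flat storage cost per character of lexicon."""
    counts = Counter(tokens)
    n = len(tokens)
    data_bits = -sum(c * math.log2(c / n) for c in counts.values())
    lexicon_bits = 8.0 * sum(len(chunk) for chunk in lexicon)
    return data_bits + lexicon_bits

def grow_lexicon(text, max_chunks=10):
    """Greedy chunking: merge the most frequent adjacent pair into a new
    lexicon entry, keeping it only if total description length shrinks."""
    tokens = list(text)
    lexicon = sorted(set(tokens))
    for _ in range(max_chunks):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)  # tentatively fuse the pair
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        candidate = lexicon + [a + b]
        if description_length(merged, candidate) < description_length(tokens, lexicon):
            tokens, lexicon = merged, candidate  # the chunk pays for itself
        else:
            break  # the chunk costs more than it saves; throw it away
    return lexicon, tokens
```

On a sequence built from repeated "XV" and "QT" motifs, this picks up those two chunks and then stops, because larger chunks cost more to store than they save in encoding; that stopping point is the MDL trade-off in miniature.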

But here is the rub. Using a bottom-up combination approach means that it is possible that you won't ever find the best combination of chunks because the chunks repeat with a certain distance between them, and the symbols in that separating area vary. Combination doesn't suffice under those terms, so you need the ability to semi-randomly and experimentally vary elements of the lexicon and check whether those mini-experiments improve the model. This is where we can propose a selectionist description for optimization that builds on the basic combination approach by adding a generate-and-test style approach to thinking! The end result is a greater likelihood that the resulting model will outperform other models, while requiring limited additional commitment of working memory.

Monday, March 5, 2007

Teleonomy and the Why of the Y

I was reading Dan Dennett's Breaking the Spell: Religion as a Natural Phenomenon this weekend and came across his use of the phrase free-floating rationale to describe the automatic discovery of solutions to fitness problems using evolutionary search. I think Norbert Wiener coined the term "teleonomy" to describe the same phenomenon. Dennett is very much up on the topic, so I'm surprised that he didn't cross-reference Wiener's rather nice term.

In related matters, Dennett references the notion that parasitism may be the driver for sex, originating with John Maynard Smith in "parental investment theory." The problem is that mixing one's genes together with another's needs explanation: why is sexual recombination apparently better than asexual? After all, in the latter the genes are faithfully copied without the threat of being coupled with some broken alleles that could lead to the end of the genetic lineage. The genetic mixing might be most important for immune response in (part of) Smith's reasoning, and the variability of immune response may be adequate to offset the costs of mixing.

Interesting, but there is another oddity. In mammalian males, our 23rd chromosome doesn't have backups for broken alleles (the "Y" chromosome). It turns out that my foster father asked the "why" question in the early 90s concerning Y chromosomes. The Why of the Y? Without a backup, genetic defects will build up faster in males than in females. These "sex linked defects" reduce the robustness of males in comparison to females and, when combined with aggressiveness and the physiological consequences of the hormones that drive aggressiveness, result in short, angry lives. So my foster father reasoned that the Y is a mechanism for increasing the exposure of defects under competition, driving deleterious mutations out of the population at a faster rate than if males did not have a Y. This is somewhat controversial, to say the least, because it suggests a meta-evolutionary or group selection-like mechanism, but it does provide a Why for the Y where few have asked the question before.