Archive for June, 2011

The Foundations of Measurement

June 30, 2011

There was a time in the 1960s and ’70s when mathematical psychology was considered an important field. One of the defining tomes produced in that era is the three-volume Foundations of Measurement by Krantz, Suppes, Luce and Tversky. Measurement is a particularly tricky problem in the study of the mind. There are two sub-questions here:

  1. What are we (i.e., the scientist studying human or animal psychology) measuring?
  2. What is the mind of the human or animal taking as its starting point in its engagement with the world?


The latter question is the more interesting one for us. For example, most studies of vision assume that the starting point of vision is an array of pixels on a retina or a video camera. Once you take pixels as the starting point, it is clear that a tremendous amount of processing is necessary before interesting outputs, such as 3D shape, can be recovered from the retinal inputs. At the other extreme lies Gibson’s theory of direct perception. Gibson argues that the ambient optic array is directly sampled by the moving animal or human being. In other words, what we “measure” gives us all the information about the structure of objects and surfaces in the world. The underlying theory of measurement has a direct impact on our theories of sensory and conceptual systems. To the extent that we have an impoverished view of measurement, we have an impoverished view of the mind.


Understanding Regularities 7: Organized Organisms

June 29, 2011

Organisms are both physical and experiential beings, and it is always useful to be accurate about when we are talking about one or the other. For example, it is well known that we have a blind spot in our visual field, where the optic nerve meets the retina. As far as the physical being goes, the blind spot is an area where there are no photoreceptors. However, the experiential being does not experience the blind spot as a hole in its visual field. It is not as if we walk around with a little black circle in front of us where we cannot see anything.

More generally, consider this difference between us and computers (among many other differences): while a computer will occasionally tell you that it cannot open a file or give some other error message, our sensory and conceptual systems do not give us error messages. We experience illusions but we do not experience errors. As the famous Indian philosophical example shows, we sometimes mistakenly identify ropes as snakes. But we never experience ropes as holes in the visual field. Even lesion patients do not experience the world that way; they might have hemispatial neglect and not see anything in the left-hand side of their visual field, but they do not experience the visual world as half complete.

In other words, an exceptionally strong regularity in our engagement with the world is the active organization of the world into a coherent whole, in which we are coherent individual selves. One can think of the organization of the world as a “ground regularity,” which makes “figural regularities” such as stereoscopic depth possible. Even when we are unable to perceive the figural regularity, we are still operating under the influence of the ground regularity. In fact, one could go as far as to say that the ground regularity, of which the coherent body is an important part, is the precondition for any perception or cognition of the world.

Understanding Regularities 6: Controlled Descent

June 28, 2011

Regularities are not permanent markers of the world; often they are a ledge on which to rest before climbing onward. To take the most obvious example, walking is nothing other than controlled falling. Each time you lift one foot, you are falling forward, except that you land the other foot before you keel over. Balancing on one foot is an example of a metastable state, i.e., a state that is locally stable but that will descend into a lower-energy state if perturbed slightly. At each stage, an organism needs to point out the next metastable state before pushing toward it. The continuous dance of pointing and pushing defines dynamic regularities.

It takes a lot of effort to keep a metastable state in a position of dynamic equilibrium. Think of balancing a cricket stump or a badminton racquet on one finger: you have to keep moving your finger to make sure that the stump doesn’t fall down. Each metastable state defines a dynamic regularity; there is nothing permanent in the world that is available to “pick up,” so to speak, but at any given moment, there is a next step to be taken. While metastability is obviously true of walking, it is also true, less obviously, of seeing and thinking. When you see an object, say, the famous Mona Lisa, your visual system picks up a temporary metastable regularity, which is a visual snapshot of the famous painting.

However, you might want to know more; perhaps her smile is so mysterious that you want a closer look.

That movement toward the painting destroys the earlier metastable percept and introduces a new one. The smile remains mysterious, though; the old lady never reveals all her secrets.
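The idea of a metastable state sustained by continuous correction can be made concrete with a toy simulation. This is my own illustrative sketch, not anything from the post: a linearized inverted pendulum stands in for the balanced cricket stump, and a simple corrective term stands in for the ever-moving finger. The gains and constants are arbitrary illustrative choices.

```python
def simulate(theta0, kp=0.0, kd=0.0, steps=2000, dt=0.001, g_over_l=9.8):
    """Integrate a linearized inverted pendulum (the balanced stump).

    With kp = kd = 0 the tilt grows on its own: the stump falls.
    With corrective gains, continuous 'finger movements' hold it upright.
    """
    theta, omega = theta0, 0.0            # tilt angle and angular velocity
    for _ in range(steps):
        u = -kp * theta - kd * omega      # the continuous corrective effort
        omega += (g_over_l * theta + u) * dt
        theta += omega * dt
    return abs(theta)
```

The point of the sketch is only that the upright state is not a static fact about the world: remove the stream of corrections and the metastable state descends into a lower-energy one, exactly as the post describes.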

Pointing versus Pushing

June 25, 2011

Every corporeal being is bound to classify the world into two extremely basic categories:

  1. That which can be grabbed (or grabbed by)
  2. That which cannot be reached.

More generally, for each sense, we classify the world into

  • That which is immediately available to that sense.
  • That which needs to be indexed into, in order to be available for that sense.

Indexing can take various forms, from body-muscle preparedness to eye-saccades to visual navigation. For every sense, we can make the following classification:

  1. An “actual” object (or object part) of that sense into which we have indexed, and which is available for further elaboration or manipulation. For example, having indexed into Leonardo’s Mona Lisa, we can move closer to discern whether Mona Lisa is smiling or not.
  2. A “potential” object (or object part) of that sense that will be made available to us with an appropriate amount of effort on our part.

Note that these are phenomenological distinctions; I am not talking about subconscious or unconscious representations in V1 or some other brain area. In our experience of the world, there is a basic division between those things that are immediately available through vision, hearing, touch, etc., and those that require effort. What is available transparently to one sense might require effort from another – consider the shape of a soccer ball from vision and touch. In any case, the sensory world can be divided into those entities with which we are in direct contact, and those with which contact requires effort. We can think of the spatial world in terms of a figure-ground analogy: on the one hand, as Kant pointed out, space is a basic category; it is presupposed in our understanding of anything else. On the other hand, we process detailed spatial information (where objects are, how to catch this baseball, etc.). The first can be seen as the structuring aspect of space, the second as consisting of detailed perceptual or encyclopedic information.

We can call this the pointing body versus the pushing body. The pointing body allows us to index into locations (there), objects (that!) and so on. The pushing body helps us interact with those entities that we have pointed to, but these seem to be two very distinct modes of bodily being.

The Varieties of Representational Forms

June 23, 2011

The traditional approach to modeling in the cognitive sciences starts with the identification of the problem – let us say, causal explanation patterns – and then moves on to a representational framework in which that problem can be addressed (let us say Bayesian probabilities), followed by theories that seek to explain experimental data (say, children’s use of explanations in the learning of concepts). This method has some clear advantages – it clearly circumscribes the problem within a well-defined mathematical framework and makes predictions about behavior on the basis of that model.


However, such an approach does not sit well with the fact that we use different representations for different aspects of the same cognitive act; further, the use of different representations is much more important for a living, breathing being than for an abstract account of the task. After all, Roman numerals and the modern decimal system both encode the natural numbers, but the latter allows us to compute with greater ease. It makes sense to use multiple modes of representing; different representations make different aspects of the world transparent. If one thinks of cognition as a tool that makes different aspects of the world ‘available’ to the cognizer, then each representation has its own ‘availabilities.’


Modelers should address the possibility that the human mind flexibly uses different kinds of representations depending on the task and that the same cognitive phenomena should be modeled using different representational forms. The key questions that arise are the following:


  1. What is made transparent by a given representational form?

  2. How is this representational form indexed in a given task?

  3. How are different representational forms combined in a given cognitive state/act?


We should bring together as many different representational frameworks as possible in the study of the above questions. Some of the most important representational forms are: Logic, Probabilities, Games, Laws, Categories and Stories. We should take a few ‘core’ test cases (such as the Trolley problem), show what is made transparent by each representational form in each test case, and then ask how the similarities and differences between the various accounts can be integrated.




A Typology of Beliefs

June 22, 2011

1. Introduction. The goal of this essay is to analyze the cognitive structure of beliefs. While beliefs vary tremendously, from sacred beliefs that are codified in texts to scientific hypotheses about the cosmos, I want to understand the structure of beliefs as encoded in the common sense of various human cultures. For that purpose, I focus on the kind of everyday suppositions, inferences, judgments and expectations that transpire during the process of living a life in a given social context. These beliefs are often tacit and effortless, without any overt reasoning or reflective analysis.


Suppose you are leaving home for work and while walking to the subway station, you notice that everyone is carrying an umbrella. You might want to head back home and get your own umbrella. Here, the belief that rain might be in the offing dictates your actions via an unconscious inference about the link between umbrellas and rain. Beliefs can also guide action without any mediation from inference. When your mother comes and tells you that your friend is at the door, you believe her and walk to the front of the house. There is no need for inference; your trust in your mother’s testimony is enough. Similarly, consider the following two sentences that might equally well inform our beliefs about John being late:


  1. While John was driving to his house, he remembered that he was supposed to be at a meeting and he immediately turned around and drove back to his office.

  2. While John was driving to his house he got stuck in a traffic jam behind an old red Honda.


The first statement has an inferential form of the type “I should be at work, but I am heading towards my house, therefore I should drive back,” but what about the second? It has only a temporal order, and there is no sense in which “there is an old Honda” follows logically from “John was caught in a traffic jam.” Yet most of our oral and written communication (and spoken and written language informs most of our beliefs) is of this form. Importantly, while the second sentence does not have an inferential structure, it does have a recognizable narrative structure and sounds plausible enough to our ears, unlike the third sentence below.


  3. While John was driving to his house, he turned into a dragon and ate a couple of pedestrians.

What makes the first two beliefs acceptable in ordinary discourse, while the third is not? I believe that the answer to that question has to do with the cognitive structure of beliefs. This essay is an attempt to develop a theoretical understanding of the cognitive structure of beliefs in the form of a typology of beliefs, which I believe to be a first step towards a far-reaching analysis of beliefs. An adequate typology is a precondition for any explanatory account, for it delineates the phenomena that need modelling. This essay bases its typology of beliefs on a series of interrelated claims:


  1. The production and comprehension of beliefs lies at the foundation of cognition.

  2. The structure of beliefs has two aspects: a ‘production structure’ that dictates which beliefs are generated and a ‘comprehension structure’ that dictates which beliefs are understood. Both the production and comprehension structures are further sub-dividable.

  3. The logical/inductive structure of a belief (if there is one) is a subset of the structure of production and comprehension of a belief. In particular, the production structure consists of narrative patterns more general than the inferential patterns normally studied.

  4. Further, and most importantly, beliefs are fundamentally tied to a social context, with tacit assumptions about the size of the community to which the belief is addressed.


Each one of these factors – potential narratives, the social community, the degree of acceptability, the perceived value – is fluid and changes from context to context. The religious fundamentalist who insists on a strict interpretation of a sacred text, and enforces social relations consistent with that interpretation, is often more than happy to call his friends on their cell phones to tell them about his understanding of the same text. So when is a belief change acceptable and when is it not? Some new beliefs and actions are so egregious that the originator is rebuked, ostracized or worse, while others are embraced as the next new thing. Indeed, beliefs wouldn’t change if someone didn’t come along and say something different, but, as we all know, some changes are more acceptable than others. Is it possible to study beliefs theoretically and computationally in a manner that’s sensitive to the circumstances in which beliefs (or, as is more likely, clusters of beliefs) are altered? A good place to start would be a definition; after all, we need to delimit the phenomena we want to model.


2. A Preliminary Definition. I want to define the theoretical notion of beliefs with daily-life activities in mind. Consider eating practices. In most western cultures, people eat with a knife and fork, with the fork to the left of the plate and the knife to the right. In India, in most places, people eat with their right hands. Both of these eating practices are beliefs. What is common to them (and to all beliefs in our account) are three things: they are acts that are shared across a community, they are remarkably stable over time (compared to the time scale of the act of eating), and they come in tightly coupled clusters (for example, in western cultures, the decision to put the fork to the left and the knife to the right is paired with other decisions about where the soup spoon goes, etc.). In general, I define a belief as follows:


Definition. A Belief is a guide to a stable social act directed towards a community such that:

  1. Generically, the Believer wants the community to share the contents of that belief.

  2. Generically, the community wants to accept that belief.

  3. Beliefs come in clusters, a modal combination of acts that are interrelated within a larger frame (say eating).

  4. The goal of the community is to make sure that each cluster of beliefs is transmitted successfully across time.
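The four-part definition above has a natural structural reading. As a purely illustrative sketch (the class names and fields are my own, not part of the essay), one can model a belief as carrying its content, its tacit community, and the larger frame within which it clusters:

```python
from dataclasses import dataclass

@dataclass
class Belief:
    content: str      # the guide to a stable social act, e.g. "fork to the left"
    community: str    # the tacit community the Believer wants to share it with
    frame: str        # the larger frame (say, eating) that clusters beliefs

def cluster(beliefs, frame):
    """Beliefs come in clusters: the interrelated acts within one frame."""
    return [b for b in beliefs if b.frame == frame]
```

On this reading, the western table-setting cluster groups "fork to the left" with "knife to the right" under the eating frame, and it is the cluster as a whole, not the isolated act, that a community transmits across time.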



In the next few sections, I flesh out this definition into a typology of the structure of beliefs. Our account of beliefs is based on a “structural level” of analysis, which assumes that human beliefs are not primarily created and acted upon for rational reasons, but that they still have some underlying structure. The structural level of analysis makes the following key assumption:


Between the biopsychological levels of unconscious processing and the rational agent of economics and international relations lies an autonomous level of tacit, shared beliefs and acts that are socially enacted, communicated and justified using narrative structures.

Our goal is to elucidate the structure of this level. As I have mentioned before, the structure of beliefs can be divided into two: production structures and comprehension structures. The production and comprehension structures operate against the background of three kinds of constraints: sociality, narrative/ritual structure and normativity.


3. A Typology of Beliefs: Sociality, Narrativity and Normativity.


(a) Sociality: Every belief comes with a (perhaps tacit) community that is a potential audience for the belief. When you ask your wife whether she has seen your favorite coffee mug and she replies that it is next to your computer in the study, you are both sharing a belief that has the family (where, for example, everyone knows the identity of the favored coffee mugs) as its tacit community. When talking to a stranger on the train about the death of Michael Jackson, you are sharing beliefs that bind almost everyone on earth. These beliefs are mostly traded back and forth in informal social networks that belie whatever formal affiliations we might have. Nation states, for example, have many formal institutions and hierarchies. During the periods when Pakistan is a democracy, the chief of army staff (COAS) is nominally subservient to the elected Prime Minister. However, as we all know, practice and precept are not quite the same in this case. More commonly, we are all aware of the power wielded by a Director’s or CEO’s personal assistant, a person whose official position in the hierarchy can be quite low. Informal connections go beyond the subversion of official hierarchies. They also reflect networks of patronage or friendship that stem from personal or community history. Once again, let us turn to Pakistan. We know that, as a feudal society, much of its power is wielded by a relatively small number of landed families. These families have members in the army, bureaucracy and party politics, and exert power without any explicit conspiracy to do so. The very fact that there are relationships of trust is enough to create influence.


To be more precise, a formal network captures the official links between nodes/agents occupying (typically hierarchical) positions in a network of institutions, while the informal network captures the relationships of trust, power and influence based on personal connections and shifting loyalties that mark actual human conduct. I think that each person is a member of a few (say, not more than seven) ‘typical’ social networks: family, friends, work, religious affiliation, hobbies/sports, city, state, nation. Sociality is primarily a constraint on the production of beliefs, i.e., the producer of a belief has the intended community in mind.


(b) Narrative structure. In formal networks, we can model the patterns of influence using game theory, i.e., in terms of explicit bargaining between institutional actors. Explicit goals and strategies with explicit payoffs dominate the analysis of formal networks. Informal networks, on the other hand, are not so much concerned with payoffs as they are constructed around various forms of story-telling. As anyone who goes to a South Asian bazaar knows, bargaining is itself ritualized and part of an elaborate narrative structure. One can think of the narrative structure of beliefs in terms similar to the Gricean maxims for language use, i.e., a pragmatics of belief propagation with the following principles:


  1. Communities share a common narrative. These narratives are non-accidental features in the sense that the story is a highly unlikely belief in the space of all possible beliefs, which is why it is easy to remember and propagates quickly through a community.

  2. Small sub-communities are the locus of change. In other words, these sub-communities are the people within the community who hold the communal narrative explicitly, and they are also the standard bearers of this communal narrative. The common narrative flows from them to the community and back.

  3. Most communities replicate their communal narrative in each generation, mostly without change. However, within communities that are under stress, these small groups can come under in-group pressure to compete with each other, since they all share the same beliefs. This may lead to the production of additional narrative elements or to a radical revision of the common narrative.


An important reason for the power of narratives in informal networks is the density of connections. In formal networks, agents do not know much or share much with other agents in the network. Everything that they know about each other comes from painstaking and explicit fact-finding. The Cuban missile crisis is a good example of a situation where the cultural and political distance between the two sides was such that there were no established stereotypes or patterns of engagement. The India-Pakistan situation is rather different. The two sides share a long and contentious history, ways of thinking and other commonalities that make the gestalt laws of behavior far more applicable. Here, political narratives about ‘us’ and ‘them’ are as important as explicit goal-setting.


Narratives themselves have a rather complex typology. A full typology of narratives is beyond the scope of this essay, but I highlight five different narrative strategies that correspond to five different types of beliefs:

  1. Temporal narratives: These are narratives that consist of a sequence of events recited in temporal order, such as “This Sunday, I need to go shopping for shoes and I also need to make a trip to the grocery store.” In temporal narratives, there is no logic or overt cause connecting the different events in the narrative.

  2. Causal narratives: These are the narratives of causation in common sense as well as scientific reasoning, like “Mosquitoes cause Malaria.”

  3. Rational Narratives: These are narratives that adduce reasons for events being the way they are, such as “He could not have murdered Mr. X since he was out of town that day.”

  4. Habitual/ritual narratives: These are narratives that recite how something is to be done because of an established protocol or because of cultural tradition. Recipes, ritual acts and daily routines fall under this heading.

  5. Analogical narratives: These are narratives of the form “X is so and so, because X is like Y,” for example, when you say “John has his father’s temper.”


The production of beliefs is determined by a combination of these narrative structures. Religious beliefs, for example, can be a complex combination of causal (God created the world in seven days), rational (thou shalt not), and ritual (going to church on Sundays) narratives.
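The claim that a belief's production structure is a combination of narrative strategies can be sketched as a small enumeration. This is purely my own illustrative rendering (the names are hypothetical, not the essay's):

```python
from enum import Enum, auto

class Narrative(Enum):
    TEMPORAL = auto()      # events recited in temporal order
    CAUSAL = auto()        # "mosquitoes cause malaria"
    RATIONAL = auto()      # reasons adduced for events
    HABITUAL = auto()      # recipes, rituals, daily routines
    ANALOGICAL = auto()    # "X is so and so, because X is like Y"

# A belief's production structure as a set of narrative strategies,
# following the religious example from the text:
religious_belief = {Narrative.CAUSAL, Narrative.RATIONAL, Narrative.HABITUAL}
```

Representing the production structure as a set, rather than a single label, captures the point that a belief rarely rests on one narrative strategy alone.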


(c) Normativity. If sociality and narrativity are about the production of beliefs, normativity is primarily about their comprehension. We are constantly evaluating beliefs according to tacit norms of conduct. Here, I single out three norms of belief evaluation: acceptability, certainty and sacredness.


  1. Acceptability: Every belief/act is evaluated for how acceptable it is in a given social context. Ties and shoes are important for best men but not on the tennis court.

  2. Certainty: We evaluate every belief according to the degree of certainty we grant it. We are far more likely to believe that it will rain today than that we will win the lottery today.

  3. Sacredness: Every belief is rated for its value to our general conceptual/emotional system. Some beliefs are entirely negotiable – whether it will rain today or not being one – and others are entirely non-negotiable – religious beliefs for example. Correspondingly, we might reason about beliefs using different patterns; we are utilitarian about commodities we buy in a supermarket, but very careful about our children’s education.


There are perhaps many other tacit norms that come into play in the evaluation of beliefs, but the general idea remains the same: beliefs are produced with a narrative structure and target audience in mind and evaluated with a battery of norms suitable to the context. So the macro-typology of beliefs as I conceive them looks like the figure below:

4. Discussion and Conclusions. Normally, beliefs are individuated according to their content so that the usual classification takes the form: religious beliefs, beliefs about nature, beliefs about social relations etc. Our approach is more abstract; beliefs are classified structurally according to abstract principles that are common to beliefs independent of content. For example, when evaluating the degree of sacredness of a belief, one person might concentrate on its religious foundation while another might look for secular values such as its environmental sustainability. Further, an inviolable value in a given context (such as brushing teeth before bed) might not generalize at all to other social contexts. Nevertheless, all of us evaluate a belief for its sacred value in a given context. Therefore, we can be confident that sacredness is part of a cognitive typology of beliefs. The same argument applies to the other structures: sociality, narrativity and normativity. All of these appear essential to the cognitive structure of beliefs.


Since beliefs are remarkably varied, our typology is only a first step in uncovering the cognitive structure of beliefs. A further refinement might involve the kind of action/knowledge guided by the belief. For example, beliefs guide daily rituals (you might always shower before going to bed, while another person might shower first thing in the morning), common-sense knowledge (clouds of a particular color might be seen as rain-bearing clouds), stereotypes (you might prefer one neighborhood grocer to another because of the perception that the preferred one gives you the best vegetables) and various preferences (for food or clothing, for example). Some of these forms are tied to knowledge (common-sense beliefs), while others have no truth value even in principle (food preferences, for example); some are guides to action (such as daily rituals); yet others are dual encodings (stereotypes regulate knowledge as well as action). The outcome of a belief is a dimension that can be added to the typology in figure 1. While the details of the typology will change, the principles underlying it remain the same: the structural categorization of beliefs should be based on the regularities underlying daily-life phenomena like eating food, driving to work, saying your prayers, making phone calls. The world hangs together just fine for most people most of the time because it is “deeply regular.” The science of beliefs should be the study of these deep regularities, of which our typology is a first, rough analysis.


To summarize, the typology of beliefs outlined in this essay makes the assumption that beliefs are fundamentally social mental states. The sociality of beliefs makes narrativity a crucial condition for producing a belief and normativity a crucial criterion for evaluating a belief. However, beliefs are not entirely social; they are also tied to knowledge and action in the world of physical objects. For beliefs tied to knowledge, truth is another norm. For beliefs tied to action, effectiveness is a relevant norm. The typology of beliefs raises some natural questions about the other norms that regulate beliefs, such as “what is the relationship between acceptability and truth?” and “how do unacceptable beliefs become acceptable, and vice versa?” These are questions that point the way to a larger cognitive and computational exploration of belief.




Understanding Regularities 5: Regular Gestalts

June 21, 2011

It is easy to think of a regularity as a platonic ideal, an eternal form that regulates a particular phenomenon or process. However, the platonic regularity is not the view I have in mind. In fact, a regularity is a pattern that regulates the response of an organism to its surroundings in the here and now. It is a momentary grasping that passes as soon as the object is grasped. There is no such thing as an abstract roundness as a regularity independent of the round ball in my palm. At the same time, the roundness of the ball is related to the roundness of apples. The universal inheres in the particular, as the Naiyayikas liked to argue.

In physics, it is possible to reify regularities and make them autonomous entities. We can make Newton’s laws into universal principles that live outside of time and space. Whatever the merit of that view, we cannot afford to adopt that position in organismic biology. As far as any organism goes, what is real is what is present to it in the here and now. Regularities are part of the furniture of the organism’s immediate universe. Equally importantly, regularities never present themselves in isolation. Newton’s laws can be separated from other laws that regulate a piece of matter, such as the laws of thermodynamics. However, a regularity always appears in conjunction with other regularities in an organism’s environment. Roundness never appears by itself; it comes with a certain texture and density in balls, which differentiates balls from apples. The roundness is never fully separable from the other regularities. Regularities are always part of Gestalts.

Let me end with a quote from Wolfgang Kohler’s “An Introduction to Gestalt Psychology” (p. 62):

Our view will be that, instead of reacting to local stimuli by local and mutually independent events, the organism responds to the pattern of stimuli to which it is exposed; and that this answer is a unitary process, a functional whole, which gives, in experience, a sensory scene rather than a mosaic of local sensations.

A Room of One’s Own: The Where of Emotions

June 21, 2011

1. Introduction. Emotions are everywhere, or so it seems. Antonio Damasio talks about the importance of emotion for reason. Martha Nussbaum talks about the importance of reason for emotion. Yet, there are reasons to think that emotions are the most private, the innermost of our experiences. A pain or a colour can be pointed to; if we are asked where it is hurting, we can say ‘there.’ However, it is much harder to answer the question ‘where are you angry?’ The location of the anger might vary from moment to moment and person to person. Emotions are far more dynamic: roses might always be red, but I am not always blue. Indeed, one can define outer space as the space of all locations that can be pointed to – itself a privileging of the sensorial, especially the senses of vision and touch. Yet, space might not be only sensory space, the space of objects. Emotions, like thoughts and our eyes and fingers, are pointers; and they cannot point to themselves. We can define the distinction between inner and outer space as the distinction between the pointer and the pointed. While the geometry of the pointed is directly available to us, the geometry of the pointers is also a genuine spatial geometry, which is why I think that emotions are always somewhere and that the spatiality of emotions is a useful window into the relation between inner and outer space.

Let me start this piece with an invocation of a seemingly unrelated problem, i.e., the intractability of subjective consciousness. The argument goes as follows:

  1. We are fully, certainly aware of our own consciousness.

  2. We are infinitely far away from knowing the subjectivity of others.

From these premises, one of two conclusions is typically drawn:


  1. Inner space and outer space are permanently divided from each other.

  2. The inner is really not a space at all; it is, in fact, a disembodied non-spatiotemporal soul. Or, in the modern, David Chalmers-style argument, consciousness is an independent dimension of existence, like space and time.

I am aware that I am condensing a whole range of arguments and subtle differences into one, but I do believe that the core argument schema in all of these arguments is similar enough to the one above that we can be happy with the caricature. Think of this, if you will, as an argument prototype from which different particular arguments can be derived by metaphorical extension. Let me now recast this argument in a geometric form:

  1. The Geometry of Outer Space: continuous, indivisible and unlimited

  2. The Geometry of Inner Space: discrete, monadic, severely limited

→ G_O ≠ G_I, and therefore the two have nothing to do with each other.


2. Counter Arguments: The role of motion and emotion in the constitution of space.


If we were really to think of emotions as like bodily tugs or stabs or flashes, then we would precisely leave out what is most disturbing about them. How simple life would be if grief were only a pain in the leg, or jealousy but a very bad headache. (Martha Nussbaum, Upheavals of Thought, p. 16)

I am now going to present two counter arguments against the division of inner space and outer space, with the intent of dissolving the distinction. An analysis of emotion will play a crucial role in dissolving the distinction.

Case 1: Is the stinging bee angry? Counter argument 1: Emotion is a bridge between inner and outer space. In fact, both motion and emotion serve the same purpose, i.e., as a bridge between inner and outer. Main points:

  1. Space is constituted through motion and emotion.

  2. Consider the following diagram:

[Diagram not reproduced.]

Consider a man standing in front of a scene, surveying it with a cool eye – perhaps Descartes contemplating the world in those moments1 when the furnace becomes claustrophobic, or Cortez surveying the empire of the Inca. The world appears in a uniform, outer geometric light – distantly arrayed, frozen in time (before the invading hordes destroy it forever!). Let us call this the “light map”. Now consider a different man: older, his eyes are fading and he has to walk with a stick. He hobbles from frame to frame, holding on to the walls and the furniture, stopping to rest every once in a while. The floor is too warm, but the walls feel cool to the touch, and perhaps the cane furniture is too rough for him to drag his hands across. Which one of these is the ‘real’ space? The correct answer, as I ‘see’ it, is – both!

What if inner and outer space were replaced by a cluster of topographies? We still accept that there is a distinction worth making between inner and outer, but we replace the absolute distinction by a family of interrelated spaces, each mediated by a dominant sense – vision, touch, proprioception etc. Let us call these interrelated spaces a topographic cluster and each map within that cluster is a spatial map: a light map (for vision), a heat map (for touch) etc. Note that these maps are not in the brain (as topographic maps are typically assumed to be) but out there in the world. Both emotions and motions are acts that bind these topographic clusters together. Some spaces are inviting, others are creepy. Each emotion binds the different topographies into a single readiness to act. Each motion binds the different topographies into a single performance. This interplay of action potentials and acts is key to understanding motion, emotion and space.

So what does the picture look like?

Before: Inner space and Outer space.

After: A topographical cluster of maps bound by emotion and action.

Which leads to a question: how does emotion act as a binding agent?


Case 2: The Haunted House. Imagine walking on a dark, rainy night in a secluded part of town. You see an abandoned nineteenth-century house shrouded by tall trees. The yard of the house is littered with grotesque sculptures. As you approach the house, the wind starts picking up; at the edge of the plot, a gargoyle greets you with open jaws. A flash of lightning strikes the turrets and the accompanying thunder is louder than anything you have ever heard before. Your heart races, your hair stands on end, and even without realizing it, you are running as fast as you can away from the house. A few hundred metres down the street you start slowing down. In the distance, the house appears run-down but benign. You shake your head, grin to yourself and keep walking.

What is the moral of the story?

  1. A potential action is an actual emotion.

  2. A potential emotion is an actual action.

  3. Space is constituted by situations: complexes of potential and actual (e)motions and motives.

  4. None of this is in your head.

The actual/potential axis explains how emotions and actions bind topographic maps. The key theoretical construct is that of a situation. A situation is exactly what you might expect: a combination of objects and events in the world in which an organism is embedded. For example, if you are walking in a forest and a cobra rears its head in front of you, you are in a situation. Every situation calls forth a unique topographic cluster. The fear you feel in front of that cobra is constitutive of that situation. That fear leads to action – stepping back slowly (good!), running away (bad!). It is the fear (actual) to running (potential) axis that binds the various maps into a topographic complex that defines the organism’s response to the situation.


Summary: Situation 1 → (emotion) → Topographic Cluster → (action) → Situation 2

3. The role of space in the constitution of emotion. So far, I have only talked about the role of (e)motion in the constitution of space. However, the opposite is equally true.

[Asterix clips not reproduced.]

Aristotle once asked the following question: how do you know whether you are seeing or hearing or touching? Similarly, we can ask: how do you know whether you are afraid or happy or sad? The above clips from Asterix give us a few clues: each emotion is associated with a class of situations (where that emotion is reliably evoked) and in each such situation, there is a topographic cluster in which the emotion has a spatial footprint (metaphorically speaking). For fear, there’s the combination of the threatening object, hair standing on end, stomach churning etc. These visual/gustatory/proprioceptive maps are constitutive of fear. The emotion then is constituted by elements of the current, actual situation and elements of future, i.e., potential situations. Here, it is the shift from the actual to the potential that does the binding. The bodyscape embedded in the landscape is as much a part of emotion as the emotions are part of the body/landscape. To summarise, there are two pictures:

  1. The topographic cluster picture – in which emotions and actions play a binding role via the potential/actual axis running from emotion (actual) to action (potential).

  2. The affective cluster picture – in which space plays a binding role via the potential/actual axis running from situation (current, actual) to situation (potential).

Another way to put it is as follows. Think of Self, (E)motion and World as a triad (see figure below).

[Figure not reproduced: the Self, (E)motion, World triad.]


Then, we can cluster this triad in two distinct ways: from the world to the self and from the self to the world. In the former, we start with situations that trigger topographic maps that in turn are bound by motions and emotions. In the latter, we start with motives that trigger affective clusters that in turn are bound by situations. Perceptions and emotions have complementary roles (Martha Nussbaum points out a version of the latter) in that perceptions arise from (are constituted by?) the self going out to meet the world, while emotions arise from (are constituted by?) the world coming to meet the self. In the former, the world is the foreground and the self is the background, which is why we see objects and not the self; in the latter, the self is in the foreground and the world the background, which is why we feel the self and not the object. There is no point asking which clustering (i.e., world to self or self to world) is more fundamental; they are just two poles of the self–world axis.

Two more arguments:

The Colour Analogy: What if emotions are to the body as colours are to objects? Are colours spatial? On the one hand, it seems as if the redness of a rose has nothing to do with its shape or spatial distribution – a rose chopped up into a million pieces will still be as red – but on closer consideration colour is always co-present in space. In fact, we can argue that colour inheres in a spatial locus. Similarly, we can argue that anger is not located anywhere in particular in the body, nor tied to any single situation, but we can also argue that emotions always inhere in a spatial locus. There is a major difference though: emotions are fleeting while colours are somewhat more permanent. Nevertheless, note that the analogy seems to have some traction; after all, we do label emotions using colours (red-hot anger) and colours with emotions – a calm blue.

The indexical argument. Whatever else one might say about an emotion, we can always be assured that it has an “I,” a self to which it is attached. There is anger, but it is always my anger or your anger – even if it is the same emotion in both of us. Further, there is no such thing as an “I” that doesn’t have a location. After all, the referent of the linguistic term “I” is either you or me depending on the fact that I am here and you are there. Spatiotemporality is central to the “I.” While we could argue that the spatio-temporal location isn’t essential to the self, it is nevertheless necessary. The relation between emotions and space-time can come under the category of relations that are necessary without being essential (note the logic of modality once again). In general we could argue that the relation between space, self and emotion is within this category of necessary but non-essential relations. So the argument goes as follows:

  1. Emotions have an index.

  2. Indexes always have spatio-temporal location.

  3. Therefore emotions have a spatio-temporal location.

4. Take-home messages:

  1. Three key concepts: cluster categories, situations and the potential/actual axis.

  2. Inner space and outer space are not distinct. In fact, we should replace them with topographic clusters that are bound together by (e)motion.

  3. Emotions and space are not distinct. In fact emotion is constituted by an affective cluster that is bound by situations.

  4. The logic of modality and the logic of (e)motives are closely interrelated.


5. Conclusion. I started this essay with an argument schema that creates an ontological divide between inner and outer space. Then, I presented arguments to show that this divide is not tenable. We are now ready to revisit the original question. If we accept the argument that inner and outer space are to be replaced by topographic clusters on the organismic side and situations on the world side, what happens to the problem of subjectivity? The answer is that we should replace the certainties of our own experience and the radical doubt about other minds with the actuality of our experience and the potentiality of others. Our minds are available to each other (potentially) even if they are not actually present to us now. However, note that even our own minds have a version of this problem. If we think of inner space as being ‘in touch with oneself’, then vision is a particularly bad way of self-knowing, for it has no access to the self. To the extent we can touch others, we know their inner space as well. The hard problem of consciousness is, like many other seemingly intractable metaphysical puzzles, as much an artefact of the theoretical primacy of vision as anything else.

In the twentieth century, philosophers thought that they would reduce metaphysical problems to problems of logic and language; this faith in logic and language is shared by otherwise radically different philosophers, from Russell and the early Wittgenstein to the logical positivists to the late Wittgenstein, the Behaviourists and the ordinary language philosophers. All of them claimed that the misuse of logic and language is responsible for a host of false philosophical problems, from the problem of existence onwards. Their hope was that a suitable re-description of the problem in a logically precise language, or a careful analysis of ordinary language, would make these problems disappear. In the late twentieth century and early twenty-first century, language and logic have been replaced by the mind, i.e., many philosophers and neuroscientists now claim that classical metaphysical problems will either be eliminated or reduced to scientific questions in neuroscience and cognitive science. From the nature of religion to the origins of mathematics, neuroscientists and cognitive scientists believe that the foundations of human existence are to be found in these fields. While I think we always learn much from a deep engagement with nothing-buttery, we need a method to engage with metaphysical problems not by thinning them down, but by thickening them.

I would like to propose a different method to transform metaphysical problems. I believe that many of these problems arise not from a mistaken use of logic or language or a poor appreciation of brain science. Instead, I think that many of these problems arise from an oversimplification and underestimation of the complexity of the human world. What we need are not reductions but enrichments: an understanding of the web of relations that connect body, mind and world, and the ability to expand reductive concepts like soul, certainty, etc., into thicker, fully fleshed forms. Aristotle starts his Physics by saying that we should first understand the principles of any domain we want to investigate; by principle, he might well have meant the methods and the support of an investigation as much as its laws. Here, I am arguing that the human world is a good principle; it is the support of any investigation into metaphysical problems. Instead of reducing these metaphysical problems to language or brain science, we should enrich, expand and then release these problems into the human world. A version of this project that is an enriched language philosophy would use metaphor to expand the range of linguistic supports of a metaphysical problem, rather than use formalisms and syntactic considerations to reduce metaphysics to logic or grammar. Once that is done, these classic problems in metaphysics will not disappear, for that would be tragic as well as boring. Instead, they will become fertile territories for a combination of philosophical, scientific and humanistic investigation. I hope that my thickening of the problem of consciousness has convinced you – or at least piqued your interest – to shift gears from reductive principles to expansive principles.

1 It is no surprise then that the division between soul and body is tied to Descartes’ experience of the tension between two spaces – the claustrophobic furnace and the open but dangerous external world, with the threat of persecution awaiting at the doorstep.


Cognitive Regularities 2: Perceptual explanations

June 20, 2011

The last fifty years have seen a great expansion of our knowledge of the mind/brain and its relationship to the external world. Significant advances have been made on the experimental front, in areas as diverse as Psychophysics, Neurophysiology and Cognitive Psychology. However, the corresponding advances in theory have not materialized. Some of the disciplines within the mind/brain sciences have developed deep theories of the corresponding mental faculty, linguistics and vision being two notable examples. However, these theories are specific to the particular mental faculty and do not generalize well. Ideally, a complete theory of the mind/brain should provide a unified account of the mind/brain within an explanatory level and also across explanatory levels1.

The main claim being advanced here is that mental systems as well as the world are driven by “strong” regularities. Consequently, there is no principled reason to think of mental processes as being different from world processes, i.e., there is no special mind-world barrier. It seems quite possible that the underlying dynamics of mental and world processes are constrained by the same regularities. If so, one of the tasks of the cognitive theorist is to explicate these common principles2.

That is to say, we should have a theory that leads to horizontal as well as vertical unification. By horizontal unification I mean a set of principles that capture the inherent structures that are common to the various subsystems of the mind/brain of an organism, as well as its environment, at a given level. The problem of vertical unification is that of relating principles across levels. In this piece, I restrict myself to the question of horizontal unification at one particular level, an abstract level that I call natural structure.

I prefer to call this level “natural structure” instead of the often-used terms “computation” or “information processing” for a reason. Although we speak of computational constraints, these are often driven by the underlying generative processes and the structures that support these processes. Computation as it is usually conceived does not always capture the inherent form of these processes. For example, many natural constraints are geometric in nature: finding part boundaries in images or describing the shape of a smooth object (Koenderink, Hoffman-Richards). Another example is the relationship between the size of an animal and the speed at which it walks/runs (d’Arcy Thompson). All of these constraints are modal regularities (Richards). There is a common underlying architecture behind all of these structures and processes. Some of these processes may well be computational, but others may not be. Nevertheless, at an abstract level, they have the same form. The emphasis on computational schemes or on representational forms is mistaken. Mental structures should be regarded as explanatory structures that are not tied specifically to logic, language, pictures etc., but rather to models that are constrained by the intrinsic regularities of the mind and the world.


2. The Main Assumptions. The arguments here rest on six assumptions that I think are common to all good models, whether they be perceptual subsystems or a scientist’s model of some aspect of the world.

  1. The world is regular: These world regularities require an explanation.

  2. Generative Processes: The regularities are a result of generative processes.

  3. Closed World: In a given model, there is a set of variables, quite likely a small number, that captures all the relevant regularities.

  4. Universal Principles: There are universal principles that determine the set of generative processes.

  5. Symmetry within the closed world: The universal principles apply uniformly to all objects in the closed world.

  6. Horizontal levels: Every model has a particular scale3 of operation, i.e., it is valid only at a certain scale or set of scales. The scales are part of the definition of the closed world. The universal principles are valid only for those scales.

3. An Outline of the Common Architecture: How do these assumptions translate into a proposal for the study of the mind-world relationship? In the approach taken here, the basic architectural unit of the perceiver/world is a quasi-modular (QM) process. Each QM process consists of a frame, a strong dynamical procedure and a feature lattice (of non-accidental features). The main difference between world processes and mental processes comes in the relationship between the feature lattice and the dynamical procedure. The dynamical procedures in world processes causally produce regularities that are then organized into a structure lattice, while mental processes are “explanations” of features in a preference lattice. Since mental processes have to perform in real time in a rapidly changing environment, they are selective in what they choose to explain and are also highly context sensitive. In any given situation, there are many processes interacting simultaneously. The interactions take place at interfaces. Each interface is a map from the feature lattice of one QM process to the feature lattice of the other QM process. In particular, the structure lattice/preference lattice correspondence is the analog of the representation of world regularities by mental regularities. Note, however, that the dynamics of the system is not driven by the feature lattice correspondence.

The dynamical account of mental function has many advantages over the representational account. It takes for granted that the mind as well as the world are highly structured entities with striking regularities of their own. If that is the case, it becomes impossible to hold on to the notion that mental regularities mirror world regularities, as the representational account demands. It seems to us that in order for the structure of the organism to be isomorphic to the structure of the world, a very impoverished version of organismic and world structure would have to be true4. Linking the mind and world by means of the correspondence between the structure and preference lattices is a way to understand and quantify the mind-world relationship without imposing the extra burden of representation. The structure/preference lattice correspondence keeps perception/cognition robust, since the explanatory processes are tied to non-accidental features in the preference lattice. Non-accidental features in the preference lattice correspond to non-accidental features in the structure lattice, which in turn are tied to processes in the world. Therefore, the processes postulated by the new theory do not change the survival value of perception/cognition for the organism.

Remark: Postulating the existence of a computational level seems like a restatement of the hardware-software distinction, or of the existence of different levels of analysis. In both of these cases, the reason for abstraction is epistemological and practical, i.e., it is assumed that we abstract away from reality because information processing is hard to discover amidst all the biological and physical details. However, not all abstractions are epistemological in nature. For example, Newton’s laws of motion are also abstractions, but everybody thought that they described the world directly. Let us call the two kinds of abstraction described above weak abstraction and strong abstraction, respectively.

The centrality of computational structure/design implies that it is a strong abstraction and that it describes an aspect of nature5.

Every process takes place in a spatio-temporal framework6. The range of possible spatio-temporal frameworks is enormous, from concrete objects, e.g., a canvas for painting pictures, to abstract mathematical objects like vector spaces. A frame is a spatio-temporal framework that is an explicit embodiment of certain aspects of the spatio-temporal structure of the world/perceiver. It serves two purposes, the first of which is to provide a setting in which processes, actions and events can take place. Second, because it makes some computational objects explicit, it filters out unnecessary elements of the world/perceiver. In other words, only those objects that are made explicit can participate in a process that is enabled by a given frame. To use a statistical metaphor, a frame is a meta-prior that constrains and determines the space of all acceptable hypotheses. For taxonomical purposes7, we can divide frames into two kinds: world frames and mental frames.

A prototypical example of a world frame is a sand dune in a desert. Many dynamical processes take place on the surface and in the interior of a dune. So, what is the explicit structure of the dune? The explicit structure of the dune is a three-dimensional co-ordinate frame along with the relative locations of the particles of sand in the dune. In order for a dynamic process to leave its mark on a dune, it needs to move some particles of sand. For example, a lizard walking on the surface of the dune leaves a trace while sunlight does not. Note that this frame is inherently computational. The actual physical make-up of sand is irrelevant. A computer simulation of small, three-dimensional particles affords the same processes as a real dune. Good examples of mental frames are image-centered and object-centered reference frames for representing objects. A more interesting example is the canonical vertical axis imposed by humans prior to the formation of the percept (see figure 1).
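The sand-dune example can be put in sketch form. The Python fragment below is a minimal, hypothetical rendering (the class and names are mine, not the text’s formal apparatus): the frame makes grain positions explicit, and only events that move grains can leave a trace; everything else, like sunlight, is filtered out.

```python
# A toy sketch of a frame as explicit structure that filters events.
class DuneFrame:
    def __init__(self, grains):
        # Explicit structure: grain id -> (x, y, z) position.
        self.grains = dict(grains)

    def apply(self, event):
        """Apply an event; return True if it left a trace (moved grains)."""
        moved = event.get("moves", {})
        for grain_id, new_pos in moved.items():
            if grain_id in self.grains:
                self.grains[grain_id] = new_pos
        return bool(moved)

dune = DuneFrame({0: (0, 0, 0), 1: (1, 0, 0)})
lizard_step = {"name": "lizard", "moves": {0: (0, 0, 1)}}  # displaces a grain
sunlight = {"name": "sunlight", "moves": {}}               # displaces nothing
```

On this encoding, a simulated dune and a real dune afford exactly the same processes, which is the sense in which the frame is inherently computational.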

Most frames support both static and dynamic procedures. For example, three generic points in two-dimensional space form a triangle, an example of a relationship that is satisfied by the three points. A relationship is a static procedure, since the process by which the relationship came into being is not specified in the relationship itself. At the same time, the three points could be the product of a transformation applied to some other three points, in which case the relationship is the outcome of a process. We can roughly classify procedures as follows:

Type 1 procedures: These consist of nothing more than a frame F and individual objects supported by the frame, for example, three points in two-dimensional space, where the relationship between the three points is not explicit.

Type 2 procedures: Here the procedure consists of some objects, O, and a set of relations, R, where each relation is a predicate over the set of objects, for example, three points in two-dimensional space that (explicitly) form the vertices of a triangle.

Type 3 procedures: In a Type 3 procedure, P consists of a generative procedure that generates the given relations between objects. Assume that the triangle in a Type 2 procedure is produced by stretching and dilating a standard equilateral triangle, whose sides are all of length 1. In this case, the generative procedure consists of the “stretch” and “dilate” transformations.

Type 4 procedures: A Type 4 procedure is the strongest possible procedure. Not only do we have a generative procedure, but the underlying dynamics is also known, i.e., we know the mechanism that leads to the generative procedure. In the triangle example, this is equivalent to saying that the equations that generate the stretch and dilate transforms are known, along with the order in which they are applied to produce the final triangle. The dynamical mechanism is a meta-procedure that generates the given generative procedure in a Type 3 procedure in the context of the frame F. From now on, Type 1 and Type 2 procedures will be called static procedures, while Type 3 and Type 4 procedures will be called dynamic procedures.
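The four procedure types can be sketched as progressively richer data structures. The Python below is a minimal illustration using the triangle example; all names and encodings are my own assumptions, not part of the text.

```python
import math

FRAME_2D = {"dimensions": 2}  # the frame F: two-dimensional space

# Type 1: a frame plus bare objects; no relationship is explicit.
type1 = {"frame": FRAME_2D,
         "objects": [(0.0, 0.0), (2.0, 0.0), (1.0, math.sqrt(3))]}

# Type 2: objects plus an explicit relation (a predicate over the objects).
def forms_triangle(pts):
    """True if the three points are non-collinear, i.e. form a triangle."""
    (x1, y1), (x2, y2), (x3, y3) = pts
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) > 1e-9

type2 = dict(type1, relations=[forms_triangle])

# Type 3: a generative procedure that produces the relation -- here, the
# triangle is generated by dilating and stretching a unit equilateral triangle.
UNIT_EQUILATERAL = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]

def dilate(pts, s):
    return [(s * x, s * y) for (x, y) in pts]

def stretch(pts, sx, sy):
    return [(sx * x, sy * y) for (x, y) in pts]

def generate():
    return stretch(dilate(UNIT_EQUILATERAL, 2.0), 1.0, 1.0)

type3 = dict(type2, generator=generate)

# Type 4: the generative procedure plus its dynamics -- the exact transforms
# and the order in which they are applied are made explicit.
type4 = dict(type3, dynamics=[("dilate", {"s": 2.0}),
                              ("stretch", {"sx": 1.0, "sy": 1.0})])
```

Each type strictly extends the previous one, which mirrors the ordering in the text: static procedures (Types 1 and 2) record only what holds, dynamic procedures (Types 3 and 4) also record how it came to hold.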

Within a given frame, relationships between objects or events can be divided into two classes: generic and non-accidental. Generic relationships are those that have a high probability of occurring, given the constraints imposed by the frame. Non-accidental relationships are those that have a low probability of occurring in the context of the frame. Consequently, non-accidental features are strong indicators of further constraints in the form of static or dynamic procedures. For example, three points in 2D space are generically not collinear. Therefore, collinearity is a non-accidental feature of three points in 2D space. Some features are more non-accidental than others. For example, if we take four points in 2D space, the generic relationship is that of a quadrilateral. A rectangle is a non-accidental configuration, since the opposite sides are parallel. However, a square is even more non-accidental, since all the sides are of the same length. The features in a given frame8 can be arranged into a lattice where the features lower in the lattice are more non-accidental than the ones above.
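The claim that collinearity has low probability under the generic constraints of the frame can be checked with a quick simulation. This sketch (purely illustrative, my own encoding) samples random triples in the unit square and counts how often they are even approximately collinear; the rate shrinks toward zero as the tolerance shrinks, which is exactly what makes collinearity non-accidental. It also encodes the quadrilateral/rectangle/square ordering as a toy fragment of a feature lattice.

```python
import random

def near_collinear(pts, tol=1e-3):
    """True if the triple is within `tol` of exact collinearity."""
    (x1, y1), (x2, y2), (x3, y3) = pts
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) < tol

random.seed(0)
TRIALS = 100_000
hits = sum(
    near_collinear([(random.random(), random.random()) for _ in range(3)])
    for _ in range(TRIALS)
)
rate = hits / TRIALS  # small, and -> 0 as tol -> 0: collinearity is non-accidental

# A toy fragment of the feature lattice for four points:
# lower entries are more non-accidental than the entries they point to.
FEATURE_LATTICE = {
    "quadrilateral": [],             # generic
    "rectangle": ["quadrilateral"],  # non-accidental: opposite sides parallel
    "square": ["rectangle"],         # even more non-accidental: equal sides
}
```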

Definition. A dynamic process is a triple (F, P, L), where F is a collection of frames, P is a collection of dynamic procedures, and L is the lattice of generic and non-accidental features in the frames of F.

So far, all the concepts have dealt with an individual QM process. However, in a typical situation in the real world, there are many different interacting processes. The architecture of QM processes should somehow reflect the fact that multiple interactions are the norm. This is where quasi-modularity comes into the picture. A modular process is one where the procedure is independent of any input coming from interactions with other processes, i.e., it is an automatic process. Modular processes are not very context sensitive and do not have the flexibility that is demanded in a rapidly changing environment. However, they are very robust, since their input-output mapping is very well defined. As a consequence, they can be used as components in a variety of complex tasks. A quasi-modular process is a generalization of a modular process that retains the robustness of modular processes while being flexible. In a quasi-modular process, the triple (F, P, L) comes with a finite (usually quite small) number of switches. Each switch opens a frame or starts a dynamic procedure. For a given process, Q, at a given time, the processes with which Q is currently interacting determine the set of “on” switches.

In particular, if the number of switches is zero, we end up with a modular process. Quasi-modular processes are highly robust, since the dynamic procedures themselves are immune to change from external interaction. However, they are highly flexible, since a small number of quasi-modular processes are enough to meet the combinatorial demands imposed by a changing environment9.
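A quasi-modular process can be sketched directly from this description: the triple (F, P, L) plus a small set of switches whose state is set by the processes currently interacting with it. The Python below is a minimal, hypothetical rendering (class and method names are mine); with zero switches it degenerates to a modular process.

```python
class QuasiModularProcess:
    """Sketch of a QM process: a triple (F, P, L) plus a few switches."""

    def __init__(self, frames, procedures, feature_lattice, switches=()):
        self.frames = frames                    # F: collection of frames
        self.procedures = dict(procedures)      # P: switch name -> procedure
        self.feature_lattice = feature_lattice  # L: feature lattice
        self.switches = {s: False for s in switches}

    @property
    def is_modular(self):
        # Zero switches: an automatic, fully modular process.
        return not self.switches

    def interact(self, active_neighbours):
        # The processes currently interacting with this one determine
        # which switches are "on".
        for s in self.switches:
            self.switches[s] = s in active_neighbours

    def run(self, stimulus):
        # The procedures themselves are immune to external change;
        # interaction only selects which of them fire. A modular
        # process (no switches) simply fires all of its procedures.
        on = [s for s, state in self.switches.items() if state]
        names = on if self.switches else list(self.procedures)
        return {name: self.procedures[name](stimulus) for name in names}
```

For instance, a process with switches for two procedures that is currently interacting only with a neighbour tied to the first switch fires only that procedure; the same triple with no switches behaves as a fixed modular component, which matches the claim that a zero-switch QM process just is a modular process.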

Finally, we come to the relationship between the dynamic procedure, P, and the feature lattice, L. Here, we see the difference between world processes and mental processes. In the case of world processes, the relationship between non-accidental features and the dynamic procedures is very simple: each non-accidental feature is caused by a dynamical procedure, i.e., the dynamical procedure leaves the non-accidental feature as a trace. In the case of mental processes, the relationship is quite a bit more complicated. The origin of the complicated relationship between non-accidental features and mental processes stems from the fact that the mental process is designed for use by the organism. Therefore, the set of non-accidental features made explicit by a mental frame is much smaller, and the relationship between the non-accidental feature and the mental procedure is that of explanation, not causation. Explanation is a concept that is relatively hard to pin down; however, a couple of things can be said about any explanatory process:

(1) An explanatory process is typically very selective in what it chooses to explain. However, if a feature is explained, it is likely to be a highly non-accidental one: the more non-accidental a feature is, the greater the probability that it is explained. This makes good sense, since at any given time there are many world processes producing regularities, but only a few are of importance to the organism, and those are the ones worth explaining. Another way of stating the selectivity of explanatory processes is that they index into the right mental frame and therefore into the right non-accidental features. Indexing is the act of choosing the right mental representation or mental routine for a given stimulus. For example, if we are looking at an object, which feature is worth noticing: the overall shape, some distinctive part, or some other attribute like color or location? Each of these features is non-accidental, so we have to make a choice between equals. It might well be possible to explain indexing using quasi-modular processes. In a large, interacting network of quasi-modular processes, it is still computationally feasible to access the "correct" frame because of the way the architecture is constrained 10.

(2) A good explanation is not simply the best inference from the set of non-accidental features. In particular, a good explanation is not always causally related to the non-accidental features it explains. There is a history, starting with Helmholtz, of thinking about perception as a process of unconscious inference. In recent times, the role of inference has been stressed by many authors, e.g., Irvin Rock. Inferential processes are chains of counterfactual reasoning driven by non-accidental features: there is a causal relationship between an unconscious inference and the non-accidental feature, with the non-accidental feature acting as the cause. Explanatory mechanisms, by contrast, do not have to stand in any causal relationship with their key features. For example, an explanatory mechanism could be a generative procedure driven by internal constraints that is merely triggered, or indexed, by a non-accidental feature; the generative procedure could then be largely independent of any non-accidental features. As a result of this relative independence, a good explanation is more robust than a good inference: it is not causally tied to any particular set of non-accidental features, and it does not break down when some non-accidental features give rise to contradictory inferences. To give an example, let us take two explanations of planetary motion.

  1. Every planet traces an ellipse as it revolves around the sun, in such a way that the area swept out in a given interval of time is the same, independent of the position of the planet in its orbit (Kepler's first and second laws).

  2. The motion of the planets around the sun, among other things, is governed by the law of universal gravitation.

The first is an example of inference from observational data, while the second is not 11. In fact, the second is a classic case of indexing into the right frame for explaining a problem while ignoring a lot of conflicting data. To go back to the notion of indexing: if the architecture of quasi-modular processes solves the problem of indexing into the right frame, there is no need to rely heavily on inference any more. However, in order to index into a dynamic procedure, it is important that the environment of the perceiver not be impoverished. In the absence of robust non-accidental features in the environment, the mental system may well index into an inferential mode, since that is the safest strategy. When robust non-accidental features are available, the mental system switches on a dynamic procedure, because explanation is always better than inference in the presence of robust data 12.
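The equal-area regularity that Kepler inferred from observational data can be verified in a short numerical sketch. The simulation below is my own illustration, with arbitrary toy parameters (time step, initial velocity) and units chosen so that GM = 1:

```python
# Toy check of the equal-area law: integrate a planet around the sun
# with symplectic Euler (units with GM = 1; all parameters arbitrary).
GM = 1.0
dt = 0.001
x, y = 1.0, 0.0     # initial position
vx, vy = 0.0, 1.2   # initial velocity -> a bound, elliptical orbit

def step(x, y, vx, vy):
    r3 = (x * x + y * y) ** 1.5
    vx -= GM * x / r3 * dt   # kick: inverse-square attraction
    vy -= GM * y / r3 * dt
    x += vx * dt             # drift with the updated velocity
    y += vy * dt
    return x, y, vx, vy

# Area swept in successive equal time windows, accumulated as
# triangle areas |r x dr| / 2.
areas = []
for window in range(5):
    swept = 0.0
    for _ in range(2000):
        nx, ny, vx, vy = step(x, y, vx, vy)
        swept += abs(x * ny - y * nx) / 2.0
        x, y = nx, ny
    areas.append(swept)

spread = (max(areas) - min(areas)) / areas[0]
print(spread < 1e-9)  # True: equal areas in equal times
```

Because this integrator conserves angular momentum, the areas swept in equal time windows agree to within rounding error, which is exactly the regularity Kepler read off the data; the law of gravitation, by contrast, is not computed from the samples at all.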

3. Evidence for the dynamical approach. The dynamical approach makes four strong claims about the structure of the perceiver and the world. They are:

  1. The world as well as the mental system consist of a collection of interacting quasi-modular processes.

  2. The architecture of world processes is quite similar to the architecture of mental processes. Both are triples, (F, P, L), of frame, dynamic procedure and feature lattice respectively. Consequently, there is no principled distinction between mental and world processes.
  3. The main difference between mental processes and world processes is the relationship between the dynamic procedure and the feature lattice. While world processes leave non-accidental features as causal traces, mental processes “explain” non-accidental features.
  4. At any given time, there are many processes, both world and mental, that are active. Pairs of processes can interact only at an interface, which is a map between the feature lattices of the two processes.

Claim 2 is the strongest of the four and the one to which I cannot do much justice in this piece. Definitive evidence for it can only come from showing the power and elegance of QM processes as an explanatory framework. At a descriptive level, claim 2 holds for every mental sub-system. Every mental process that we know of is highly robust and tied to a small set of non-accidental features, while remaining context sensitive and interacting with a host of other mental processes. This is true of the systems studied under the labels of depth perception, motion detection, object recognition and shape representation. Similarly, all world processes that impinge upon the perceiver are quite local, with a well-defined generative procedure. Statistical, geometric and logical principles have proved useful in all aspects of computational modeling of mental systems, and they share the same structure 13. The big question is whether the claim is true at a deeper level, i.e., whether there are common computational principles that apply to many if not all quasi-modular processes. That is a question that cannot be solved in a book, let alone one paper 14. Evidence for the other three claims is easier to come by, and I have gathered it into three subsections, one for each claim.

(1) It is pretty clear that every object in the world is the outcome of at least one process and participates in many others. In itself, this fact may not mean much; the crucial observation is that the regularities (of objects etc.) in the world are largely an emergent property of the processes that shape them. Whether the objects be rocks in the middle of a stream, clouds in the sky or trees in a forest, each has a characteristic shape that is entirely due to the process that caused it to come into being. One can come up with any number of other examples, all indicating that regularities in the world bear a strong imprint of the processes that cause them. Equally important for our purposes, these processes are all abstract, computational processes. The details of fluid mechanics are not important in determining the shape of a rock in a stream: a computer simulation of a fluid that preserves only a few of the physical properties of water produces rocks of the same shape. Fractal modeling produces shapes that are remarkably like real-world clouds and mountains. This strongly suggests that the mental environment, and not just the perceiver, consists of processes that can be modeled at an abstract level.
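The point about fractal modeling can be made concrete with one-dimensional midpoint displacement, the simplest fractal terrain algorithm. The recursion below is a standard textbook sketch, not anything specific to this essay, and the depth and amplitude values are arbitrary:

```python
import random

# One-dimensional midpoint displacement: recursively perturb the
# midpoint of each segment with a shrinking random offset. Depth and
# amplitude are arbitrary toy values.
def midpoint_displace(left, right, depth, amplitude=1.0, rng=random):
    if depth == 0:
        return [left, right]
    mid = (left + right) / 2 + rng.uniform(-amplitude, amplitude)
    first = midpoint_displace(left, mid, depth - 1, amplitude / 2, rng)
    second = midpoint_displace(mid, right, depth - 1, amplitude / 2, rng)
    return first + second[1:]  # drop the duplicated midpoint

random.seed(0)  # fixed seed so the sketch is reproducible
profile = midpoint_displace(0.0, 0.0, depth=8)
print(len(profile))  # 257 heights: 2^8 + 1 sample points
```

A process this abstract knows nothing about geology, yet its output reads as a mountain ridge line, which is the sense in which world regularities can be modeled at a purely computational level.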

Similarly, even the simplest percepts are part of a dynamical explanation, not a representation-based one. One might think that perception is an explanatory process only at the higher levels, where it overlaps with higher cognitive processes in general. Yet it seems to me that even the simplest acts of perception involve some kind of dynamical explanation. The examples below illustrate my point.



Figure 1

In each pair, upon inspection, it is clear that the two shapes are the same. Yet, mentally, they seem quite dissimilar. The most parsimonious explanation for the difference is that our visual system imposes a coordinate frame on the stimuli. The coordinate frame switches on a different explanatory process each time, and that leads to a different perception of form in each case. If indeed that is the case, two conclusions follow:

(i) The vertical coordinate frame is not a representation of some property of the stimulus, nor a simple inference from it. After all, the stimuli above vary drastically in their features and properties, so any process that leads to the inference of a vertical orientation has to be a complex process.

(ii) Our percept is an outcome of a process that involves the coordinate frame and the non-accidental features of the stimuli.

Similarly, consider the example below. Triangle A is nested inside triangle B. Figures 2b, 2c and 2d provide three examples of transforming triangle A into triangle B. Most observers find the transformation in 2b more natural than the ones in 2c or 2d. In 2e-2g, the same transformations are applied to a set of nested curves. In this case, there seems to be no clear choice of a natural transformation.

Figure 2

Examples like the one illustrated in figure 2 show a couple of things. First, we have a repertoire of transformations in our visual system. Second, in the presence of key features (in this case, the vertices of a triangle), some transformations are more natural, i.e., some transformations are better explanations than others.

(3) It is quite clear that regularities in the world are traces of world processes. Whether it be a tree, a cloud, a chair or a building, every world object is the end point of a causal process. What is more important is that these processes can be modeled computationally using relatively simple universal rules. There is no need to get into the details of design, but it is true that the design of any object in the real world can be replicated on a computer screen; the graphics industry depends on this fact. Of course, that is not to say that the physical process itself was replicated. What concerns us are the constraints at the level of design, not the actual process that was used to construct the object. In some cases the two may be the same, e.g., the physics of sand dunes can be faithfully modeled in a computer simulation, but an isomorphism between the design level and the physics is not necessary. Consider, for example, automobile construction. Constructing a car is a process, but so is designing a car. The design process is not isomorphic to the construction process, as the two have different constraints and causal trajectories. As it happens, both processes end up producing the same item, but that should not obscure the fact that it is the process constraints that largely regulate the end product and not the other way around 15. The study of QM processes at a computational level is the study of the principles of biological design, albeit at an abstract level.

Similarly, there are numerous examples showing that perception is an explanation of a few non-accidental features in the stimulus and is not directly caused by the stimulus. Our intuition as perceivers seems to indicate that we represent the world in all its richness, so it comes as a surprise that our representations are actually quite poor. It has been shown time and again that we neglect massive changes in the world, even when they happen in front of our eyes (Rensink). The best explanation for the poverty of our representations is that perception consists of dynamic procedures that explain a few key features while filtering out everything else. Quick, process-driven explanation of non-accidental features seems to be the norm rather than the exception. Furthermore, computational capacity seems to be irrelevant. Consider a situation where images are created using the LOGO program:

Figure 3

Most people classify the pictures into several different categories. It comes as a surprise that all of the pictures were generated using the same rule: take a line and rotate it at a fixed angle n times, where n is an arbitrary integer. Human observers are not able to use the underlying regularity to decode the generative process or to classify the various pictures as belonging to one class, even though the task is not computationally intensive. Why is that so? After all, the probability that n angles chosen at random are all equal is much smaller than that of any other regularity present in the pictures. Nevertheless, we neglect this highly non-accidental feature in favor of others. In this case, perception is guided by internal processes tied to non-accidental features that are not necessarily the most important "world statistics".
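The generative rule described above, rotating a line through a fixed angle n times, can be sketched in a few lines of LOGO-style turtle code (written here in Python; the specific angles are hypothetical examples, since the actual figure 3 stimuli are not reproduced):

```python
import math

# The rule behind figure 3 as described in the text: take a line
# segment and rotate it by a fixed angle n times, chaining the
# segments end to end. The specific angles are hypothetical examples.
def rotated_line_figure(angle_deg, n, length=1.0):
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for _ in range(n):
        x += length * math.cos(math.radians(heading))
        y += length * math.sin(math.radians(heading))
        points.append((x, y))
        heading += angle_deg
    return points

# One rule, visually very different figures:
square = rotated_line_figure(90, 4)       # closes into a square
star = rotated_line_figure(144, 5)        # closes into a five-pointed star
open_curve = rotated_line_figure(89, 50)  # near-miss angle never closes

print(math.hypot(*square[-1]) < 1e-9)     # True: back at the start
print(math.hypot(*open_curve[-1]) > 0.5)  # True: far from the start
```

A single fixed-angle rule yields closed polygons, stars and sprawling open curves, which is why observers sort the outputs into several categories instead of recovering the one regularity behind them all.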

(4) The first part of the claim is trivial, since it is obviously the case that in a natural environment there are many different processes going on in the world as well as in our heads. The second part can be divided into two halves: interaction takes place only along interfaces, and the interaction is a map between feature lattices. The first half is the usual argument for modularity; since this topic has been discussed quite intensively in the literature, there is no point in discussing it further here.

All the novelty in claim 4 is in the second half. It is well known that non-accidental features in the image map onto non-accidental features in the distal stimulus: T-junctions map onto occlusions, minima of negative curvature map onto part boundaries, and so on. However, the really interesting examples are maps between two mental processes. For example, consider the interface between the linguistic system and the visual system that is involved in reading written directions and then looking at a map to find the way, something most of us have done at some point in our lives. A typical set of written directions might say: "Go straight on road X for about 2 miles, take a right at the fourth traffic light and then take a left at the next traffic light." Most of us, when reading an instruction like this, find road X on the map, immediately start counting the intersections till we reach the fourth one, and then jump to the next light on the road to the right. On the map itself, intersections are always sharp discontinuities, usually right angles. In both cases, we are looking for non-accidental features. In the linguistic world they correspond to actions (turn left, turn right) and in the visual world they are sharp discontinuities in the map. We do not bother with the stretch of road in the middle, whether we are reading directions or looking at straight pieces of road on the map, because it does not give us any useful information, i.e., it is a generic feature. This is even more striking when you realize that written directions rarely contain extraneous information like "The 2-mile stretch of road X has this beautiful house on the right. Do not forget to look at it." Directions are designed so that linguistic non-accidental features directly map onto visual non-accidental features. An obvious question is: "How do we know whether a feature is non-accidental or not? For all you know, the notion of non-accidental is true a posteriori, i.e., a linguistic term is non-accidental if it is mapped onto a visual feature." What is really impressive is that actions are non-accidental in a frame that is intrinsic to language, while right angles are non-accidental in a frame intrinsic to vision. Therefore, the domain and the range of the linguistic-visual map are both well defined. No chicken-and-egg problem arises here.

4. Consequences of the dynamical approach. In this section, I use the dynamical approach to address two well-known debates in cognitive science, namely, the contribution of innate knowledge and the role of representation. In the first case, the results are interesting but not surprising, while in the second, I believe the approach leads to a wholesale reevaluation of the importance of representation. Therefore, much more space is devoted to the second topic.

(1) The role of innate knowledge. If the four claims at the beginning of section 3 are true, then the common architecture of QM processes imposes severe constraints on individual mental processes as well as on the perceiver-world system as a whole. In this sense, most of the structure is built into the system. However, the role of environmental input and learning during the lifetime of the organism is not to be minimized. Learning enters the picture in two different ways. First, the correspondence between the feature lattices of two distinct QM processes is not determined beforehand. It has to be learnt by the perceiver and is clearly tied to the intricacies of the mental environment. Since the environment may be very different for two individuals selected at random, the mapping between the same QM processes in the two individuals can be quite different. Second, each individual has to solve the indexing problem to his or her satisfaction. A quasi-module comes with a set of switches, and an individual perceiver has to decide which frame is important in a given task, which in turn determines the switches that are turned on. The individual organism also has to order the different non-accidental features in a preference lattice. Furthermore, learning can result in a qualitative leap in performance, because quasi-modules connected by a robust interface can be chained together to perform more complicated tasks, while weak interfaces will fail on these tasks. In this sense, in the dynamical approach, innate structure and learning operate at two different levels: the overall structure at the design level is largely dictated by common architectural constraints, while real-time performance is molded by experience, sometimes strikingly so.

(2) The importance of representation. The term "representation" is ubiquitous in cognitive science. There is no generally accepted definition of the term, but all representational theories make the following four assumptions about the relationship between a perceiver and his environment.

  1. The world, W (the "distal stimulus"), is a collection of objects. Objects have properties and are related to each other both spatially and causally.

  2. The perceiver has access only to a projection of the world 16, called the image or proximal stimulus, P.

  3. The mental system of the perceiver consists of an internal representation, R. R is related to W in an explicit manner by means of a correspondence F: W → R, which allows the perceiver to make explicit certain properties of the world W. The goal of the mental scientist, apart from any questions about the biological substrate of R, is to answer the questions "What is being represented by R?" and "What is the structure of R?"

  4. Representation is the primary goal of perception, i.e., the goal of the mental system is to use the proximal stimulus, P, to achieve a veridical representation, R, of W. Furthermore, the science of perception is the study of R and its relationship to W.

CR theories can be broadly classified into two types:

(a) Inverse optics (Marr, 1980). The goal of the perceiver is to invert the process that led to the creation of the proximal stimulus from the distal stimulus. Since this is an underdetermined problem, the perceiver imposes additional constraints in order to invert the image. In this scenario, inversion is unavoidably a process, and its end goal is to recreate the distal stimulus. Nevertheless, it is possible to separate the goals of the perceiver from the process that leads to them. For example, Marr separated the two by calling the study of the goals the "computational level", while the process belonged to the "representational-algorithmic" level.

(b) Similarity-based methods (Edelman, 1998). Inverse optics turned out to be harder than anyone could have imagined in the late 70's. As a consequence, mental scientists turned towards simpler methods of representation. The goal of the perceiver was no longer to invert the image but rather to represent the relations/similarities between objects in the world as distances between points in an abstract similarity space. The representation was to be effected in such a way that there was an isomorphism between the distal relations being represented and the geometry of the similarity space. Note that the similarity-based method is inherently weaker than inverse optics, in the sense that the process by which the similarities are represented is relegated to a neural mechanism. Of course, being "weaker" in this sense made these theories computationally tractable, and along with powerful new algorithms, they have led to advances in computer vision, psychophysics and neurobiology.
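The similarity-space idea can be sketched concretely: objects become points, and only the distances between points carry representational content. Everything below — the objects, coordinates and distance function — is a hypothetical toy of my own, not Edelman's actual model:

```python
import math
from itertools import combinations

# A toy similarity space: objects are just points, and only the
# distances between points carry representational content. The
# objects and coordinates are hand-picked hypothetical values.
similarity_space = {
    "sparrow": (0.0, 0.0),
    "robin":   (0.3, 0.1),  # close to sparrow: very similar birds
    "bat":     (1.5, 0.4),  # farther out: superficially bird-like
    "whale":   (4.0, 3.0),  # far from all of the above
}

def dist(a, b):
    (ax, ay), (bx, by) = similarity_space[a], similarity_space[b]
    return math.hypot(ax - bx, ay - by)

# The representational claim is an isomorphism between distal
# similarity relations and the geometry of the space; here the rank
# order of pairwise distances stands in for that isomorphism.
pairs = sorted(combinations(similarity_space, 2), key=lambda p: dist(*p))
print(pairs[0])  # the most similar pair
```

Note that nothing here attempts to invert an image: the scheme only mirrors relations between objects, which is exactly the sense in which similarity-based methods are weaker, and more tractable, than inverse optics.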

Is there a good reason to be so sanguine about the prospects of representation? I believe not. Given the richness of our sensory experience and the numerous ways in which the senses confirm each other, it seems quite obvious that the goal of perception is to uncover the structure of the world. Nevertheless, this is a mistaken view of perception. The dynamical approach shows that representation of the external world is only a secondary aspect of perception. The primary purpose of perception is not representation but explanation.

The argument against representation consists of two main points, as follows:

(a) The world is process driven. Objects in the world are secondary at best and are sometimes very different from what we think they are. The first claim has already been addressed in section 3. As for the second, the existence of natural processes that are multi-scale (like fractals) shows that the form/shape of world objects is often nothing like the piece-wise smooth bounded surfaces that we experience mentally. The fact that many (if not most) objects in the natural world are multi-scale is quite illuminating 17. If representation were primary, it would be surprising that we do not represent a highly robust statistic: that the "true" shape of real-world objects is a multi-scale distribution. A consequence of shape being multi-scale is that the assumption of piece-wise smooth surfaces is completely wrong; yet we see multi-scale objects as piece-wise smooth objects plus some texture (Gilden).

Interestingly, there seems to be a clean break between human constructions (buildings, cars, chairs, tables etc.) and natural-world objects. Human constructions are invariably piece-wise smooth bounded surfaces, while natural-world objects are often multi-scale. If indeed we shape our environment so that it conforms to our modes of perception, the smooth geometry of human constructions is a consequence of the structure of our minds and not a reflection of natural-world statistics.

(b) Explanation always takes precedence over representation. As a result, representation can be surprisingly hard, even when it should be easy. This follows from the argument in section 3 for the non-inferential nature of mental processes. After all, why should the mental system ignore a robust statistic in favor of other, less non-accidental features? The only plausible reason is that mental processes are highly selective in the features they explain, and the features that end up being explained are the ones selected by an indexed frame. As soon as primacy is ceded to an internal frame, the mental process is driven by the constraints of the frame, not of the world. This is also borne out by the examples in figure 1: the "world" non-accidental features lead to the inference that the objects in each pair are the same, yet the vertical frame imposed from within prevents the "correct" percept from being formed.

To summarize, representation is secondary and only acts as an interface between mental processes and world processes. Consequently, an exact match between the world and the perceiver depends on mental processes selecting non-accidental features in a way that reflects the statistics of the world, i.e., the structure lattice. However, the structure and preference lattices are often not isomorphic. Nevertheless, perception is a highly robust process, because it is always an explanation of non-accidental features even when the importance given to a non-accidental feature does not reflect the statistics of the world itself. The relationship between the perceiver and the world is captured by the following diagram.


Figure 4

5. Conclusion. In this essay, I have argued that the basic unit of the perceiver-world system is not a static representation but a dynamic, quasi-modular process. At the heart of a quasi-modular process lies a dynamical procedure that is strongly tied to the non-accidental features of the accompanying spatio-temporal frame. Furthermore, the architecture of world processes is the same as that of mental processes; as a result, the perceiver-world distinction becomes a taxonomic one without any metaphysical implications. The dynamical approach allows us to understand mental systems and the world for what they are: richly endowed structures that are inherently computational. One consequence of this richness is that true representation is no longer a necessity, making representation a secondary goal for the mental system as well as decreasing its importance as an object of study for the natural scientist. I believe this opens the way for a theoretical psychology that is truly biological and that illustrates the central role that computation plays in biology.

1 A well-known example of different explanatory levels is David Marr's three levels of analysis, namely, computation, algorithm and implementation. It is not clear that the correct levels are the ones Marr talked about; in fact, in this paper we argue for collapsing the computational and algorithmic levels into a single level that we call "natural structure". However, there has to be some division of the various structures into levels.

2 Traditionally, this question has been posed in the context of the nature-nurture debate. Interestingly enough, both nativists and empiricists agree that the answer to the question is "No", though for different reasons. The modern empiricist (Churchland-Sejnowski, Crick) thinks that the problem of horizontal unification is a subset of the problem of vertical unification and that all answers will be couched in biological terms. The nativist answer (Fodor) is based on the modularity of the mind, which assumes that each module has its own proprietary computations. In our opinion, the nativist as well as the empiricist positions are mistaken. Instead, we argue for a strong form of horizontal unification, in which the mind and the world are highly structured entities and information/computation is central to the study of the mind-world relationship.


3 The term scale is not to be confused with actual physical scale; it is an abstract variable. For example, in the cognitive domain, information processing is a scale, i.e., there is a set of models for which all the variables, generative processes and laws are computational.

4 There are many people who believe that at the computational level, the world and the organism are pretty simple, e.g., minimum description length (MDL) principles are implicitly based on this assumption.

5 There is a clear link between the strength of computational processes and their metaphysical status. A convincing argument can be made that weak computational processes are by-products of more fundamental biological processes. On the other hand, if the computational principles are quite strong, they cannot be explained away so easily, especially when there is no clear correspondence between current neural principles and strong computational principles. The existence of strong computational processes is a vindication of non-reductionism, whether environmental or neural.

6 Spatio-temporal does not mean space-time as studied by physicists but rather any structure that supports spatial and causal processes. In particular, space-time is a spatio-temporal framework that is useful in the study of physics.

7 That is to say, no genuine distinction is implied in making this separation.

8 Only those features that are made explicit in a given frame are allowed to enter into relationships, whether they be generic or non-accidental.

9 Note that we need only 10 QM processes with 2 switches each to take care of 2^10 = 1024 alternatives. However, having multiple switches per quasi-module works only if there is an efficient way of turning on the right switch at the right time. Some recent work shows that this is feasible in an interacting network that is only slightly non-modular; see, for example, Kasturirangan, R., Multiple Scales in Small World Networks, MIT AI Memo.

10 Kasturirangan, R., Multiple Scales in Small World Networks, MIT AI Memo.

11 Historically, for this reason, the explanatory leap on Newton’s part led to all sorts of controversies over action at a distance.


12 This is a situation that is normal in scientific research. There is no point in making strong theoretical claims in the absence of replicable data, because the data could turn out to be completely wrong. On the other hand, if the empirical data is replicable, it is better to use research time incorporating the available data in a strong theoretical framework than to spend it gathering more data.


13 See, for example, General Pattern Theory by U. Grenander for one attempt at a unification of statistical and geometric ideas.

14 A major hurdle has been the reluctance on the part of theorists to believe that the mind-world problem lies squarely within the scope of the natural sciences. Quite possibly, there are new principles to be discovered here that are as counter-intuitive as principles in any other science. The mind has always struck people as an object that they have direct access to as sentient beings. We need to drop that assumption and treat the computational/mental domain as an aspect of the natural world that needs to be studied with the same level of skepticism and rationality as the other sciences. When these criteria are applied, notions like the Turing test, general intelligence or neural implementation become very questionable as benchmarks for the study of the mind. However, there is no reason why we should not see deep regularities in computational systems as we have seen in all other aspects of the natural world.

15 There is an obvious objection to this claim, i.e., that it is consumer demand for the end product -cars- that drives design and production. That is true, however, it is also a misrepresentation of my argument. First of all, both the design and construction processes are largely independent of consumer demand. Secondly, consumers get to choose between an array of finished products. They do not dictate how cars are designed, which is mostly a function of the laws of physics and other constraints coming from their use. This is precisely a case where a key feature (consumer demand) triggers but does not cause a process to be set into motion.

16 I am using the word projection metaphorically. All that is meant is that the information that is available to the perceiver is not the same as the world itself.

17 This fact has been borne out repeatedly and is reflected both in the statistics of the world and the statistics of images. See Mandelbrot, Mumford, Gilden etc. Note that the lack of multi-scale object representation is not a question of mental acuity. Natural objects are multi-scale even when a high-frequency cut-off is imposed. Furthermore, texture is represented at multiple scales; I wonder if texture is exactly that part of the world that is represented at many scales.