Why do Stabilized Images Disappear?

by Tom N. Cornsweet

Abstract - The popular idea that the disappearance of stabilized images is a consequence of local adaptation-like processes leads to incorrect predictions about the appearance of simple targets. An alternative, that signals pass through a gate or valve that opens only during changes in local illuminance, is presented.

Background

During normal viewing, the eye movements that continuously occur produce continuous motion of the retinal image with respect to the retina. Various techniques have been developed to prevent this motion and to produce what is called a "stabilized" retinal image. Targets whose retinal images are stabilized rapidly disappear (1,2).

Motion of the image across the retina produces changes in the illumination of receptors wherever the image is non-uniform. Further, flickering a stabilized image will prevent its disappearance (3). Therefore, it seems reasonable to attribute disappearance to a process that mimics capacitance coupling. That is, a processing stage is hypothesized whose output is related to the rate of change of its input at each point. A slightly different but closely related mechanism hypothesized to explain disappearance is that of dark and light adaptation. The brighter regions of the retina under a stabilized image light adapt, that is, become less sensitive while the darker regions dark adapt, that is, become more sensitive. Under this hypothesis, this process proceeds until the differences in outputs among regions of differing illuminance drop below threshold and the pattern disappears. This general view seems to be held almost universally. In the following, I will argue that this idea cannot be correct.

Suppose you are presented with a stabilized image of the white square in Figure 1a. The square will rapidly fade and disappear. That is, the computer screen will look as though the square isn't there. Now if the image of the square is deliberately shifted across your retina by a very small amount, the entire square will reappear and look relatively uniform even though only small regions at its edges experienced changes in illumination. This is, of course, what happens in normal vision; if the image is not stabilized and you fixate its center as steadily as you can, your eye movements continuously shift the image through distances that are small compared with the width of the target, and yet the target appears uniform. (While you try to fixate steadily, your eye movements have a roughly normal distribution with a standard deviation of about 7 minutes of arc (4). If your eyes are ten inches from a 15 inch (diagonal) screen, the square in Figure 1 subtends an angle of about 10 degrees.)
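To make the scale of these numbers concrete, the following sketch converts the two visual angles quoted above into distances on a screen viewed from ten inches. The function name and the variable names are illustrative assumptions; only the 10 degree angle, the 7 minutes of arc, and the ten inch viewing distance come from the text.

```python
import math

def size_for_visual_angle(angle_deg, distance):
    """Linear size that subtends angle_deg at the given viewing distance."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)

viewing_distance_in = 10.0  # viewing distance quoted in the text, in inches

# Width of a target that subtends 10 degrees at that distance.
square_width_in = size_for_visual_angle(10.0, viewing_distance_in)

# On-screen distance corresponding to 7 minutes of arc (7/60 of a degree).
fixation_sd_in = size_for_visual_angle(7.0 / 60.0, viewing_distance_in)

# How many fixation standard deviations fit across the target.
ratio = square_width_in / fixation_sd_in

print(f"square width: {square_width_in:.2f} in")
print(f"fixation s.d.: {fixation_sd_in:.3f} in")
print(f"width / s.d.: {ratio:.0f}")
```

Under these assumptions the square is roughly 1.75 inches wide while one standard deviation of fixational movement covers about 0.02 inch, so the image shifts are nearly two orders of magnitude smaller than the target.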


Figure 1

To put it another way, receptors under the image of the center of the square receive no information about whether or not the square is stabilized. Nevertheless, the appearance of the center of the square changes drastically depending upon whether it is stabilized or not. Somehow, the appearance of the center depends exclusively on processes that occur at the edges. (The hue of a target similarly depends only on its edges (5).)

The process or processes that account for this phenomenon are an example of processes called "filling-in". There has been much philosophical debate about whether or not an explanation of filling-in requires a physiological process that actually fills in, but that issue will not be addressed here. What I will show, instead, is that if the early processing were to pass only signals proportional to the rate of change of input at each point, or if light and dark adaptation were sufficient to explain the disappearance of stabilized images, then the subsequent stages of the visual system would not have the information required for filling-in and normal seeing.

The Argument

Figure 2 shows the change in local retinal illumination that occurs when an otherwise stabilized image is suddenly shifted a small amount to the right and to the left, as might occur during normal eye movements. Note that the change in local illuminance is identical for a dark disk moving to the right and a light disk moving to the left. So if the early stages in the visual system did indeed pass signals related only to the amount of change in local illumination, the two disks would have to be indistinguishable. In general, it is always true that an edge from dark to light moving in one direction will produce changes in local illumination that are identical with an edge from light to dark moving in the opposite direction. Since targets containing edges are in fact distinguishable, we must conclude that the signals emerging from the early processing stages cannot contain only information related to the amount of change of local illumination.
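The ambiguity described above can be checked numerically with a one-dimensional sketch. The profile sizes, the shift, and the helper name `disk_profile` are assumptions chosen for illustration, and the light disk is assumed to start where the dark disk ends up, so that the two sets of changes fall on the same receptors.

```python
import numpy as np

def disk_profile(n, start, stop, inside, outside):
    """1-D illuminance profile: `inside` on [start, stop), `outside` elsewhere."""
    profile = np.full(n, outside, dtype=float)
    profile[start:stop] = inside
    return profile

n, shift = 100, 2  # hypothetical number of receptors and shift size

# Dark disk on a light field, shifted to the right.
dark_before = disk_profile(n, 40, 60, inside=0.0, outside=1.0)
dark_after = disk_profile(n, 40 + shift, 60 + shift, inside=0.0, outside=1.0)

# Light disk on a dark field, shifted to the left.
light_before = disk_profile(n, 40 + shift, 60 + shift, inside=1.0, outside=0.0)
light_after = disk_profile(n, 40, 60, inside=1.0, outside=0.0)

# Change in local illuminance at every receptor.
change_dark = dark_after - dark_before
change_light = light_after - light_before

print(np.array_equal(change_dark, change_light))  # the changes match
print(np.array_equal(dark_before, light_before))  # the targets do not
```

A stage that reports only `change_dark` or `change_light` therefore cannot tell the two targets apart, even though the targets themselves are complements of each other.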


Figure 2

In principle, it is possible that signals such as those in Figure 2 could be combined with other signals indicating the direction and velocity of eye movements to generate the appropriate perception. For example, signals fed back from eye muscles or from eye movement control centers, or signals derived from retinal motion detectors, could be used to resolve the ambiguities in Figure 2. However, an alternative model will be presented here that, when combined with the filling-in mechanism described in the paper that accompanies this one (6), seems to offer a far simpler explanation. (That doesn't mean that the simpler one is correct, of course.)

An Alternative Model

It seems likely from a wide variety of evidence that a form of spatial band-pass filtering occurs early in the visual process. This might be a consequence of lateral inhibition, or Intensity-Dependent Spread (7), and is demonstrated psychophysically by the fact that the spatial contrast sensitivity function has a band-pass shape. In particular, sensitivity approaches zero as spatial frequency approaches zero. Such a process causes all uniform regions in the input to have the same level in the output. Only information related to gradients, e.g. edges, appears at the output. Thus when the input is the square in Figure 1a, the output has a profile as plotted in Figure 1b.
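As a rough illustration of such a stage, the sketch below filters a one-dimensional square profile with a difference-of-Gaussians kernel whose weights sum to zero. The kernel shape, its widths, and the illuminance values are assumptions made for this example; the text does not specify the form of the filter.

```python
import numpy as np

def dog_kernel(sigma_center=2.0, sigma_surround=6.0, half_width=30):
    """Difference of two unit-sum Gaussians, so the response to a uniform input is zero."""
    x = np.arange(-half_width, half_width + 1)
    center = np.exp(-x**2 / (2.0 * sigma_center**2))
    surround = np.exp(-x**2 / (2.0 * sigma_surround**2))
    return center / center.sum() - surround / surround.sum()

def band_pass(signal, kernel):
    """Convolve, replicating the end values so the array borders add no false edges."""
    half = len(kernel) // 2
    padded = np.pad(signal, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Hypothetical profile: a light square (0.8) on a darker background (0.2).
profile = np.full(400, 0.2)
profile[150:250] = 0.8

out = band_pass(profile, dog_kernel())

print(abs(out[50]) < 1e-9)    # uniform background far from any edge
print(abs(out[200]) < 1e-9)   # uniform interior of the square
print(np.max(np.abs(out)))    # the only sizeable responses sit near the edges
```

In this sketch the background and the interior of the square produce the same (zero) output despite their different illuminances, and the output departs from that level only in the neighborhood of the two edges.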

This process seems entirely consistent with the observation noted above that the appearance of a target seems to depend only on actions at its edges.

If the band-pass stage were followed by a stage whose output depended only on the amount or rate of change of its input, once again a dark-to-light edge moving in one direction would generate an output that is identical to that for a light-to-dark edge moving in the opposite direction.

Suppose, instead, that the output of each element in the band-pass stage is input to a corresponding element in a stage that acts as a gate or valve with the following action. If there is no change in input, the gate is closed. If the input changes, the gate opens (passes the input signal on) for a brief period and then closes again. With that arrangement, stabilized images would disappear but eye movements would allow signals to pass through to the next, possibly filling-in, stage.
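The difference between such a gate and a rate-of-change stage can be sketched in a few lines. The function `gate`, the number of time steps for which it stays open, and the signal values are all assumptions for illustration; the point is only that the gate passes the input level itself, while a change detector passes the size of the change.

```python
import numpy as np

def gate(signal, open_steps=3, tol=1e-9):
    """Pass the input for `open_steps` samples after each change; otherwise output zero."""
    signal = np.asarray(signal, dtype=float)
    out = np.zeros_like(signal)
    remaining = 0
    for t in range(1, len(signal)):
        if abs(signal[t] - signal[t - 1]) > tol:
            remaining = open_steps  # a change (re)opens the gate
        if remaining > 0:
            out[t] = signal[t]      # open: pass the input signal on
            remaining -= 1
    return out

# Two hypothetical band-pass outputs that undergo the same change (+0.5)
# but sit at different levels.
a = np.array([0.25] * 4 + [0.75] * 6)
b = np.array([-0.75] * 4 + [-0.25] * 6)

print(np.allclose(np.diff(a), np.diff(b)))  # a change detector sees no difference
print(gate(a))
print(gate(b))
```

With a steady input both gated outputs are zero, which corresponds to the disappearance of a stabilized image. After the change, the gated outputs differ in sign and size for a few samples, so the following stage receives information that a pure change signal would discard.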

A neural circuit that would perform such a function is diagrammed in Figure 3. Although this neural circuit is fairly simple in structure, its action is quite complicated to explain. It is presented not to imply that it is in fact the circuit to be found in the nervous system, but only to illustrate that the gating of neural signals can be performed without invoking more complex mechanisms than excitation and inhibition. Its action is as follows.


Figure 3

The first layer represents photoreceptors feeding into a lateral inhibition stage. The output of the central receptor excites cell #1, which in turn excites cell #2 and inhibits cell #3. The cells labeled "R" are simply a way of representing the idea that cells #2 and #3 have non-zero resting levels of activity when there is no input from cell #1. The arrangement in the second layer, then, generates center-on and center-off receptive fields. (The filling-in models of Grossberg and his coworkers require inputs from both types of receptive fields (8), and in my paper on this website titled "A Simple Model for Filling-in, Contrast, Contrast Constancy and Assimilation", I present an alternative filling-in model that also requires both. It may be that filling-in cannot be accomplished unless both center-on and center-off receptive fields exist.) (Speaking of Grossberg, it is worth noting that his models assume an input that mimics the retinal light distribution. He does not address the question of why stabilized images disappear.)

The next stage in the model in Figure 3 performs a gating action, as follows. Consider the left side first. Signals from cell #2 excite cell #7. They also follow two paths to cell #6, a normal path that is inhibitory and a slow path that is excitatory. (The "slow" pathway might consist of a longer and/or smaller fiber, one with one or more synapses interposed, or a slower synaptic transmission mechanism.) When the output of cell #2 is unchanging, the excitatory and inhibitory inputs from cell #2 to cell #6 are equal and the output of cell #6 will be its resting level (from cell R). The output of cell #6 inhibits cell #7 with a high gain. That is, even a small output from cell #6 will strongly inhibit cell #7, driving its output to zero regardless of the level of excitation from cell #2.

Now suppose the illumination falling on the central photoreceptor increases. This will increase the output from cell #2, which, at first, will inhibit cell #6, thus disinhibiting cell #7 and allowing the output from cell #2 to appear at the output of cell #7. This in turn will cause an increase in the output of the output cell, #10. However, after a delay due to the slowness of the excitatory path from cell #2 to cell #6, the inputs to #6 will be balanced again and its resting level will again block cell #7.

If the illumination on the central photoreceptor were steady and then decreased, the output from cell #6 would momentarily increase, maintaining the blocking of cell #7.

The circuit on the right in this layer is identical with that on the left. If the illumination increases, the output of cell #3 will decrease, causing a momentary increase in the output of cell #9, thus maintaining the blocking of cell #8. However, if the illumination decreases, then the output of cell #3 will increase, causing a momentary release of the blocking of cell #8, and the decrease will be signaled as an increase in the output from cell #8 and, in turn, a decrease in cell #10.
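The behavior of the gating layer described above can be simulated in discrete time. This is a minimal sketch under several assumptions that are not in the text: the slow path is modeled as a pure delay, cell outputs cannot go below zero, cell #10 is taken as the difference between cells #7 and #8, and the resting levels, gain, and delay are arbitrary values.

```python
import numpy as np

def gated_channel(x, delay=3, rest=0.1, gain=100.0):
    """One side of the gating layer (cells #2, #6, #7 or cells #3, #9, #8)."""
    x = np.asarray(x, dtype=float)
    # Slow excitatory path: the same signal, arriving `delay` samples late.
    slow = np.concatenate([np.full(delay, x[0]), x[:-delay]])
    # Gating cell (#6 or #9): resting level + slow excitation - fast inhibition.
    gate_cell = np.maximum(0.0, rest + slow - x)
    # Relay cell (#7 or #8): excited by x, inhibited with high gain by the gating cell.
    return np.maximum(0.0, x - gain * gate_cell)

def circuit(cell1, rest=1.0):
    """Output of cell #10 for a given time course of cell #1."""
    cell1 = np.asarray(cell1, dtype=float)
    cell2 = np.maximum(0.0, rest + cell1)  # excited by cell #1
    cell3 = np.maximum(0.0, rest - cell1)  # inhibited by cell #1
    cell7 = gated_channel(cell2)
    cell8 = gated_channel(cell3)
    return cell7 - cell8                   # cell #10

# Hypothetical inputs: steady, then a step up or a step down at sample 5.
increase = np.array([0.0] * 5 + [0.5] * 10)
decrease = np.array([0.0] * 5 + [-0.5] * 10)

print(circuit(increase))
print(circuit(decrease))
```

With these values cell #10 is silent while the input is steady, gives a brief positive output after the increase and a brief negative output after the decrease, and then falls silent again once the slow path catches up.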

The last stage in the model in Figure 4 represents the filling-in mechanism described in the accompanying paper on filling-in (6).


Figure 4

It should be noted that the argument and model above do not imply that local adaptation-like processes are absent. For example, differential photochemical adaptation clearly occurs under a stabilized image. The present argument is simply that this kind of process is insufficient to explain the disappearance of stabilized images.

Conclusions

Although local adaptation-like processes certainly act to reduce the apparent contrast of stabilized images, the assumption that such processes cause the disappearance of stabilized images leads to incorrect predictions about the appearances of objects under normal viewing. The assumption of a gating process permits correct predictions.

References

1) Ditchburn, R.W. and Ginsborg, B.L. Vision with a stabilized retinal image. Nature, 170, 36-37 (1952).

2) Riggs, L.A., Ratliff, F., Cornsweet, J.C. and Cornsweet, T.N. The disappearance of steadily fixated test objects. J. Opt. Soc. Amer., 43, 495 (1953).

3) Cornsweet, T.N. Determination of the stimuli for involuntary drifts and saccadic eye movements. J. Opt. Soc. Amer., 46, 987-993 (1956).

4) Nachmias, J. Two-dimensional motion of the retinal image during monocular fixation. J. Opt. Soc. Amer., 49, 901-908 (1959).

5) Krauskopf, J. Effect of retinal image stabilization on the appearance of heterochromatic targets. J. Opt. Soc. Amer., 53, 741-744 (1963).

6) Cornsweet, T.N. A simple model for filling-in, contrast, contrast constancy and assimilation. Vispath.com web pages (2001).

7) Cornsweet, T.N. and Yellott, J.I. Intensity-dependent spatial summation. J. Opt. Soc. Amer. A. 2, 1769-1786 (1985).

8) Grossberg, S. and Todorovic, D. Neural dynamics of 1-D and 2-D brightness perception: A unified model of classical and recent phenomena. Perception and Psychophysics, 43, 241-277 (1988).