Why do Stabilized Images Disappear?
by Tom N. Cornsweet
Abstract - The popular idea that
the disappearance of stabilized images is a consequence of local
adaptation-like processes leads to incorrect predictions about the
appearance of simple targets. An alternative, that signals pass
through a gate or valve that opens only during changes in local
illuminance, is presented.
During normal viewing, the eye movements that continuously
occur produce continuous motion of the retinal image with respect
to the retina. Various techniques have been developed to prevent
this motion and to produce what is called a "stabilized"
retinal image. Targets whose retinal images are stabilized rapidly
disappear (1,2).
Motion of the image across the retina produces changes
in the illumination of receptors wherever the image is non-uniform.
Further, flickering a stabilized image will prevent its disappearance
(3). Therefore, it seems reasonable to attribute disappearance to
a process that mimics capacitance coupling. That is, a processing
stage is hypothesized whose output is related to the rate of change
of its input at each point. A slightly different but closely related
mechanism hypothesized to explain disappearance is that of dark
and light adaptation. The brighter regions of the retina under a
stabilized image light-adapt, that is, become less sensitive, while
the darker regions dark-adapt, that is, become more sensitive. Under
this hypothesis, this process proceeds until the differences in
outputs among regions of differing illuminance drop below threshold
and the pattern disappears. This general view seems to be held almost
universally. In the following, I will argue that this idea cannot
be correct.
Suppose you are presented with a stabilized image
of the white square in Figure 1a. The square will rapidly fade and
disappear. That is, the computer screen will look as though the
square isn't there. Now if the image of the square is deliberately
shifted across your retina by a very small amount, the entire square
will reappear and look relatively uniform even though only small
regions at its edges experienced changes in illumination. This is,
of course, what happens in normal vision; if the image is not stabilized
and you fixate its center as steadily as you can, your eye movements
continuously shift the image through distances that are small compared
with the width of the target, and yet the target appears uniform.
(While you try to fixate steadily, your eye movements have a roughly normal
distribution with a standard deviation of about 7 minutes of arc
(4). If your eyes are ten inches from a 15 inch (diagonal) screen,
the square in Figure 1 subtends an angle of about 10 degrees.)
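The sizes quoted in the parenthesis above can be checked with a little trigonometry. The following is only a rough sketch: it assumes a flat screen viewed head-on, and the helper name `subtended_size` is made up for this illustration.

```python
import math

def subtended_size(angle_deg, distance):
    """Linear size on a flat screen that subtends angle_deg at the given distance."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)

viewing_distance_in = 10.0  # viewing distance quoted in the text, in inches

# The square subtends about 10 degrees.
square_in = subtended_size(10.0, viewing_distance_in)

# One standard deviation of fixation eye movements is about 7 minutes of arc.
jitter_in = subtended_size(7.0 / 60.0, viewing_distance_in)

ratio = square_in / jitter_in

print(f"square width on screen: {square_in:.2f} in")
print(f"one SD of eye movement on screen: {jitter_in:.4f} in")
print(f"square width / SD of eye movement: {ratio:.0f}")
```

Under these assumptions the square is a little under two inches wide, while one standard deviation of eye movement corresponds to about two hundredths of an inch, so the square is on the order of 85 times wider than the typical image displacement.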

To put it another way, receptors under the image
of the center of the square receive no information about whether
or not the square is stabilized. Nevertheless, the appearance of
the center of the square changes drastically depending upon whether
it is stabilized or not. Somehow, the appearance of the center depends
exclusively on processes that occur at the edges. (The hue of a
target similarly depends only on its edges (5).)
The process or processes that account for this phenomenon
are an example of processes called "filling-in". There
has been much philosophical debate about whether or not an explanation
of filling-in requires a physiological process that actually fills
in, but that issue will not be addressed here. What I will show,
instead, is that if the early processing were to pass only signals
proportional to the rate of change of input at each point, or if
light and dark adaptation were sufficient to explain the disappearance
of stabilized images, then the subsequent stages of the visual system
would not have the information required for filling-in and normal
seeing.
Figure 2 shows the change in local retinal illumination
that occurs when an otherwise stabilized image is suddenly shifted
a small amount to the right and to the left, as might occur during
normal eye movements. Note that the change in local illuminance
is identical for a dark disk moving to the right and a light disk
moving to the left. So if the early stages in the visual system
did indeed pass signals related only to the amount of change in
local illumination, the two disks would have to be indistinguishable.
In general, an edge from dark to light moving
in one direction will always produce changes in local illumination that
are identical with those produced by an edge from light to dark moving
in the opposite direction. Since targets containing edges are in fact distinguishable,
we must conclude that the signals emerging from the early processing
stages cannot contain only information related to the amount of
change of local illumination.
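The ambiguity can be illustrated numerically in one dimension. The sketch below is not taken from Figure 2; the array length, bar position and shift are hypothetical values, and a "bar" stands in for a disk. A light bar displaced to the right and a dark bar displaced to the left (starting where the light bar ends up) produce exactly the same pattern of local changes, although the images themselves are complements of each other.

```python
def bar(n, start, stop, inside, outside):
    """1-D illuminance profile: `inside` on [start, stop), `outside` elsewhere."""
    return [inside if start <= i < stop else outside for i in range(n)]

# Hypothetical sizes, in units of one receptor.
N, A, B, SHIFT = 100, 40, 60, 3

# Light bar on a dark field, displaced to the right by SHIFT receptors.
light_before = bar(N, A, B, 1.0, 0.0)
light_after = bar(N, A + SHIFT, B + SHIFT, 1.0, 0.0)

# Dark bar on a light field, displaced to the left by SHIFT receptors.
dark_before = bar(N, A + SHIFT, B + SHIFT, 0.0, 1.0)
dark_after = bar(N, A, B, 0.0, 1.0)

# Change in local illuminance at each receptor.
change_light = [after - before for before, after in zip(light_before, light_after)]
change_dark = [after - before for before, after in zip(dark_before, dark_after)]

print("change patterns identical:", change_light == change_dark)
print("final images identical:", light_after == dark_after)
```

A stage that reported only `change_light` or `change_dark` would give later stages no way to tell the two targets apart.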

In principle, it is possible that signals such as
those in Figure 2 could be combined with other signals indicating
the direction and velocity of eye movements to generate the appropriate
perception. For example, signals fed back from eye muscles or from
eye movement control centers, or signals derived from retinal motion
detectors, could be used to resolve the ambiguities in Figure 2.
However, an alternative model will be presented here that, when
combined with the filling-in mechanism described in the paper that
accompanies this one (6), seems to offer a far simpler explanation.
(That doesn't mean that the simpler one is correct, of course.)
It seems likely from a wide variety of evidence that
a form of spatial band-pass filtering occurs early in the visual
process. This might be a consequence of lateral inhibition, or Intensity-Dependent
Spread (7), and is demonstrated psychophysically by the fact that
the spatial contrast sensitivity function has a band-pass shape.
In particular, sensitivity approaches zero as spatial frequency
approaches zero. Such a process causes all uniform regions in the
input to have the same level in the output. Only information related
to gradients, e.g. edges, appears at the output. Thus when the input
is the square in Figure 1a, the output has the profile plotted
in Figure 1b.
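A minimal numerical sketch of this property follows. It does not reproduce Figure 1b or any particular physiological filter: a crude zero-sum center-surround kernel (the function `center_surround` and its `radius` are assumptions made for this illustration) stands in for the band-pass stage, sharing with it only the property of zero response at zero spatial frequency.

```python
def center_surround(signal, radius=3):
    """Each output is the center value minus the mean of its 2*radius neighbours.

    The weights sum to zero, so any uniform region maps to zero output.
    Samples beyond the ends of the array are taken to equal the end values.
    """
    n = len(signal)
    out = []
    for i in range(n):
        surround = 0.0
        for k in range(1, radius + 1):
            surround += signal[max(i - k, 0)] + signal[min(i + k, n - 1)]
        out.append(signal[i] - surround / (2 * radius))
    return out

# One row through a light square (receptors 40..59) on a dark background.
square = [1.0 if 40 <= i < 60 else 0.0 for i in range(100)]
response = center_surround(square)

print("background:", response[10])
print("center of square:", response[50])
print("just outside left edge:", response[39])
print("just inside left edge:", response[40])
```

The background and the center of the square produce the same output level, and the output departs from that level only within a few receptors of each edge, negative on the dark side and positive on the light side.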
This process seems entirely consistent with the observation
noted above that the appearance of a target seems to depend only
on actions at its edges.
If the band-pass stage were followed by a stage whose
output depended only on the amount or rate of change of its input,
once again a dark-to-light edge moving in one direction would generate
an output that is identical to that for a light-to-dark edge moving
in the opposite direction.
Suppose, instead, that the output of each element
in the band-pass stage is input to a corresponding element in a
stage that acts as a gate or valve with the following action. If
there is no change in input, the gate is closed. If the input changes,
the gate opens (passes the input signal on) for a brief period and
then closes again. With that arrangement, stabilized images would
disappear but eye movements would allow signals to pass through
to the next, possibly filling-in, stage.
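The action of such a gate can be sketched in a few lines. This is a purely functional illustration, not the neural circuit itself; the function name `gate`, the `hold` duration and the change threshold `eps` are hypothetical. Note that while the gate is open it passes the input level itself, not the size of the change that opened it.

```python
def gate(signal, hold=3, eps=1e-9):
    """Pass signal[t] only for `hold` samples after the input last changed."""
    out = []
    remaining = 0            # samples left before the gate closes again
    previous = signal[0]
    for value in signal:
        if abs(value - previous) > eps:
            remaining = hold  # a change in the input opens the gate
        out.append(value if remaining > 0 else 0.0)
        remaining = max(remaining - 1, 0)
        previous = value
    return out

steady = [0.5] * 20                  # stabilized image: the input never changes
jittered = [0.5] * 8 + [0.8] * 12    # one small image displacement at t = 8

steady_out = gate(steady)
jittered_out = gate(jittered)

print(steady_out)
print(jittered_out)
```

With the steady input the gate never opens and nothing is passed on, which corresponds to disappearance. With the displaced input the new level appears at the output for `hold` samples and then the gate closes again.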
A neural circuit that would perform such a function
is diagrammed in Figure 3. Although this neural circuit is fairly
simple in structure, its action is quite complicated to explain.
It is presented not to imply that it is in fact the circuit to be
found in the nervous system, but only to illustrate that the gating
of neural signals can be performed without invoking more complex
mechanisms than excitation and inhibition. Its action is as follows.

The first layer represents photoreceptors feeding
into a lateral inhibition stage. The output of the central receptor
excites cell #1, which in turn excites cell #2 and inhibits cell
#3. The cells labeled "R" are simply a way of representing
the idea that cells #2 and #3 have non-zero resting levels of activity
when there is no input from cell #1. The arrangement in the second
layer, then, generates center-on and center-off receptive fields.
(The filling-in models of Grossberg and his coworkers require inputs
from both types of receptive fields (8), and in my paper on this
web site titled "A Simple Model for Filling-in, Contrast,
Contrast Constancy and Assimilation", I present an alternative
filling-in model that also requires both. It may be that filling-in
cannot be accomplished unless both center-on and center-off receptive
fields exist.) (Speaking of Grossberg, it is worth noting that his
models assume an input that mimics the retinal light distribution.
He does not address the question of why stabilized images disappear.)
The next stage in the model in Figure 3 performs
a gating action, as follows. Consider the left side first. Signals
from cell #2 excite cell #7. They also follow two paths to cell
#6, a normal path that is inhibitory and a slow path that is excitatory.
(The "slow" pathway might consist of a longer and/or
smaller fiber, one with one or more synapses interposed, or a slower
synaptic transmission mechanism.) When the output of cell #2 is
unchanging, the excitatory and inhibitory inputs from cell #2 to
cell #6 are equal and the output of cell #6 will be its resting
level (from cell R). The output of cell #6 inhibits cell #7 with
a high gain. That is, even a small output from cell #6 will strongly
inhibit cell #7, driving its output to zero regardless of the level
of excitation from cell #2.
Now suppose the illumination falling on the central
photoreceptor increases. This will increase the output from cell
#2, which, at first, will inhibit cell #6, thus disinhibiting cell
#7 and allowing the output from cell #2 to appear at the output
of cell #7. This in turn will cause an increase in the output of
the output cell, #10. However, after a delay due to the slowness
of the excitatory path from cell #2 to cell #6, the inputs to #6
will be balanced again and its resting level will again block cell
#7.
If the illumination on the central photoreceptor
were steady and then decreased, the output from cell #6 would momentarily
increase, maintaining the blocking of cell #7.
The circuit on the right in this layer is identical
with that on the left. If the illumination increases, the output
of cell #3 will decrease, causing a momentary increase in the output
of cell #9, thus maintaining the blocking of cell #8. However, if
the illumination decreases, then the output of cell #3 will increase,
causing a momentary release of the blocking of cell #8, and the
decrease will be signaled as an increase in the output from cell
#8 and, in turn, a decrease in cell #10.
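The behavior described for cells #2 through #10 can be checked with a small discrete-time simulation. This is a sketch under several assumptions that are not in the text: the slow path is modeled as a pure delay of a few time steps, all firing rates are clipped at zero, the resting levels and gain are arbitrary numbers, and cell #10 is represented as the difference between cells #7 and #8 (its deviation from rest). The parameters are chosen so that the gain times the resting level of the blocking cell exceeds the largest drive, and so that the illumination steps are larger than that resting level.

```python
def delayed(seq, t, delay):
    """Value of seq `delay` steps earlier (held at the first sample before that)."""
    return seq[max(t - delay, 0)]

def gated_channel(drive, rest=0.1, gain=100.0, delay=4):
    """One half of the gating layer.

    drive   -> output of cell #2 (or #3)
    blocker -> cell #6 (or #9): resting level, fast inhibition, slow excitation
    gated   -> cell #7 (or #8): excited by drive, strongly inhibited by blocker
    """
    gated = []
    for t in range(len(drive)):
        # The fast and slow inputs cancel when the drive is steady,
        # leaving the blocker at its resting level.
        blocker = max(0.0, rest - drive[t] + delayed(drive, t, delay))
        gated.append(max(0.0, drive[t] - gain * blocker))
    return gated

# Output of cell #1: steady, a step up at t = 10, a step down at t = 30.
cell1 = [0.5] * 10 + [0.8] * 20 + [0.5] * 20
cell2 = [1.0 + x for x in cell1]   # excited by cell #1, resting level 1.0
cell3 = [1.0 - x for x in cell1]   # inhibited by cell #1, resting level 1.0

cell7 = gated_channel(cell2)       # opens briefly after an increase
cell8 = gated_channel(cell3)       # opens briefly after a decrease
cell10 = [on - off for on, off in zip(cell7, cell8)]

for t in (5, 11, 20, 31, 45):
    print(t, round(cell10[t], 3))
```

In this sketch the output is zero while the input is steady, shows a brief positive pulse after the increase, and shows a brief negative pulse after the decrease. The two directions of change are therefore signaled with opposite signs, and each pulse lasts only as long as the assumed delay in the slow path.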
The last stage in the model in Figure 3 represents
the filling-in mechanism described in the accompanying paper on
filling-in (6).

It should be noted that the argument and model above
do not imply that local adaptation-like processes are absent. For
example, differential photochemical adaptation clearly occurs under
a stabilized image. The present argument is simply that this kind
of process is insufficient to explain the disappearance of stabilized
images.
Although local adaptation-like processes certainly
act to reduce the apparent contrast of stabilized images, the assumption
that such processes cause the disappearance of stabilized images
leads to incorrect predictions about the appearances of objects
under normal viewing. The assumption of a gating process permits
correct predictions.
1) Ditchburn, R.W. and Ginsborg, B.L. Vision with
a stabilized retinal image. Nature, 170, 36-37 (1952).
2) Riggs, L.A., Ratliff, F., Cornsweet, J.C. and Cornsweet,
T.N. The disappearance of steadily fixated test objects. J. Opt.
Soc. Amer., 43, 495 (1953).
3) Cornsweet, T.N. Determination of the stimuli for
involuntary drifts and saccadic eye movements. J. Opt. Soc. Amer.,
46, 987-993 (1956).
4) Nachmias, J. Two-dimensional motion of the retinal
image during monocular fixation. J. Opt. Soc. Amer., 49, 901-908
(1959).
5) Krauskopf, J. Effect of retinal image stabilization
on the appearance of heterochromatic targets. J. Opt. Soc. Amer.,
53, 741-744 (1963).
6) Cornsweet, T.N. A simple model for filling-in, contrast,
contrast constancy and assimilation. Vispath.com web pages (2001).
7) Cornsweet, T.N. and Yellott, J.I. Intensity-dependent
spatial summation. J. Opt. Soc. Amer. A, 2, 1769-1786 (1985).
8) Grossberg, S. and Todorovic, D. Neural dynamics
of 1-D and 2-D brightness perception: A unified model of classical
and recent phenomena. Perception and Psychophysics, 43, 241-277
(1988).