Volume 1, Number 2, Article 4, Pages 99-111
doi:10.1167/1.2.4
http://journalofvision.org/1/2/4/
ISSN 1534-7362
Local and global visual grouping: Tuning for spatial frequency and contrast
Steven C. Dakin
Institute of Ophthalmology, University College London, London, United Kingdom
Peter J. Bex
Institute of Ophthalmology, University College London, London, United Kingdom
Abstract
Glass patterns are visual textures composed of a field of dot pairs (dipoles) whose orientations are determined by a simple geometrical transformation, such as a rotation. Detection of structure in these patterns requires the observer to perform local grouping (to find dipoles) and global grouping to combine their orientations into a percept of overall shape. We estimated the spatial frequency tuning of these grouping processes by measuring signal-to-noise detection thresholds for Glass patterns composed of spatially narrow-band elements. Local tuning was probed by varying the spatial frequency difference between the two elements comprising each dipole. Global tuning was estimated using dipoles containing one spatial frequency and then estimating masking as a function of the spatial frequency of randomly positioned noise elements. We report that the tuning of local grouping is band-pass (ie, it is responsive to a narrow range of spatial frequencies), but that tuning of global grouping is broad and low-pass (ie, it integrates across a broader range of lower spatial frequencies). Control experiments examined how the contrast and visibility of elements might contribute to these findings. Local grouping proved to be more resistant to local contrast variation than global grouping. We conclude that local grouping is consistent with the use of simple-oriented filtering mechanisms. Global grouping seems to depend more on the visibility of elements that can be affected by both spatial frequency and contrast.
History
Received June 18, 2001; published November 16, 2001
Citation
Dakin, S. C. & Bex, P. J. (2001). Local and global visual grouping: Tuning for spatial frequency and contrast.
Journal of Vision, 1(2):4, 99-111,
http://journalofvision.org/1/2/4/,
doi:10.1167/1.2.4.
Keywords
texture, form, Glass patterns, spatial frequency
Visual grouping refers to the process of revealing structure in images
by selectively associating local features with one another. It serves a computational
role in reducing the redundancy of our descriptions of the world (Watt,
1988). For example, if one encounters a swarm of bees, it is computationally
more efficient to compute one's position relative to a (single) cloud of insects,
than to first estimate one's position relative to each bee, and then average these
distances. The latter offers no functional advantage over the former, assuming
one's goal is simply to avoid the collective.
Over the last 30 years, Glass patterns (Glass, 1969)
have been used extensively to probe grouping mechanisms in human vision. These
patterns were originally generated by splattering paint over a silk screen and
then making a composite image of the resulting random-dot pattern and a transformed
(eg, rotated) version of it. Although the technique used to generate these patterns
is now different, the impression gained from inspecting them is similar: compelling
orientation structure corresponding to the generative transformation (eg, rotation
in Figure 1a). Glass patterns have remained of theoretical
interest because our ability to see structure in them indicates that we are grouping
members of the same dipole, and then combining those local groupings into a global
impression of overall (eg, circular) structure. These two types of associations
are referred to as local and global grouping, respectively.
The local grouping processes underlying Glass patterns have been the focus of
a number of previous studies. For high-density patterns, it is difficult to group
dipole members together simply because each dot/element will tend to have a large
number of elements closer to it than its dipole correspondent (Stevens, 1978). A variety of psychophysical data support
the idea that local structure is being derived not by specialized "token" matchers
(Stevens, 1978; Stevens & Brookes, 1978; Marr,
1982), but from the output of linear spatial filters (Zucker,
1985; Prazdny, 1986; Dakin,
1997a,b; Dakin, 1999).
The simplest demonstration of this is that our ability to see veridical structure
in these patterns is dependent on dipole elements being the same contrast polarity
(Figure 1). Given that any positional tokens are unaltered
between Figures 1a and 1b, our inability to see circular structure in Figure 1b
is likely to be because a pair of opposite contrast-polarity features do not collectively
stimulate the same subregion of a filter.
Figure 1. A rotational Glass pattern formed from spatially narrow-band, isotropic
Laplacian-of-Gaussian elements (a). The same pattern where one element from
each dot-pair has been contrast reversed (b); the perceived rotational structure
is generally reported as weaker.
Furthermore, filtering mechanisms predict that local anti-correlation of luminance
structure, introduced by contrast-polarity inversion, will introduce perceptual
structure orthogonal to the true transformation (Dakin, 1997b). This is consistent with observers' reports
of the presence of a "petal-like" radial structure in these patterns.
There is indirect evidence that the filtering operations underlying local grouping
are tuned to a narrow range of spatial frequencies. Oriented structure in Glass
patterns (composed of dots) is contained within a relatively narrow range of spatial
frequencies, so that a broadly spatially tuned filter would be swamped by noise
from adjacent frequency bands (Dakin, 1997a). Indeed,
observers' precision at judging the orientation of translational Glass patterns
is consistent with these local filtering operations being selective for both local
orientation and local spatial frequency (Dakin, 1997a).
A smaller amount of research has examined how local orientation estimates are
combined in Glass patterns to form the global percept of structure. Wilson and
coworkers (Wilson, Wilkinson, & Asaad, 1997; Wilson
& Wilkinson, 1998) have reported that a subject's ability to see structure
in high-density Glass patterns depends to a great extent on the type of global
organization. Specifically, they found that signal-to-noise detection thresholds
are lowest for circular, and highest for 90° translational, Glass patterns.
These authors interpret their findings as evidence for a contribution to the detection
of rotational structure from cells in cortical area V4 that have been shown, in
the macaque, to be sensitive to circular structure (Gallant, Braun, & Van Essen, 1993). Poor performance
with translational patterns is attributed to a lack of global integrators for
translational structure, so that subjects have to rely on local grouping mechanisms,
which integrate over smaller regions of space. Recently, however, we questioned
the generality of the results by Wilson et al by demonstrating
that this "circular advantage" seems to be at least partially contingent on the
stimulus window being round (Dakin & Bex, in press).
We have suggested that the "rotational advantage" could be attributable to the
presence of edge artefacts caused by the presence of unmatched elements at the
edge of translational, but not rotational, patterns. Contrary to Wilson et al,
we also reported broadly similar integration performance for rotations and translations,
the latter of which are supposedly subserved by grouping mechanisms operating
over a more limited locale. Equal performance of the majority of our subjects
at detecting rotational and translational structure does not serve to delineate
the operation of local and global grouping mechanisms.
Spatial frequency tuning for texture segmentation is known to be band-pass (Kingdom & Keeble, 2000), but no previous studies
have examined spatial frequency tuning of global grouping processes in Glass patterns.
However, because it seems reasonable to suppose that the perception of structure
in Glass patterns involves the detection of extended contourlike structure, evidence
that pertains to the grouping processes underlying contour detection may be relevant.
The paradigm for examining contour detection developed by
Field, Hayes, and Hess (1993) involves the detection of a string of discrete
oriented patches, whose orientations and positions are consistent with the presence
of a contour, embedded in a field of randomly oriented distractor elements. Using
this task, it has been established that the global grouping mechanism responsible
for contour linking is tuned for local orientation (Field
et al, 1993), but not for the local contrast (Hess, Dakin,
& Field, 1998) and only weakly for the local phase of elements
(Field, Hayes, & Hess, 2000). Dakin and Hess (1998)
estimated the spatial-frequency tuning of the contour linking process by measuring
the disruptive effect of switching between two spatial frequencies along alternate
elements of the path. This study showed contour linking to be spatially band-pass
in its sensitivity with the bandwidth showing an inverse dependence on the curvature
of the contour. Detection of straight contours is less sensitive to local spatial
frequency variation than the detection of curved contours.
The purpose of this paper is to estimate the spatial frequency tuning of local
and global grouping processes in the perception of Glass pattern structure.
Equipment
Stimuli were generated on an Apple Macintosh G3 computer, fitted with a Mac Picasso
850 graphics card (VillageTronic Ltd, Hanover, Germany), and presented on a 19-inch
Sony Multiscan 400PS colour monitor. The screen had a resolution of 1280 × 1024
pixels and the vertical blanking rate was 85 Hz. Stimuli were displayed with pseudo
12-bit contrast accuracy (ie, 256 grey levels could be displayed from a possible
range of 4096), which was achieved by electronically combining the RGB outputs
from the graphics card using a video attenuator (Pelli and
Zhang, 1991). A monochrome signal was generated by amplifying and sending
the same attenuated signal to all three guns. The output luminance was linearized
using a look-up table. The programs for running the experiment were written in
the Matlab environment (MathWorks Ltd., Natick, MA) using code from the Psychophysics
Toolbox (Brainard, 1997) and the Videotoolbox (Pelli,
1997) packages. The screen was viewed binocularly at a distance of 147 cm,
so that 1 pixel on the screen subtended 0.57 arcmin². The display had
a background luminance of 48 cd/m².
Subjects
The authors served as subjects in the experiments. Both are experienced psychophysical
subjects with considerable experience at this and similar tasks. S.C.D. is a corrected
myope.
Figure 2. Examples of the stimuli used. Rotational Glass pattern containing 100%
(a), 50% (b), and 25% (c) signal dots; the remainder of elements have been
randomly positioned. Subjects perform a discrimination between structured
patterns, such as "a," and random patterns, such as "d,"
to determine the minimum proportion of structured dipoles that supports
discrimination. Experiments were performed with three global organizations:
rotations (a), 90° translations (e), and expansions (f). Note that
these global transformations are used to determine only the orientation
of dipoles. Dipole length is constant throughout the pattern (whereas a
true rotation, for example, would lead to elements being closer to one another
at the stimulus center).
Stimuli
Stimuli were 512 pixel (24.0 degrees) square images containing a texture composed
of a mixture of element-pairs and randomly positioned elements. All elements
were two-dimensional Laplacian-of-Gaussians:
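One common way to write such an element is given below. It is offered as an assumption, because sign and normalization conventions vary. Here σ sets the size of the element, and the second expression gives the spatial frequency at which the amplitude spectrum of this particular form peaks.

```latex
L(x, y) = \left(1 - \frac{x^2 + y^2}{2\sigma^2}\right)
          \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right),
\qquad
f_{\text{peak}} = \frac{1}{\sqrt{2}\,\pi\,\sigma}
```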
Elements were pregenerated, stored within a region of size ±4σ at floating-point
accuracy, and presented at 50% contrast. Overlaps were added, and values producing
overflow were clipped at the maximum displayable grey level. All Glass patterns
contained exactly 200 elements. Dipoles were constrained to fall in a circular
region with radius 10.0°. Elements falling outside the circular region were
not plotted. Three transformations were used to generate dipole orientations:
rotations, vertical translations, and expansions (examples of each are shown in
Figure 2a, 2e, and 2f, respectively). Note that the transformations
were used to generate only dipole orientation and not length; dipole elements
were separated (center-to-center) by a constant distance of 48 arcmin for all
pattern organizations.
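The geometry described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the experiments: the function names, the use of degrees as the coordinate unit, and the choice to sample dipole centers uniformly within the circular region are all assumptions. Rendering of the elements themselves is omitted.

```python
import math
import random


def dipole_orientation(x, y, transform):
    """Orientation (radians) of a dipole centred at (x, y).

    The origin is the pattern centre. Only the orientation depends on the
    transformation; dipole length is constant, as in the text.
    """
    if transform == "rotation":
        return math.atan2(y, x) + math.pi / 2.0  # tangent to a circle
    if transform == "expansion":
        return math.atan2(y, x)                  # along the radius
    if transform == "translation":
        return math.pi / 2.0                     # vertical
    raise ValueError("unknown transform: %r" % (transform,))


def make_dipoles(n_dipoles, transform, radius_deg=10.0,
                 separation_deg=48.0 / 60.0, rng=None):
    """Return a list of ((x1, y1), (x2, y2)) element positions in degrees."""
    rng = rng or random.Random()
    dipoles = []
    for _ in range(n_dipoles):
        # Rejection sampling: uniform dipole centres inside the circle
        # (an assumption about how centres were distributed).
        while True:
            x = rng.uniform(-radius_deg, radius_deg)
            y = rng.uniform(-radius_deg, radius_deg)
            if x * x + y * y <= radius_deg ** 2:
                break
        theta = dipole_orientation(x, y, transform)
        dx = 0.5 * separation_deg * math.cos(theta)
        dy = 0.5 * separation_deg * math.sin(theta)
        dipoles.append(((x - dx, y - dy), (x + dx, y + dy)))
    return dipoles


if __name__ == "__main__":
    for pair in make_dipoles(3, "rotation", rng=random.Random(0)):
        print(pair)
```

Elements whose positions fall outside the circular region would then be dropped before plotting, as described in the text.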
Procedure
Subjects performed a two-interval, two-alternative forced-choice task. Two patterns
were presented sequentially, each for 145 milliseconds, separated by a 500-millisecond
interstimulus interval (ISI). One interval contained a Glass pattern, the other
a noise texture, and the subject indicated which interval contained the Glass
pattern. The independent variable was the proportion of correctly oriented dipoles
in the Glass pattern (the signal-to-noise ratio), where the remaining dots were
randomly positioned. Examples of various mixtures of signal and noise elements
are shown in Figure 2a-2c. The noise interval contained
a stimulus composed of randomly oriented dipoles (interspersed with the same proportion
of randomly positioned elements; Figure 2d shows a pattern composed exclusively
of randomly oriented dipoles). QUEST (Watson & Pelli,
1983), an adaptive psychophysical method, sampled a range of signal-to-noise
ratios and attempted to converge on the ratio of signal-to-noise dots that elicited
83% correct performance. Runs consisted of blocks of 45 trials and at least three
runs were undertaken for each data point plotted. Runs were not interleaved; subjects
always knew for which organization they were looking. Data were pooled across
all runs performed with a particular stimulus configuration; error bars show the
estimated SE.
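The composition of the two intervals on a single trial can be sketched as follows. This is an illustrative sketch under stated assumptions: the function and key names are hypothetical, the number of paired elements is rounded to the nearest whole dipole, and the adaptive (QUEST) selection of the signal proportion is not modelled.

```python
import random


def interval_composition(n_elements, signal_proportion):
    """Return (n_dipoles, n_random) for a pattern of n_elements.

    signal_proportion is the proportion of elements that belong to dipoles;
    the remainder are randomly positioned single elements.
    """
    n_dipoles = round(signal_proportion * n_elements / 2.0)
    return n_dipoles, n_elements - 2 * n_dipoles


def make_trial(n_elements, signal_proportion, rng=None):
    """Describe one two-interval trial; the signal interval is chosen at random."""
    rng = rng or random.Random()
    n_dipoles, n_random = interval_composition(n_elements, signal_proportion)
    # The signal interval has dipoles oriented by the global transformation.
    signal = {"dipole_orientation": "structured",
              "n_dipoles": n_dipoles, "n_random": n_random}
    # The noise interval has the same mixture, but dipoles are randomly oriented.
    noise = {"dipole_orientation": "random",
             "n_dipoles": n_dipoles, "n_random": n_random}
    signal_first = rng.random() < 0.5
    intervals = (signal, noise) if signal_first else (noise, signal)
    return {"intervals": intervals,
            "correct_interval": 1 if signal_first else 2}


if __name__ == "__main__":
    print(interval_composition(200, 0.5))  # 50 dipoles plus 100 random elements
```

Because the noise interval contains the same numbers of dipoles and single elements, only the orientation structure of the dipoles distinguishes the two intervals.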
Table 1. Stimulus parameters for the 13 interleaved conditions comprising each experiment.
| Condition type | Local | Local | Local | Local | Local | Global | Global | Global | Global | Control | Control | Control | Control |
| Condition | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
| Ns | 200 | 200 | 200 | 200 | 200 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Nm | 0 | 0 | 0 | 0 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Settings for spatial-frequency varying experiments (Experiments 1 and 2; reference sf = 2.0 c/deg) |
| Sfs1 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | 1.4 | 2.8 | 4.0 |
| Sfs2 | 1.0 | 1.4 | 2.0 | 2.8 | 4.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | 1.4 | 2.8 | 4.0 |
| Sfm | - | - | - | - | - | 1.0 | 1.4 | 2.8 | 4.0 | 2.0 | 2.0 | 2.0 | 2.0 |
| Settings for contrast varying experiments (Experiments 2 and 3; reference contrast = 0.5) |
| Cs1 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.25 | 0.35 | 0.5 | 0.71 |
| Cs2 | 0.25 | 0.35 | 0.5 | 0.71 | 1.0 | 0.5 | 0.5 | 0.5 | 0.5 | 0.25 | 0.35 | 0.5 | 0.71 |
| Cm | - | - | - | - | - | 0.25 | 0.35 | 0.71 | 1.0 | 0.5 | 0.5 | 0.5 | 0.5 |
Ns is the number of paired or cued dots in the stimulus
(ie, twice the number of dipoles), and Nm is the number of randomly
positioned singleton elements comprising the mask. Sfs1 and Sfs2
refer to the spatial frequencies (in c/deg) of the two components of each
dipole, and Sfm refers to the spatial frequency of the masking
pattern. Cs1, Cs2, and Cm refer to the
Michelson contrast of the two dipole components and the masking pattern,
respectively. Each experiment consisted of five local conditions, where
spatial frequency and/or contrast varied (around some reference value) within
a dipole, four global conditions, where spatial frequency/contrast was fixed
within a dipole but stimuli were added to a mask at various spatial frequency/contrasts,
and four control conditions, where various consistent dipole spatial frequency/contrast
combinations were tested in the presence of a mask at the reference contrast/spatial
frequency. This procedure forced subjects to attend to all spatial frequency/contrast
bands, any of which could contain the target or mask.
Figure 3. Examples of the stimuli from Experiment 1 local (a,b) and global (c,d)
conditions. Dipoles are composed of elements at 2.0 and 4.0 c/deg (a) and
2.0 and 1.0 c/deg (b). Dipoles are exclusively composed of 2.0 c/deg elements
and have been intermixed with randomly positioned masking dots at 4.0 c/deg
(c) and 1.0 c/deg (d).
We attempted to separate the effects of spatial frequency and contrast variation
in three experiments. Experiment 1 examined spatial frequency with fixed Michelson
contrast (ie, variable root mean square [RMS] contrast/visibility). Experiment
2 examined spatial frequency with fixed RMS contrast/visibility (ie, variable
Michelson contrast), and Experiment 3 looked at the effects of Michelson/RMS contrast
for a fixed spatial frequency.
Experiment 1. Spatial Frequency Tuning With Matched Michelson Contrast
The first experiment examined the effect of spatial frequency variation on local
and global grouping with elements at a fixed Michelson contrast (C = 0.5). Each
session consisted of 13 interleaved runs, probing 5 local, 4 global, and 4 control
conditions (Table 1 summarizes relevant stimulus parameters).
In the local conditions (1-5), one (randomly selected) element of each dipole
was fixed at 2.0 c/deg, and the spatial frequency of the other was varied according
to condition from 1.0-4.0 c/deg, in half octave steps. Examples of stimuli from
the local condition are shown in Figure 3a and 3b. As the
signal-to-noise ratio was lowered, dipoles were replaced with randomly positioned
dots at the same spatial frequencies as the dipole elements. The threshold signal-to-noise
ratio was defined as the level supporting 83% discrimination from a noise pattern
composed of randomly oriented dipoles (with matched spatial frequency structure).
In the global conditions (6-9), dipole elements were always both fixed at 2.0
c/deg, but dipole elements were intermixed with a mask composed of the same number
of randomly positioned elements at a single, different spatial frequency (1.0,
1.4, 2.8, or 4.0 c/deg). Examples of stimuli from the global condition are shown
in Figure 3c and 3d. The signal-to-noise ratio of the dipole
population was then varied as in the local conditions. The control conditions
(10-13) were the converse of the global conditions; dipoles now contained a single
spatial frequency (1.0, 1.4, 2.8, or 4.0 c/deg) and were intermixed with a mask
composed of an equal number (ie, 2 × the number of dipoles) of randomly positioned
elements at 2.0 c/deg. Control conditions ensured that subjects could not perform
the task by attending only to 2.0 c/deg but instead had to distribute their attention
across spatial frequencies.
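The half-octave spacing of the test frequencies can be checked with a short calculation. The sketch below is illustrative only, and the function name is hypothetical; it steps up and down from the 2.0 c/deg reference by factors of the square root of 2.

```python
REFERENCE_SF = 2.0  # c/deg, the reference spatial frequency given in the text


def half_octave_steps(reference, n_below=2, n_above=2):
    """Spatial frequencies spaced in half-octave steps around a reference."""
    return [reference * 2.0 ** (k / 2.0) for k in range(-n_below, n_above + 1)]


sfs = half_octave_steps(REFERENCE_SF)
print([round(f, 1) for f in sfs])  # prints [1.0, 1.4, 2.0, 2.8, 4.0]
```

The rounded values correspond to the spatial frequencies listed in Table 1.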
Results
Results from the local grouping condition, for the three global transformations
tested, are graphed in Figure 4a and 4b. Sensitivity (the
reciprocal of threshold) is plotted as a function of the spatial frequency interleaved
with the 2.0 c/deg element. Neither subject shows a consistent advantage for any
one transformation, but both show slightly poorer sensitivity to radial structure.
Both subjects are decreasingly sensitive to Glass pattern structure as the difference
between the spatial frequency of dipole elements increases. Because this task
encourages subjects to integrate over as wide a range of spatial frequencies as
possible, this pattern of band-pass sensitivity should reflect the spatial tuning
of the mechanism underlying detection of local structure in these patterns. Spatially
band-pass tuning is consistent with the notion that local grouping is performed
by oriented filtering mechanisms. This in turn is consistent with previous theoretical
(eg, Zucker, 1985), and psychophysical (Dakin, 1997a) observations, as well as the notion that filters are instantiated
by the receptive fields of V1 neurones, which are band-pass tuned for spatial
frequency.
Figure 4. Spatial frequency tuning of local and global grouping for subjects P.J.B.
(a,c) and S.C.D. (b,d). (a,b) Local sensitivity (the reciprocal of the signal-to-noise
ratio at threshold) is plotted as a function of the spatial frequency of
the element paired with a 2 c/deg dipole element. In "a" and "b,"
data directly reflect the sensitivity of the underlying mechanism (because
the task requires subjects to integrate over as wide a range of spatial
frequencies as possible) so that the higher sensitivity at middle frequencies
indicates that the local grouping mechanism is band-pass tuned. (c,d) Global
sensitivity is plotted as a function of the spatial frequency of the masking
stimulus. Here, sensitivity inversely relates to the sensitivity of the
underlying mechanism (because the task requires subjects to operate over
as narrow a range of spatial frequencies as possible); ie, the observed
higher sensitivity at higher masking frequencies indicates that the global
grouping mechanism is low-pass tuned (which allows it to ignore high spatial
frequencies).
Figure 5. The effect of visual attention on global tuning. S.C.D. was required to
detect rotational structure in the presence of masking elements; however,
conditions were not interleaved so that the subject always knew which spatial
frequencies defined the target. Results (open circles) are similar to data
from Experiment 1 (filled circles; replotted from Figure 4).
Results from the conditions probing global grouping are presented in Figure
4c and 4d. In contrast to the local condition, the global task discouraged
subjects from integrating over a wide range of scales. In order to discount
the presence of noise, subjects should attempt to utilize information only at
the spatial frequency of the dipole elements. Therefore, poor performance on
this task (ie, low sensitivity) at a particular spatial frequency indicates
higher sensitivity of the underlying mechanism to structure at that scale. Tuning
of the underlying mechanism will, therefore, be the inverse of the pattern of
tuning shown in Figure 4c and 4d, which demonstrates
that both subjects show lower sensitivity to structure when Glass patterns were
intermixed with noise elements at lower spatial frequencies. This means that
global grouping mechanisms are progressively less able to ignore structure at progressively
lower frequencies (ie, they are spatially low-pass in their tuning).
A general point to note from Figure 4 is that, contrary
to Wilson et al (Wilson et al, 1997;
Wilson & Wilkinson, 1998), we observe no consistent advantage for
any one transformation over another. This seems likely to be due to the relatively
low density of our patterns, which do not support the type of edge cues that may
be responsible for the reported advantage in dense patterns (Dakin
& Bex, in press).
Control Experiment: Attentional Modulation of Global Tuning
In Experiment 1, all conditions were interleaved to prevent subjects from attending
to structure within any one spatial frequency band. However, we were concerned
that the demands we placed on subjects, who were required to monitor a series
of spatial frequencies/contrasts simultaneously, may have influenced the tuning
observed. To test this we reran the global conditions from Experiment 1 (using
rotational patterns) but did not interleave them, so that the subject knew in
advance which spatial frequencies defined the target. Somewhat to our surprise,
results remained similar (Figure 5) with the observer showing clear low-pass tuning
for detection. There appears to be little influence of top-down factors on this
task.
Experiments 2-3. Tuning for Spatial Frequency or Contrast?
Manipulating local spatial frequency, in the manner described above, also affects
the visibility of elements. It is therefore possible that the observed low-pass
tuning for global grouping results from a simple inverse relationship between
the visibility of elements and their spatial frequency (although visibility clearly
cannot explain the local band-pass tuning result). Indeed, the high-frequency elements
in Figure 3a and 3c do appear less conspicuous, and so might
be expected to have a less disruptive effect on detection of the target pattern.
We ran two experiments to examine this question. Experiment 2 employed a methodology
similar to the first experiment but equated the RMS contrast of all elements.
This amounts to lowering the Michelson contrast of the low-frequency elements,
and raising the Michelson contrast of the high-frequency elements. Experimental
parameters are given in Table 1 and examples of the stimuli
are shown in Figure 6. Notice that on casual inspection, elements at all
spatial frequencies now appear equally visible, and it is the case that RMS contrast
has been shown to be a good predictor of apparent contrast in two-dimensional
noise patterns (Moulden, Kingdom, & Gatley, 1990). If it is either
the changes in RMS contrast or, to a reasonable approximation, the visibility
of the elements that determines the tuning we observed in Experiment 1, then we
should observe no spatial frequency tuning in Experiment 2.
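The relationship between Michelson and RMS contrast for these elements can be illustrated numerically. The sketch below is not the authors' procedure. It assumes an isotropic Laplacian-of-Gaussian profile of the form (1 - r²/2σ²)exp(-r²/2σ²), whose amplitude spectrum peaks at 1/(√2πσ), treats peak amplitude as proportional to Michelson contrast, and computes RMS over a fixed window; the function names, window size, and sampling step are hypothetical choices.

```python
import math


def log_sigma(peak_sf):
    """Sigma (deg) of a LoG whose amplitude spectrum peaks at peak_sf (c/deg)."""
    return 1.0 / (math.sqrt(2.0) * math.pi * peak_sf)


def log_value(x, y, sigma, amplitude):
    """Assumed LoG profile, scaled so the centre value equals amplitude."""
    t = (x * x + y * y) / (2.0 * sigma * sigma)
    return amplitude * (1.0 - t) * math.exp(-t)


def rms_contrast(peak_sf, amplitude, half_width=1.5, step=0.02):
    """RMS of one element sampled over a fixed square window (in deg)."""
    sigma = log_sigma(peak_sf)
    n = int(round(2.0 * half_width / step)) + 1
    total = 0.0
    for i in range(n):
        y = -half_width + i * step
        for j in range(n):
            x = -half_width + j * step
            v = log_value(x, y, sigma, amplitude)
            total += v * v
    return math.sqrt(total / (n * n))


sfs = [1.0, 1.4, 2.0, 2.8, 4.0]
# Fixed peak amplitude (as in Experiment 1): RMS falls as frequency rises.
fixed = {f: rms_contrast(f, 0.5) for f in sfs}
# Amplitude scaled in proportion to frequency: RMS is equated.
matched = {f: rms_contrast(f, 0.5 * f / 2.0) for f in sfs}

for f in sfs:
    print(f, round(fixed[f], 4), round(matched[f], 4))
```

Under these assumptions, RMS contrast at a fixed peak amplitude is inversely proportional to spatial frequency, so scaling amplitude in proportion to spatial frequency equates it. This scaling gives contrasts of 0.25 to 1.0 across the 1.0 to 4.0 c/deg range, close to the values listed in Table 1.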
If tuning is observed in both Experiments 1 and 2, then that would suggest that
it is the spatial frequency and not the contrast that determines the tuning observed
in Experiment 1. However, one cannot rule out the possibility that the system
is tuned for both contrast and spatial frequency without looking at the effect
of contrast with spatial frequency held constant. Experiment 3 measured this and
was analogous to Experiment 1 but employed changes in contrast, rather than spatial
frequency. Thus, there were 5 local conditions with elements varying in contrast
within each dipole, and 4 global conditions with targets at a fixed mid-contrast
and masks at lower and higher contrasts. All targets were rotational Glass patterns
composed of 2 c/deg elements. (Because findings from Experiment 1 and from a pilot
version of Experiment 2 indicate that performance is ostensibly similar across
all transformations, we will consider only the detection of rotational Glass patterns
in Experiments 2-3.) Again, Table 1 gives the values of
the relevant experimental parameters, and note that the ranges of local/global
contrasts used were identical to those used in Experiment 2 to allow comparison
across experiments. Casual inspection of the examples shown in Figure
7 suggests that we are tolerant of quite a wide range of contrast variation
within dipoles (Figure 7a and 7b) but are more able to ignore the low-contrast
masks (Figure 7d) than the high (Figure 7c).

Figure 6. Examples of the stimuli from Experiment 2. Elements varied in spatial
frequency but were equated for RMS contrast. Local grouping condition: dipoles
are composed of 2.0 & 4.0 c/deg (a) and 2.0 & 1.0 c/deg (b), where
elements have been matched for RMS contrast. Global grouping conditions:
patterns consist of 2.0 c/deg dipoles intermixed with masking dots at 4.0
c/deg (c) and 1.0 c/deg (d).

Figure 7. Examples of the stimuli from Experiment 3. Elements varied in RMS/Michelson
contrast but were matched in spatial frequency (2 c/deg). (a,b) Local grouping
condition; dipoles are composed of elements with contrasts of 50% and 25%
(a) and 50% and 100% (b). (c,d) Global grouping conditions; patterns consist
of 50% contrast dipoles intermixed with masking dots at 25% (c) and 100%
(d).
Figure 8 summarizes data from Experiments 1-3 for the detection
of rotational Glass patterns. Local grouping (Figure 8a and
8b) is clearly tuned for RMS contrast-matched spatial frequency variation (grey
squares) but only weakly tuned for pure contrast changes with 2 c/deg elements
(open triangles). This is consistent with local structure being grouped using
a simple filtering scheme where it is spatial frequency similarity that primarily
determines strength of grouping. In the context of a local filtering scheme, there
are two reasons why changes in local spatial frequency might be more disruptive
than local contrast variation. The first is that the image undergoes some form
of early contrast gain control prior to filtering. However, this account predicts
broad tuning for both local and global grouping, whereas we do not observe the latter
(Figure 8c and 8d). The second explanation, which we favor,
involves filter selection. If it were the case that our spatial filters perfectly
integrated contrast energy, then based on the principle of univariance, the spatial
frequency and contrast changes we examined should be equivalent. However, assuming
that the visual system has spatial frequency selective receptive fields that are
well modeled by oriented filters such as Gabors, then the spatial frequency difference
between the dipole elements forces the visual system to use nonoptimally tuned
filters (presumably operating at spatial frequencies midway between the two elements).
This reduces their efficacy at integrating contrast energy. Changes in contrast
will not force this compromise in tuning because the optimal spatial frequency
of the filter will simply be at or close to the spatial frequency of the two elements.
This predicts more efficient integration of dipoles that vary in contrast than of dipoles
that vary in spatial frequency, and thus broader tuning for contrast than for spatial frequency.
Results from the global grouping condition (Figure 8c and
8d) indicate that subjects still show clear low-pass tuning for RMS matched stimuli
(grey squares); they are unable to ignore low-frequency masks even though they
are now at a substantially lower Michelson contrast than the target structure.
This shows that visibility cannot account for the low-pass tuning observed for
global grouping in Experiment 1. Tuning for pure contrast changes at a fixed spatial
frequency (open triangles) is somewhat more ambiguous but suggests that the global
grouping system is selective for both contrast and spatial frequency. Subjects
show a degree of contrast-tuning in that both are more affected by the presence
of a high-contrast than a low-contrast mask, but data from subject P.J.B. show
a weaker dependence on mask contrast. Such differences are likely to arise from
subtle differences in the observers' strategies for performing this task.
This result is contrary to some recent evidence bearing on contrast tuning for
Glass patterns. Earle (1999) presented subjects with
Glass patterns composed of L-shaped dot triples that contained ambiguous horizontal
and vertical structure. The salience of horizontal and vertical structure was
measured as a function of the relative contrast of the dots. When two of the elements
are low contrast and the third is high contrast, energy models based on simple
filters predict that apparent structure will be dominated by the structure with
highest overall contrast (ie, between elements of dissimilar contrast). However,
the most salient structure was actually determined by contrast similarity, even
between low-contrast elements. Grouping by contrast-similarity predicts that we
should find band-pass contrast tuning for global grouping rather than the low-pass
tuning we observe in Figure 8c and 8d. We conjecture that
grouping by contrast similarity may be possible only under quite specific conditions
and may depend critically on local spatial configuration (spacing/density, "clustering"
of low-high elements) and/or the spatial frequency structure of dots.
 |
Figure
8. Comparison of the tuning of local and global grouping for spatial frequency
(filled circles), RMS-matched spatial frequency (grey squares), and contrast
(open triangles). Note the dual abscissas: the lower is for data from the
fixed Michelson (variable spatial frequency) condition; the upper is for
the fixed spatial frequency (variable RMS contrast) condition; and both
apply to data collected with fixed RMS (covarying Michelson contrast/spatial
frequency). (a,b) Local grouping is tuned for spatial frequency irrespective
of contrast and is weakly tuned for pure contrast changes. (c,d) Global
grouping shows dependence on both contrast and spatial frequency.
Figure 9. (a) Center-surround Laplacian-of-Gaussian elements uniquely stimulate
local grouping mechanisms such as oriented filters (shown as a translucent
overlay) when presented in pairs, but not in isolation. We refer to this
as "pure" local grouping. (b) Larger groupings across space are
unlikely to be detected by such local filtering, since pairings are randomly
distributed throughout the image, implying that a more global grouping mechanism
must be used. (c) Although contour stimuli presumably exploit a similar
global grouping mechanism to (b), pair-wise coalignment of oriented features
might also be signaled to some degree by local grouping mechanisms. The
"multi-local" groupings might also feed into the global grouping
mechanism.
Reduction of element contrast is not the only way that the global energy of
low- and high-frequency masks can be equated; one can also alter their densities,
and it is possible that low-frequency masks are more effective not because of
their spatial frequency but because their elements are larger and have a greater
"coverage" of the stimulus. (We are grateful to an anonymous reviewer
for this information.) To examine this possibility, we conducted a control experiment.
Subjects were presented with rotational Glass patterns composed of 100 elements
at 2.0 c/deg embedded in random-dipole masks composed of 25, 100, or 400 elements
at 1.0, 2.0, or 4.0 c/deg, respectively. The coverage of these conditions is now matched,
and under these conditions we do indeed observe equal performance for both subjects
[S.C.D.: mean threshold of 0.35 (SE = 0.05), 0.33 (0.02), and 0.38 (0.03); P.J.B.:
0.38 (0.02), 0.36 (0.08), and 0.36 (0.07)]. These findings are not incompatible
with a global integration mechanism with low-pass tuning, which would predict
that changing the density/energy of the mask would change performance. Note
also that this finding only suggests that the coverage of the mask is an
important parameter; because we also varied the number of elements in the mask,
we cannot be certain that this is the case without systematically covarying
mask density, extent, and numerosity (Dakin, 2001). By conducting this procedure at a series
of mask spatial frequencies we are presently attempting to disentangle spatial
frequency, density, number, and spatial extent to determine which parameters
determine global masking in these displays.
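The sense in which these mask conditions are matched for coverage can be checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions not stated explicitly in the text: that the three element counts pair with the three spatial frequencies in order, and that the area of an element scales as the inverse square of its peak spatial frequency. The function name `relative_coverage` is hypothetical.

```python
# Relative "coverage" of a mask in arbitrary units.
# Assumption: each element's linear size is inversely proportional to its
# peak spatial frequency, so its area scales as 1 / f**2.

def relative_coverage(n_elements, peak_sf_cpd):
    """Number of elements multiplied by the assumed per-element area."""
    element_area = 1.0 / peak_sf_cpd ** 2
    return n_elements * element_area

# Mask conditions from the control experiment: (number of elements, c/deg).
# Assumption: counts and frequencies are paired in the order listed.
conditions = [(25, 1.0), (100, 2.0), (400, 4.0)]
coverages = [relative_coverage(n, sf) for n, sf in conditions]

for (n, sf), cov in zip(conditions, coverages):
    print(f"{n:4d} elements at {sf:.1f} c/deg: relative coverage {cov:.1f}")
```

Under these assumptions, quadrupling the number of elements for each doubling of spatial frequency leaves the product of count and element area unchanged, which is what matching for coverage requires.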
To summarize, we have demonstrated a substantial qualitative difference
between local and global grouping processes in visual texture perception; the
former are narrowly tuned for spatial frequency structure, and the latter show
broader, low-pass tuning. Performance on local grouping is consistent with previous
modeling of detection psychophysics, indicating that subjects must be using a
relatively narrow range of filters to process Glass patterns; otherwise, they
would be swamped by noise from adjacent bands (Dakin, 1997a).
That local grouping is spatially band-pass is consistent with the notion that
cells in area V1 implement the filters responsible. The global grouping experiments
shed some light on how the visual system might then combine these filter
outputs. The global grouping mechanism shows clear low-pass spatial frequency
selectivity (because we observe low-pass tuning even with RMS-matched elements)
but our data would also appear to indicate a greater degree of tuning for the
contrast of the mask than shown by the local grouping mechanism. Thus the global
grouping mechanism may combine various attributes of local features and could
be characterized as being tuned to something more akin to "visibility."
In the "Introduction," we alluded to previous findings that spatial frequency
tuning observed for texture segmentation (Kingdom &
Keeble, 2000) and contour detection is band-pass (Dakin
& Hess, 1998). Given that both contour integration and the global Glass
pattern task require subjects to integrate orientation information across space,
these results would appear to be contradictory. Figure 9
illustrates a possible explanation for the difference; it shows schematic diagrams
illustrating the distinction between local and global grouping, in the context
of a local grouping mechanism based on oriented filters. In the former case, individual
features are isotropic and, although they do not individually stimulate
any one filter orientation selectively, pairs of features that are close enough together
do. Thus, local grouping cares about the relative position of input features.
In the global case, provided that feature pairings are relatively sparse, an oriented
filtering mechanism continues to give useful information only about local groupings.
Larger, more complex assemblies must be signaled by a mechanism combining responses
across space. This is what has traditionally been thought of as a "texture" process
in that global grouping cares little about the relative position of input features.
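The point that isotropic elements drive an oriented filter selectively only when paired can be illustrated numerically. The following is a minimal sketch, not the model used in this paper: the oriented filter is a simple elongated Gaussian standing in for an oriented receptive field, and the element size, dipole separation, and filter dimensions are arbitrary illustrative values. All function names are hypothetical.

```python
import numpy as np

# Pixel grid centred on the origin (units are arbitrary pixels).
coords = np.arange(-32, 33)
xx, yy = np.meshgrid(coords, coords)

def log_element(cx, cy, sigma=2.0):
    """Isotropic centre-surround (Laplacian-of-Gaussian) element at (cx, cy)."""
    s = ((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * sigma ** 2)
    return (1.0 - s) * np.exp(-s)

def oriented_filter(theta, sigma_long=6.0, sigma_short=2.0):
    """Elongated Gaussian with its long axis at angle theta (radians).

    Assumption: a stand-in for an oriented filter, chosen for simplicity.
    """
    u = xx * np.cos(theta) + yy * np.sin(theta)    # along the long axis
    v = -xx * np.sin(theta) + yy * np.cos(theta)   # across the long axis
    return np.exp(-(u ** 2 / (2 * sigma_long ** 2)
                    + v ** 2 / (2 * sigma_short ** 2)))

def selectivity(image):
    """Ratio of |response| for a horizontal versus a vertical filter at the origin."""
    horizontal = abs(np.sum(image * oriented_filter(0.0)))
    vertical = abs(np.sum(image * oriented_filter(np.pi / 2)))
    return horizontal / vertical

single = log_element(0, 0)                       # one element in isolation
pair = log_element(-4, 0) + log_element(4, 0)    # horizontal dipole, 8 px apart

single_ratio = selectivity(single)
pair_ratio = selectivity(pair)
print(f"isolated element: horizontal/vertical = {single_ratio:.2f}")
print(f"element pair:     horizontal/vertical = {pair_ratio:.2f}")
```

With these settings the isolated element drives the horizontal and vertical filters equally, so the ratio is close to 1, whereas the horizontal pair favors the filter aligned with the dipole axis. The filter centred on the pair therefore carries orientation information that neither element carries alone.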
Figure 9c shows what we term the "multi-local" case. While
both contour and Glass pattern stimuli require orientation integration across
space, only in the contour case is the stimulus arranged in such a way as to facilitate
interactions between orientation signals; features are densely packed and positioned
so that their local orientations are coaligned along an imaginary underlying "backbone."
While we know that the conditions under which a whole multi-element contour can
be signaled by large filters are quite limited (Hess &
Dakin, 1997), that is not to say that the response of large filters to pair-wise
groupings in the contour might not be important for binding these elements across
space. Contour linking seems to straddle our definitions of local and global grouping.
In isolation, local features do stimulate oriented filters; thus their grouping
must in some sense be a global linking task. However, like a local grouping task,
contour linking must care about position. Moreover, adjacent contour elements can
mutually stimulate oriented filters operating at a coarser scale so that the contribution
of the relative position of contour elements to grouping may ultimately be linked
to the degree to which adjacent contour elements mutually stimulate local grouping
mechanisms. If one hypothesizes that these pair-wise or multi-local groupings
contribute to contour linking (the link marked with a "?" in Figure 9c), then because local grouping is primary (in that
global grouping cannot proceed without it), one can see how contour detection
might exhibit spatial frequency tuning properties more akin to local grouping.
Although the details of the feasibility of pair-wise contour linking are beyond
the scope of this paper, we are presently investigating the role of interactions
between adjacent elements in contour linking.
S.C.D. was supported by a fellowship from the Wellcome Trust.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433-436.
Dakin, S. C. (1997a). The detection of structure in Glass patterns: Psychophysics and computational models. Vision Research, 37, 2227-2259.
Dakin, S. C. (1997b). Glass patterns: Some contrast effects re-evaluated. Perception, 26, 253-268.
Dakin, S. C. (1999). Orientation variance as a quantifier of structure in texture. Spatial Vision, 12, 1-30.
Dakin, S. C. (2001). Information limit on the spatial integration of local orientation signals. Journal of the Optical Society of America A. Optics and Image Science Vision, 18, 1016-1026.
Dakin, S. C., & Hess, R. F. (1998). Spatial-frequency tuning of visual contour integration. Journal of the Optical Society of America A, 15, 1486-1499.
Dakin, S. C., & Bex, P. J. (in press). Summation of global orientation structure: Seeing the Glass or the window? Vision Research.
Earle, D. C. (1999). Glass patterns: Grouping by contrast similarity. Perception, 28, 1373-1382.
Field, D. J., Hayes, A., & Hess, R. F. (1993). Contour integration by the human visual system: Evidence for a local "association field." Vision Research, 33, 173-193.
Field, D. J., Hayes, A., & Hess, R. F. (2000). The roles of polarity and symmetry in the perceptual grouping of contour fragments. Spatial Vision, 13, 51-66.
Gallant, J. L., Braun, J., & Van Essen, D. C. (1993). Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex. Science, 259, 100-103.
Glass, L. (1969). Moiré effects from random dots. Nature, 243, 578-580.
Hess, R. F., & Dakin, S. C. (1997). Absence of contour linking in peripheral vision. Nature, 390, 602-604.
Hess, R. F., Dakin, S. C., & Field, D. J. (1998). The role of "contrast enhancement" in the detection and appearance of visual contours. Vision Research, 38, 783-787.
Kingdom, F. A., & Keeble, D. R. (2000). Luminance spatial frequency differences facilitate the segmentation of superimposed textures. Vision Research, 40, 1077-1087.
Marr, D. (1982). Vision. San Francisco, CA: Freeman.
Moulden, B., Kingdom, F., & Gatley, L. F. (1990). The standard-deviation of luminance as a metric for contrast in random-dot images. Perception, 19, 79-101.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming number into movies. Spatial Vision, 10, 437-442.
Pelli, D. G., & Zhang, L. (1991). Accurate control of contrast on microcomputer displays. Vision Research, 31, 1337-1350.
Prazdny, K. (1986). Psychophysical and computational studies of random-dot Moiré patterns. Spatial Vision, 1, 231-242.
Stevens, K. (1978). Computation of locally parallel structure. Biological Cybernetics, 6, 19-28.
Stevens, K., & Brookes, A. (1978). Detecting structure by symbolic constructions on tokens. Computer Vision, Graphics and Image Processing, 37, 1133-1145.
Watson, A. B., & Pelli, D. G. (1983). QUEST: A Bayesian adaptive psychometric method. Perception & Psychophysics, 33, 113-120.
Watt, R. J. (1988). Visual Processing: Computational, Psychophysical and Cognitive Research. London: Lawrence Erlbaum Associates.
Wilson, H. R., & Wilkinson, F. (1998). Detection of global structure in Glass patterns: Implications for form vision. Vision Research, 38, 2933-2947.
Wilson, H. R., Wilkinson, F., & Asaad, W. (1997). Concentric orientation summation in human form vision. Vision Research, 37, 2325-2330.
Zucker, S. W. (1985). Early orientation selection: Tangent fields and the dimensionality of their support. Computer Vision, Graphics and Image Processing, 8, 71-77.