MET x Microsoft x MIT/CAST Proposal:
Can a neural network analyze the museum collection database for similarities and differences in the two distinct visual and informational regimes of the human visual system? The goal is to use these results to produce a new kind of artwork based on the biology and neurology of deferred perception.
Nobel Prize winner Eric Kandel has argued that while the abstract artists of the New York School reduced the complex world around us to its essence of form, line, color and light, we are still only at the beginning of understanding the biological underpinnings of our response to abstraction, a process that involves both bottom-up and top-down visual processing, the so-called default network and construal-level theory. Here, we continue his argument for a richer and more complex analysis and understanding of visual and cognitive processing, using the history of human art in the collection of the Metropolitan Museum, accessed through its API, as a substrate, and we propose several distinct ways in which an advanced neural network can analyze and reinterpret the collection.
Visual experiences and aesthetics are both defined by two distinct groups of human neural preferences. Foregrounded experiences are based on vivid, rapid inferences derived from the densest part of the visual cortex. The density of cones in the fovea, matched by a density of processing in the visual cortex, leads to this color and brightness preference, and so rapid inference dominates conventional 'bottom-up' visual processing. Simplified sorting, matching and averaging of form, line, color and light to objects and details leads to easier recognition - but also to image standardization. Pre-projection of sound-image association preferences leads to content-image bias and finally to pattern subsampling. In the end, self-organized criticality requires the avoidance of noise and truly hallucinatory phenomena and subsampling - leading to a strong overestimation of stability in a large class of time-evolving systems in neural criticality, false pattern recognition and apophenia, while simplified neural preferences lead to color subsampling (dithering), an impoverishment of the range of colors and picturing modes available to the human eye.
Form, line, color, light and residual color all begin to operate very differently when we move towards the shadows. Rods outnumber cones by roughly twenty to one, but they are slower and more actively engaged with abstract, temporal phenomena, and slower, or 'deferred', inference may be critical in re-balancing the biased, under-sampled or over-stabilized categories of the visual system. In many cases, partial inference is more accurate, efficient and engaging for more complex abstract phenomena than complete neural criticality, as the default neural network is more fully engaged with construal-level thinking. The appearance and effect of hallucinatory form constants and related optical effects in this area of the human visual system are well known, as is the effectiveness of aesthetic experience on the default or 'daydreaming' network, which combines memory, sensation and mind.
How do human artworks use these two mechanisms, and is their use uniform across culture and history? To research this further, we might look at types of artworks from diverse human cultures in the collection of the Metropolitan Museum and use an artificial quasi-critical regime of our own. Obviously the first temptation will be to cross-match already foregrounded informational and visual similarities. What does the average Met artwork look like? Of course, this instantly exposes a serious problem. Not all cultures produced equal amounts of artwork of any given type, nor were they collected and defined in the same ways. So far, using neural networks to seek averaged and matched images, as well as mixing visual preferences and sorting criteria, has led only to pre-sorted, superficially intermixed informational and visual states. The next step must be a more ambitious effort to define the perimeter and larger limits of human visual experience.
This is where deferred inference may become especially useful. New topologies may permit new forms of deferred inference. Correctly describing the conditions and interpretation of partially legible visual phenomena emerging in complex neural, visual and pictorial systems may even be improved by adding noise and topological variation to varying two- and three-dimensional media representing these diverse time-evolving systems. Induced apophenia may rebalance word-image bias and the over-stabilized aesthetics of time-evolving visual systems. Complex visual noise may improve attention and processing. New geometric topologies may lead to more complex visual experiences. Induced ideasthesia in VR, with added visual and auditory layering over new topologies and manipulation, may improve attention to, and processing of, complex time-evolving systems.
With all that in mind, here are some specific ideas on how we could develop this idea of deferred inference using the Met Collection. I’m assuming that we will need to concentrate on the 2-D works, as variations in 3-D works will make useful comparison impossible. Hopefully, the network can be trained to compare a number of variables and produce a collection of images in a print-ready format (such as a TIFF file), to be composited and output by the 3-D robot in January.
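As a first practical step, the restriction to 2-D works could be sketched as a simple filter over object records from the Met collection API. This is only a minimal sketch under stated assumptions: the field names `classification` and `primaryImage` are what I understand the Met Open Access object records to contain and should be checked against the API documentation, and the set of labels treated as 2-D, like the function name `select_2d_works`, is hypothetical.

```python
# Hypothetical set of classification labels treated as 2-D works.
# This is an assumption for illustration, not an official Met taxonomy.
TWO_D_CLASSIFICATIONS = {"Paintings", "Drawings", "Prints", "Photographs"}


def select_2d_works(records, allowed=TWO_D_CLASSIFICATIONS):
    """Keep records that have an image URL and a 2-D classification.

    records: iterable of dicts shaped like Met API object records
             (field names assumed: "classification", "primaryImage").
    """
    selected = []
    for record in records:
        has_image = bool(record.get("primaryImage"))
        is_2d = record.get("classification") in allowed
        if has_image and is_2d:
            selected.append(record)
    return selected


# Made-up sample records, only to show the shape of the input.
sample_records = [
    {"objectID": 1, "classification": "Paintings", "primaryImage": "a.jpg"},
    {"objectID": 2, "classification": "Sculpture", "primaryImage": "b.jpg"},
    {"objectID": 3, "classification": "Prints", "primaryImage": ""},
]
two_d_ids = [r["objectID"] for r in select_2d_works(sample_records)]
```

Scrolls, pages and fabrics are named later as picture spaces of interest, so the allowed set would probably need to grow once we have looked at the classification labels actually present in the collection.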
Forms of deferred inference.
1. Proportions of light, transitional (shadow) and dark areas in images.
a. Are there proportions of light, transitional (shadow) and dark areas that induce informational deferral?
b. Do contrast ratios affect contour mapping?
c. Are there regional or temporal biases towards amounts of light, transitional and dark areas in images?
2. Internal topology of picture spaces such as scrolls, pages, fabrics and paintings.
a. Do different internal topologies of picture spaces induce informational deferral?
b. How do internal topologies of picture spaces affect contour mapping in the visual system?
c. Are there regional or temporal biases towards internal topologies of picture spaces?
3. Noise and non-representational marks.
a. Are there proportions of noise and non-representational marks that induce informational deferral?
b. What kinds of noise and non-representational marks, such as neural noise, stochastic noise and inverse stochastic noise, induce informational deferral?
c. Are there regional or temporal biases towards types of noise and non-representational marks?
4. Word-image association bias.
a. Are there combinations of labeling that induce bias?
b. What kinds of recombinations of labeling could induce informational deferral? I have selected five groupings of cultural aesthetic terms (pre-modern, classical, medieval, modern and postmodern) that we can use to re-label works.
c. Are there regional or temporal biases towards types of labeling?
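To make the first group of questions concrete, the proportions of light, transitional and dark areas could be measured roughly as follows. This is a sketch, not a settled method: it assumes each image has already been reduced to grayscale values from 0 to 255, and the band edges `dark_max` and `light_min`, like both function names, are hypothetical choices we would need to tune.

```python
def tonal_proportions(pixels, dark_max=85, light_min=170):
    """Return the fraction of pixels in the dark, transitional and light bands.

    pixels: iterable of grayscale values in 0..255.
    dark_max, light_min: assumed band edges, not empirically derived.
    """
    counts = {"dark": 0, "transitional": 0, "light": 0}
    total = 0
    for value in pixels:
        if value <= dark_max:
            counts["dark"] += 1
        elif value >= light_min:
            counts["light"] += 1
        else:
            counts["transitional"] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return {band: n / total for band, n in counts.items()}


def mean_proportions_by_group(works):
    """Average the three proportions per group label (a region or a period).

    works: iterable of (group_label, pixels) pairs.
    """
    sums, sizes = {}, {}
    for label, pixels in works:
        props = tonal_proportions(pixels)
        bucket = sums.setdefault(
            label, {"dark": 0.0, "transitional": 0.0, "light": 0.0}
        )
        for band, fraction in props.items():
            bucket[band] += fraction
        sizes[label] = sizes.get(label, 0) + 1
    return {
        label: {band: total / sizes[label] for band, total in bands.items()}
        for label, bands in sums.items()
    }
```

Averaging within each group before comparing groups keeps a heavily collected culture from dominating the result, which is the imbalance problem raised earlier.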
Add noise to images (which type of noise is interesting?), then use them as input data to a GAN
(or map the images onto different topologies).
Simultaneously train a neural network classifier on aesthetic classification standards (an “automatic gallerist”?).
The trained GAN generates new works, which are used as input to the classifier.
Potentially add noise *after* generation.
Apply noise along the dimensions of the topological space of the image.
Reapply noise (sfumato) to upgrade the image afterward.
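The noise step in these notes could be prototyped before any GAN training, simply to compare how different kinds of noise change an image. The sketch below is illustrative only: Gaussian and salt-and-pepper noise are stand-ins I have chosen because they are easy to implement, not the neural or inverse stochastic noise named in the questions above, which still need a working definition. The function names and parameters are hypothetical, and images are assumed to be flat lists of grayscale values from 0 to 255.

```python
import random


def _clamp(value):
    """Round to the nearest integer and keep it inside 0..255."""
    return max(0, min(255, int(round(value))))


def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [_clamp(p + rng.gauss(0.0, sigma)) for p in pixels]


def add_salt_and_pepper(pixels, amount, rng):
    """Replace a fraction of pixels (about `amount`) with pure black or white."""
    noisy = []
    for p in pixels:
        if rng.random() < amount:
            noisy.append(255 if rng.random() < 0.5 else 0)
        else:
            noisy.append(p)
    return noisy


# A seeded generator makes each noisy variant reproducible.
rng = random.Random(0)
image = [40, 90, 140, 190, 240]
soft = add_gaussian_noise(image, sigma=12.0, rng=rng)
speckled = add_salt_and_pepper(image, amount=0.2, rng=rng)
```

The same functions could be applied either to the training images before they reach the GAN or to the generated works afterward, so both orderings in the notes can be compared.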