Sound-Induced Mental Imagery for the Blind



Visual imagery may be loosely defined as visual mental processing that resembles the perceptual mental processing normally induced by the eyes, but now occurring in the absence of direct sensory stimuli from the eyes. This visual mental processing may be characterized by certain (conscious, subjective) experiences, but also by vision-related cognitive processes and representations in the brain that are not, or not directly, accessible to consciousness. Visual imagery encompasses (the visual aspects of) dreaming, visual memory, (cognitive) imagination, and any visual mental (re)constructions involved in the latest information-rich synthetic vision technology for the blind.

In the literature, the term mental imagery is often used almost as an alias for visual imagery (visualization, or "seeing in the mind's eye") due to the attention given to vision, although strictly speaking one should reserve the mental imagery term for more general use, including, e.g., auditory imagination (imagining sounds). On the other hand, with synthetic vision for the blind, the internal mental representations may be, but need not be, truly visual in terms of quasi-pictorial or "screen-like" internal representations, so to leave this issue open it may be better to speak of mental imagery rather than just visual imagery here. Therefore, we will mainly use the mental imagery term from now on, although we use and apply it only in the context of vision and vision substitution.

Blind user with The vOICe auditory display

The vOICe for synthetic mental imagery: "seeing with your ears". Adam Shaible, 37, gets visions through glasses with a camera and headphones hooked to a laptop computer. The image on the monitor is that of photographer Susan Stocker as she takes his photo.

Photography: Susan Stocker for Sun-Sentinel ("Device lets man, blind since birth, hear the sights of sunset", April 26, 2004)

Note that with sensory substitution for the blind, the mental processing and experiences will in fact be much less subjective than with most other forms of mental imagery, because they are again bound rather directly and tightly to sensory input: not from the eyes as with natural vision, but from a substituting auditory display or tactile (haptic) display. The imagery content is set mainly by the alternative sensory input and less directly by conscious control, although one can of course focus attention on parts of the perception or of any corresponding short-term sensory memory (sensory buffer), as well as control where the alternative input device is pointed.

The topic and concept of mental imagery has been much discussed and debated in the literature, but so far mainly in a philosophical and psychological setting. As a result, although some publications do address the meaning of mental imagery to, for instance, the congenitally blind (e.g., John Kennedy), surprisingly little has been published about the mental imagery questions now posed by new technology options for substituting one sensory stream with another. This is most unfortunate, because state-of-the-art synthetic vision technology, based on camera or sonar sensor input and auditory or tactile display output, poses very concrete and socially relevant questions about (the limitations of) mental imagery and its induction through an alternative sense (synthetic imagery). These questions should lead to significant follow-up in psychological theory, prediction and experiment. See also the SIGGRAPH 98, VSPA 2001, NIC2001 and Tucson 2002 presentations on this topic. Moreover, technologies for sensory substitution allow for testing philosophical ideas and hypotheses about the essence of vision, for instance to help establish to what extent sensorimotor contingencies (sensory invariants) can suffice for evoking visual sensations and visual experiences. Cf. Alva Noë, ``Is the visual world a grand illusion?'' and related work by Kevin O'Regan, and the claim of sensorimotor dependency theory as formulated by Andy Clark in ``Cognitive complexity and the sensorimotor frontier,'' Aristotelian Society Supplementary Vol. 80, No. 1, pp. 43–65, 2006: "to whatever extent it is possible to recreate the same body of sensorimotor dependencies using an alternative route, you will recreate the full content and character of the original perceptual experience".

Historically, part of the debate on mental imagery has circled around the behaviorist criticism that the existence of subjective experiences cannot be proven (e.g., Watson). With modern brain-scanning techniques like PET and fMRI, which show increasingly reliable correlations between the subjective and the objective, this fundamental scientific criticism starts to lose ground. Another source of much debate has been the concept of quasi-pictorial or "screen-like" visual imagery as a (conscious) experience (Stephen Kosslyn), "versus" or supplemented by the underlying (partly non-conscious) visual processing and representation needed to do the coding or decoding (Zenon Pylyshyn). It seems likely that neuroscience will help disentangle some of these issues in the forthcoming years, and help clarify which ideas and categories turn out to be mainly conceptual artefacts of our culture and philosophy, and which ideas can be demonstrated to have a locatable neural substrate. For instance, the relations and distinctions between short-term visual memory (iconic memory) and spatial processing with its working memory (as used in navigation) are currently subject to intensive neuroscientific research.

Live Sound-Induced Visual Imagery
Blind user wearing The vOICe
This blind user wears The vOICe daily, here with a hidden "spy camera" inside the hat: a "hatcam". The notebook PC is inside the backpack.
Photography: courtesy Barbara Schweizer
The vOICe vision substitution and synthetic vision technology, as presented on this website, is a new mental imagery approach based on live sound-induced and sound-controlled visualisation or "perceptualization". Real-time hardware was developed first, later followed by a multimodal online Java applet demonstrator as well as fully integrated video sonification software for blind users in the form of The vOICe for Windows. Having been developed as an experimental system for auditory image representations, it is intended to find applications as a synthetic vision device for the blind. As such, it is sufficiently general to support situated vision, suited for the dual purposes of object recognition and the control of action (Pylyshyn). Presently, The vOICe offers a first-order approach consistent with basic psychophysical knowledge about the human hearing system, while containing a sufficiently large set of parameters to allow for much of the subsequent second-order fine-tuning for improved perceptual performance as more knowledge about human auditory processing and perception becomes available. See also the auditory model page. The big question in the context of mental imagery is to what extent The vOICe approach can indeed be used to replace regular vision via sound-induced mental imagery, or "scaffolded" mental imagery, making for a kind of artificial synesthesia (which can be viewed as a special form of acquired synesthesia).
One could also ask whether using The vOICe could help in preventing, guiding or treating Charles Bonnet syndrome (CBS), in which visual hallucinations occur in people who go blind, most likely due to visual sensory deprivation. One conjecture could be that satisfying the brain's craving for meaningful visual information would help "bind" mental imagery (e.g., via the visual association cortex) to meaningful visual input via The vOICe rather than to any internal "noise" that evokes the visual release hallucinations.

The vOICe makes for a socially relevant basic research vehicle for studying the crossmodal binding problem, where one hopes to bind live visual input from a camera to corresponding visual qualia (visual sensations) with a minimum of training time and effort. As far as neuroscience research is concerned, one should remain very cautious about claiming "vision" even if it turns out that brain plasticity and extensive immersive usage of The vOICe lead to increased or modified activity in, say, the occipital lobe of blind people in relation to soundscapes. Vice versa, one should remain careful about excluding even the remote possibility of "vision" in those afflicted by cortical blindness when visual input might be processed elsewhere in the brain.

Note that the occipital lobe is often simply called the "visual" cortex because of its role in sighted people, but we know that this part of the brain also gets activated by tactile and auditory stimuli, particularly in congenitally blind people, so the more neutral term occipital lobe is more appropriate here.

At a fundamental or even philosophical level, we can never really distinguish whether any increased activity in the occipital lobe (or any other brain areas normally associated with vision) in relation to soundscapes will be "truly visual" processing or "merely" much-extended and advanced auditory processing - however effective the training results may be for functional performance in what we would normally consider "visual" tasks. For functional performance questions one may indeed obtain objective scientific results. However, for many people an important question is also whether or not any alternative cross-modal mappings for blind people will eventually subjectively "feel" like vision (qualia), along with the vivid and stable sensation of light perception, and with the subjective realism of eidetic imagery. The qualia question with respect to perception of light may be particularly hard to answer objectively/scientifically, other than that one might try to find answers in a statistical "polling" sense, by gathering the (subjective) verdicts from a group of trained late-blinded people or blindfolded sighted volunteers, asking them how vision-like the experience has become after extensive usage, i.e., after exploiting whatever brain plasticity there is. The opinions and verdicts may well differ individually, so that one gets at best a majority vote on whether or not soundscapes do indeed "feel" like vision after adaptation to the cross-modal mapping. Even if most people in this group would judge that hearing soundscapes of camera views eventually begins to "feel" like vision, this need not extend to congenitally blind people who have had a different brain development and lack even the subjective reference point to judge whether the end result "feels" like (memories of former) vision or not. There are many subtle issues to be resolved in answering what really makes up vision. 
For instance, there are sighted people with a condition called prosopagnosia who have normal vision except for an inability to recognize faces. It seems reasonable to expect that visual prostheses too will lead to certain "biases" in vision, where some aspects of vision appear normal and other aspects appear deviant. For practical applications we need not be too concerned with the philosophical questions of separating vision from hearing (if that is meaningful at all), because we can then focus mostly on functional performance (e.g., in orientation, navigation and mobility, in "reading" graphs, etc.), as well as on the subjective verdict of blind users about their visual experiences, irrespective of whether the underlying neural processing is considered to be auditory or visual in nature. For some preliminary accounts from blind users see the page on what blind users say about The vOICe.

Inverse retinotopy: reading the brain visual system as an inverse problem

Selected related literature:

Bértolo, H., ``Visual imagery without visual perception?'' Psicológica, Vol. 26, pp. 173-188, 2005. Available online (PDF file).

Cui, X., Jeter, C.B., Yang, D., Montague, P.R. and Eagleman, D.M., ``Vividness of mental imagery: individual variability can be measured objectively,'' Vision Research, Vol. 47, No. 4, pp. 474-478, February 2007. Available online (PDF file).

Merabet, L.B., Kobayashi, M., Barton, J. and Pascual-Leone, A., ``Suppression of Complex Visual Hallucinatory Experiences by Occipital Transcranial Magnetic Stimulation: A Case Report,'' Neurocase, Vol. 9, pp. 436-440, October 2003.

Merabet, L.B., Maguire, D., Warde, A., Alterescu, K., Stickgold, R. and Pascual-Leone, A., ``Visual hallucinations during prolonged blindfolding in sighted subjects,'' Journal of Neuro-Ophthalmology, Vol. 24, pp. 109-113, June 2004. Available online (PDF file).

Noë, A., ``Is the visual world a grand illusion?'' Journal of Consciousness Studies, Vol. 9, No. 5/6, pp. 1-12, 2002. Available online (PDF file).

O'Callaghan, C., ``Seeing What You Hear: Cross-Modal Illusions and Perception,'' Philosophical Issues, 2008.

Pearson, J., Clifford, C.W.G. and Tong, F., ``The Functional Impact of Mental Imagery on Conscious Perception,'' Current Biology, Vol. 18, pp. 982-986, July 2008. Available online (PDF file).

Slotnick, S.D., ``Imagery: Mental Pictures Disrupt Perceptual Rivalry,'' Current Biology, Vol. 18, pp. R603-R605, July 2008. Available online (PDF file).

Sparing, R., Mottaghy, F. M., Ganis, G., Thompson, W. L., Töpper, R., Kosslyn, S. M. and Pascual-Leone, A., ``Visual cortex excitability increases during visual mental imagery - a TMS study in healthy human subjects,'' Brain Research, Vol. 938, pp. 92-97, 2002.

Thirion, B., Duchesnay, E., Hubbard, E., Dubois, J., Poline, J.B., Lebihan, D. and Dehaene, S., ``Inverse retinotopy: Inferring the visual content of images from brain activation patterns,'' NeuroImage, Vol. 33, No. 4, pp. 1104-1116, 2006. Abstract available online.

Literature and conference presentations on The vOICe approach:

Meijer, P.B.L., ``An Experimental System for Auditory Image Representations,'' IEEE Transactions on Biomedical Engineering, Vol. 39, No. 2, pp. 112-121, Feb 1992. Reprinted in the 1993 IMIA Yearbook of Medical Informatics, pp. 291-300.

Meijer, P. B. L., ``Cross-Modal Sensory Streams,'' Conference Abstracts and Applications, ACM SIGGRAPH 98, p. 184, as part of an invited panel presentation and demonstration of The vOICe for Windows at SIGGRAPH '98, Orlando, Florida, 1998.

Meijer, P. B. L., ``Seeing with Sound for the Blind: Is it Vision?,'' invited lecture at the VSPA conference on Consciousness at the University of Amsterdam, Amsterdam, The Netherlands, June 1, 2001.

Meijer, P. B. L., ``Seeing with Sound: Wearable Computing for the Blind,'' invited presentation at NIC2001 (Nordic Interactive Conference), Copenhagen, Denmark, Thursday November 1, 2001.

Meijer, P. B. L., ``Seeing with Sound for the Blind: Is it Vision?,'' invited presentation at the Tucson 2002 conference on Consciousness in Tucson, Arizona, USA, April 8, 2002.

Fletcher, P. D., ``Seeing with Sound: A Journey into Sight,'' invited presentation at the Tucson 2002 conference on Consciousness in Tucson, Arizona, USA, April 8, 2002.

Stoerig, P., Ludowig, E., Meijer, P. B. L. and Pascual-Leone, A., ``Seeing through the ears?,'' poster presentation at the 4th Forum of European Neuroscience (FENS Forum 2004) in Lisbon, Portugal, July 14, 2004.

Stoerig, P., Ludowig, E., Mierdorf, T., Oros-Peusquens, A., Shah, J. N., Meijer, P. B. and Pascual-Leone, A., ``Seeing through the ears? Identification of images converted to sounds improves with practice,'' poster presentation at the 34th Annual Meeting of the Society for Neuroscience (SfN 2004) in San Diego, USA, Sunday October 24, 2004.

Amedi, A., Bermpohl, F., Camprodon, J., Fox, S., Merabet, L., Meijer, P. and Pascual-Leone, A., ``Neural correlates of visual-to-auditory sensory substitution in proficient blind users,'' poster presentation at the 12th Annual Meeting of the Cognitive Neuroscience Society (CNS 2005) in New York, USA, April 11, 2005, and at the 57th Annual Meeting of the American Academy of Neurology (AAN 2005), Miami Beach, Florida, USA, April 10 and 12, 2005.

For other useful literature, see recent publications on mental imagery.

Note: Psychologists might wish to consider the proposed immersive usage of The vOICe during training in the context of James Gibson's ecological theory of perception (Gibson, J.J., 1979, "The ecological approach to visual perception," Boston: Houghton Mifflin).

A good entry point to mental imagery and its history in science, including many references, is the website of Nigel Thomas of California State University on Mental Imagery, Consciousness, and Cognition. Other good resources for related (philosophical, psychological) topics are the websites of David Chalmers of the Australian National University, Alva Noë of the University of California, Santa Cruz, Zenon Pylyshyn of Rutgers University, Stephen Kosslyn of Harvard University, and Daniel Dennett of Tufts University. For more information on The vOICe technology, visit The vOICe Home Page.

The Molyneux problem revisited using The vOICe?
Molyneux problem: sphere

The seventeenth-century philosopher William Molyneux, whose wife was blind, asked his friend John Locke whether a man born blind, in case he recovered his sight, would be able to tell a cube from a sphere by sight alone, with previous sensory experience with cube and sphere limited to touch. Molyneux thought the answer was ``No'', and Locke agreed (in ``An Essay Concerning Human Understanding'').

With visual objects observed via soundscapes from The vOICe, the situation is different, and the answer would be ``Yes'': it is certainly possible to hear the roundedness of a sphere as compared to the more sudden sound transitions observed with an upright cube. What does that mean? Could it mean that seeing-with-sound is, for early-blind people, actually closer to vision than having biological eyesight surgically restored? It is certainly closer to vision than the phenomenon called "blindsight", in that the visual content as encoded in sound can be consciously accessed. See also the sound samples in the PowerPoint file "molyneux.ppt" (750 K).
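
The contrast between a gradual and a sudden sound transition can be illustrated with a small sketch. It is not The vOICe software itself: it assumes a column-by-column, left-to-right scan in which the vertical position of a bright pixel sets the pitch, so that the number of bright pixels in a column stands in for how many tones sound at that moment. The grid size, the shapes and all function names (`make_square`, `make_disc`, `column_profile`, `largest_jump`) are hypothetical choices made only for this illustration.

```python
# Illustrative sketch only; assumes a left-to-right column scan in which
# each bright pixel in a column contributes one tone (row -> pitch).
# All names and sizes here are hypothetical, not taken from The vOICe.

def make_square(n=16, side=8):
    """Binary n x n image of a centered, upright square (flat cube face)."""
    lo = (n - side) // 2
    hi = lo + side
    return [[1 if lo <= r < hi and lo <= c < hi else 0 for c in range(n)]
            for r in range(n)]

def make_disc(n=16, radius=4.0):
    """Binary n x n image of a centered disc (silhouette of a sphere)."""
    center = (n - 1) / 2
    return [[1 if (r - center) ** 2 + (c - center) ** 2 <= radius ** 2 else 0
             for c in range(n)]
            for r in range(n)]

def column_profile(image):
    """Number of bright pixels per column, in scan order (left to right)."""
    n_rows, n_cols = len(image), len(image[0])
    return [sum(image[r][c] for r in range(n_rows)) for c in range(n_cols)]

def largest_jump(profile):
    """Largest change in tone count between neighboring scan positions,
    counting the silent margins before and after the scan."""
    padded = [0] + profile + [0]
    return max(abs(b - a) for a, b in zip(padded, padded[1:]))

if __name__ == "__main__":
    for name, image in (("square", make_square()), ("disc", make_disc())):
        profile = column_profile(image)
        print(name, profile, "largest jump:", largest_jump(profile))
```

Under these assumptions the square switches all of its tones on and off at once, while the disc builds up and tapers off over several columns, which is one way to read the "roundedness" that a listener might pick up.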

Molyneux problem: cube

In addition, VRML/X3D versions of 3D mental rotation test objects are available on the Shepard-Metzler page.


The combination of smart camera and other sensor technologies with general synthetic vision mappings and automatic scene analysis should eventually offer blind people improved mobility, independence, situation awareness and safety.

Copyright © 1996 - 2024 Peter B.L. Meijer