Auditory Display for Synthetic Vision



An auditory display is a device or computer program that represents information through hearing. Such displays have a wide variety of applications, ranging from audible warning signals for chemical or nuclear plant operators or fighter pilots to auditory synthetic vision for the blind. The latter application is the focus of this website, and you are invited to explore here what progress has been made in auditory interfaces in recent years.

The vOICe Auditory Display
Translating arbitrary images and pictures into sounds is a new rehabilitation technology whose prospects are still unknown: so far no one has either proved or disproved its practical use as a kind of artificial eye providing "earsight." Technically, the auditory display approach has been proven feasible, but the limits of human perception in auditory profile analysis and in the comprehension of alternative sensory mappings are still largely unknown.

Compared with alternatives such as electrocutaneous or vibrotactile stimulation in haptic displays, a major advantage of auditory displays is that implementation needs only mature technology: for example, a vidicon or CCD/CMOS camera for input, headphones for output, and regular VLSI chip technology in between. Since these components are already mass-produced for other (multimedia and consumer) applications, cost can be low and reliability high. The same advantages apply in comparison with cortical stimulation through electrodes in an implant placed at the visual cortex, with the added benefit that auditory displays are non-invasive and avoid the risks of experimental brain surgery. Again, with respect to human perception, none of these alternative technologies has been fully explored, and it is important neither to raise unwarranted expectations nor to be so sceptical as to kill off new options beforehand.

"The vOICe" is about a new auditory visualisation approach, for which real-time hardware was developed as well as a multimodal Java applet demonstrator. The vOICe for Windows fully integrates camera input, video sonification and headphone output. Versions for mobile camera phones are also available, in the form of The vOICe for Android and The vOICe Web App. Developed as an experimental system for auditory image representations, The vOICe is meant to find application as an auditory synthetic vision device for the blind.

Explanation (translated from French)
Research on "The vOICe" concerns an experimental ocular prosthesis (artificial eye) based on sensory substitution of vision by hearing for blind people. A blind or visually impaired person can thus obtain visual perceptions through a time-based auditory representation. The vOICe electronic prototype replaces vision with a sound modality at a resolution of 64 × 64 and 16 shades of grey. The human perceptual and cognitive capacity to learn to use this acoustic representation is at present largely unknown.

Explanation (translated from German)
Research on "The vOICe" concerns an experimental visual prosthesis, a kind of electronic artificial eye for the blind. For this purpose, sensory substitution (replacement) of vision by hearing is applied. Through acoustic perception, that is, through a non-visual modality, blind and visually impaired people can also obtain certain visual sensations, possibly as an orientation and mobility aid. The vOICe prototype achieves a resolution of 64 × 64 pixels (an image matrix of more than 4000 picture elements) and 16 grey tones. The human perceptual and cognitive abilities needed to learn to use this auditory rendering effectively are at present still largely unknown.
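The 64 × 64 resolution and 16 grey levels quoted above can be made concrete with a small sketch of an image-to-sound mapping. It is only an illustration, assuming a left-to-right column scan in which a pixel's height sets the pitch and its brightness sets the loudness. The function name `sonify`, the frequency range, the sample rate and the one-second scan time are hypothetical choices for this example, not parameters taken from The vOICe.

```python
import math

def sonify(image, duration=1.0, sample_rate=8000, f_low=500.0, f_high=3500.0):
    """Convert a grayscale image into audio samples (illustrative sketch).

    image: list of rows (top row first), each a list of grey levels 0..15.
    Columns are played one after another from left to right; each row
    drives one sinusoid, with higher rows mapped to higher frequencies.
    All default parameter values are assumptions made for this example.
    """
    rows = len(image)
    cols = len(image[0])
    steps = max(rows - 1, 1)  # avoid division by zero for a one-row image
    # Exponentially spaced frequencies: bottom row = f_low, top row = f_high.
    freqs = [f_low * (f_high / f_low) ** ((rows - 1 - r) / steps)
             for r in range(rows)]
    samples_per_col = int(duration * sample_rate / cols)
    samples = []
    for c in range(cols):
        for n in range(samples_per_col):
            t = (c * samples_per_col + n) / sample_rate
            s = 0.0
            for r in range(rows):
                level = image[r][c]
                if level:  # dark pixels (level 0) contribute silence
                    s += (level / 15.0) * math.sin(2 * math.pi * freqs[r] * t)
            samples.append(s / rows)  # normalise so the sum stays within [-1, 1]
    return samples
```

With a 64 × 64 input, each of the 64 columns would occupy 1/64 of the scan, and a bright pixel near the top of a column would be heard as a loud, high-pitched tone at the corresponding moment.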

In essence, The vOICe aims to exploit the analysis that the human auditory system performs naturally, which involves both coarse-grain and fine-grain spectro-temporal analysis, as in the perception and recognition of music and speech. Superficially, the early stages of auditory processing approximate a windowed Fourier analysis, but physiologically, human auditory scene analysis (ASA) rests on a very complicated and only partially understood mixture of time-domain and frequency-domain processing, some of it nonlinear. Time-domain processing includes periodicity pitch perception, in which neural activity patterns are correlated with the sound waveforms. Frequency-domain processing takes place mainly through basilar membrane resonance in the cochlea, which acts as a kind of transmission-line filter bank that maps frequency to position along the membrane. This tonotopic map is more or less replicated in various neural structures up to the primary auditory cortex.
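As a rough illustration of what "windowed Fourier analysis" means here, the following sketch computes a short-time magnitude spectrum with a Hann window, using only the Python standard library. It is not a model of the ear, which the text stresses is far more complicated. The function names, the window size and the hop size are assumptions chosen for this example.

```python
import cmath
import math

def hann(n_points):
    """Symmetric Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a direct DFT."""
    n_points = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_points)
                    for n, x in enumerate(frame)))
            for k in range(n_points // 2 + 1)]

def stft(signal, window_size=64, hop=32):
    """Slide a Hann window along the signal and analyse each frame.

    Returns one list of bin magnitudes per frame, so time runs along
    the outer list and frequency along the inner one.
    """
    window = hann(window_size)
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(window_size)]
        frames.append(dft_magnitudes(frame))
    return frames
```

Each returned frame is a coarse snapshot of which frequencies are present during a short time interval. A longer window gives finer frequency resolution at the cost of coarser time resolution, and the reverse.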

At present, The vOICe offers a first-order virtual audio approach that is consistent with basic knowledge about the limitations and psychophysical scaling functions of the human hearing system. It also contains a sufficiently large set of parameters to allow much of the second-order fine-tuning for improved perceptual performance that will follow as more knowledge about human auditory processing and perception becomes available. See also the auditory model page.

Literature on The vOICe auditory scanning approach: Meijer, P.B.L., "An Experimental System for Auditory Image Representations," IEEE Transactions on Biomedical Engineering, Vol. 39, No. 2, pp. 112-121, Feb. 1992. Reprinted in the 1993 IMIA Yearbook of Medical Informatics, pp. 291-300. An abstract and an electronic version of the full paper are available on-line.


For much more background information on sonification (auralization) by The vOICe, visit The vOICe Home Page. General information about developments in auditory displays and augmented reality mappings can be found at the ICAD site (International Community on Auditory Display). Gregory Kramer has been one of the key people in founding ICAD and in establishing sonification and auditory display technology as a new scientific discipline. See also G. Kramer (ed.), "Auditory Display: Sonification, Audification and Auditory Interfaces," Addison-Wesley, 1994 (ISBN 0-201-62603-9 for hardbound or ISBN 0-201-62604-7 for paperback).

Copyright © 1996 - 2024 Peter B.L. Meijer