Related Seeing-with-Sound Projects
- The vOICe Home Page
- The vOICe Web App
- The vOICe for Android
- The vOICe for Windows
Other projects based on or related to The vOICe approach:
- In 2016, Derp Magurp ("magurp244") released
BrushTone, an accessible Paint Tool for the visually impaired that allows users to view, modify, and create images purely using sound. It includes a window scan function based on The vOICe.
- In 2015, Mike Mcwilliams ("aftersight") and Mikael Holmgren ("mrindoj") created the
After-Sight-Model-1 GitHub repository, which includes raspivoice, a version of The vOICe for the Raspberry Pi, and teradeep, a deep learning neural network for visual object recognition.
- In 2015, Derp Magurp ("magurp244") ported The vOICe seeing-with-sound sample code to
Python (I2S*.zip), using Pyglet, PyAL, OpenAL and NumPy.
- In 2015, "Ar-es" (or Ares) created
raspivoice, a version of The vOICe for the Raspberry Pi. More about this in the Raspberry Pi forum thread
Sight for the Blind for <100$. Mike Mcwilliams developed a Raspberry Pi device for The vOICe.
- In 2014, Quickode Ltd. released for the Amir Amedi lab the iOS program
"EyeMusic: Hearing colored shapes"
for iPhone, iPod Touch and iPad.
- In 2013, "berak" created a version of
The vOICe for Linux
(formerly at github.com/berak/seeingwithsound) based on
OpenCV
and
RtAudio.
- In 2013, Gao Yaoyao (name later changed to Zhi Zheng) released the iOS program
"Voice vision"
for iPhone, iPod Touch and iPad.
- In 2012, Technion Ph.D. student Uri Dubin in Israel created the Matlab program
SoundsOfImages,
described as "Can you hear the Image? Tool that allows you to transform image into audible sounds".
- In 2011, Ph.D. student Nicolas Louveton in France created
Wavy,
a flexible visual-to-auditory sensory substitution system written in Python.
- In 2010, blind programmer Michael Curran created
audioScreen, an experimental program
for blind users of Windows 7 touch screens, based on The vOICe mapping.
- In 2010, Tom Wright (Thomas David Wright), student at the University of Sussex, UK, created a
Pure Data
(Pd patch) implementation of image to sound conversion,
voice.pd.
Also in 2010, he created a "Customisable image sonifier" written in C#,
named SSD1.
Yet another project is his
Polyglot Framework for Sensory Substitution Devices.
- In 2009,
Katarzyna Zarnowiec,
student of Automation and Robotics at the AGH University of Science and Technology in Krakow, Poland, and Frederico Contente,
student at Tampere University of Technology, Finland, implemented a Matlab program
"Sound Steganography",
using a mapping related to that used by The vOICe.
- In 2008, Stefan Strahl of the UCL Ear Institute created a Google Android implementation of image to sound conversion named "Seer",
available under the Google code open source project
sensub (sensub.googlecode.com).
Submitted as part of the Android Developer Challenge.
- In 2008, Evan Salazar from NMSU created a Perl implementation of image to sound conversion, named imageEncode:
Encoding an image to sound.
- In 2007, Nelson Castillo
from Bogotá, Colombia, created a Tetris-like audio game based on free source code of The vOICe, programmed using Python, the Pygame API and SWIG.
- In 2006, Chris Merck (navaburo) started the open source project
OpenSonify
for combining webcam, PC and headphones.
- In 2006, Luke Barrington,
Ph.D. student in Electrical and Computer Engineering at UCSD, implemented a Matlab version of The vOICe,
"vOICe.m".
- In 2006,
Hans Petter Selasky
developed an image-to-sound mapping similar to The vOICe but based on noise filtering
(subtractive synthesis instead of additive synthesis).
- In 2005, Matt Zukowski of the University of Toronto developed, for his undergraduate thesis project, the
"Vaudiolizer"
sensory replacement system, essentially a Java version of The vOICe.
- In 2005, Clayton Shepard, Richard Hall and Jared Flatow of Rice University developed
Seeing Using Sound,
aiming to simplify images in such a way that only the most important information is conveyed
in the sounds (Elec 301 Project - Fall 2005: Seeing using Sound transforms images into sound to aid blind people).
Scanning is done from the outside-in in time, and different tone scales are used to represent color.
- In 2005, Frederic Paquay from Belgium reported on the possibilities of using camera-based
"depth from defocus" (DFD) for distance estimation, mapping results to sound. His November 2005 document titled
"Technology to help blind people: Image restitution by sonorous signals" (PDF file)
is provided here with his permission.
- In 2005,
Malika Auvray
of COSTECH at the University of Technology of Compiègne and
Kevin O'Regan
of the Laboratory for Experimental Psychology of the University of Paris 5 disclosed some of their work on the "Vibe"
as a system for instantaneous visual feedback, developed by Sylvain Hanneton of the University of Paris 5.
It maps brightness to loudness, height to pitch, and lateral position to stereo sound, somewhat like The vOICe
running in its optional "All at once" mode (menu Options | Video Rate). More information is available on
the SourceForge
thevibe
project page, and the web pages of
Barthélémy Durette
on "Sensory Substitution System for Visual Handicap". See also the ECCV 2008 workshop paper by
B. Durette, N. Louveton, D. Alleysson and J. Hérault,
"Visuo-auditory sensory substitution for mobility assistance: testing TheVIBE".
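The instantaneous mapping described above can be sketched in a few lines of Python; the frequency range, frame duration, sample rate and linear panning law below are illustrative assumptions, not the Vibe's actual parameters:

```python
import numpy as np

def vibe_frame(image, duration=0.5, sr=22050, f_lo=300.0, f_hi=3000.0):
    """Sketch of an "all at once" visual-to-auditory mapping: every pixel
    sounds simultaneously, with brightness mapped to loudness, row height
    to pitch (top row = highest frequency), and horizontal position to
    left/right stereo balance. All parameter values are placeholders."""
    rows, cols = image.shape
    # Exponentially spaced frequencies, highest for the top row.
    freqs = f_hi * (f_lo / f_hi) ** (np.arange(rows) / (rows - 1))
    pan = np.arange(cols) / (cols - 1)  # 0.0 = full left, 1.0 = full right
    t = np.arange(int(duration * sr)) / sr
    left = np.zeros_like(t)
    right = np.zeros_like(t)
    for r in range(rows):
        tone = np.sin(2 * np.pi * freqs[r] * t)
        for c in range(cols):
            a = image[r, c]                    # brightness sets loudness
            left += a * (1.0 - pan[c]) * tone  # left pixels louder on the left
            right += a * pan[c] * tone
    return np.stack([left, right])  # 2 x N stereo frame, unnormalized

# A bright diagonal yields a chord whose tones pan from left to right
stereo = vibe_frame(np.eye(8))
```

Unlike The vOICe's time-multiplexed scan, every column sounds at once here, so lateral position must be carried by stereo panning rather than by time.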
- In 2005, Igor Bakarčić, Aleksandra Čereković, Egon Geci, Branka Lakić and Ivan Sobota
at the Faculty of Electrical Engineering and Computing at the University of Zagreb, Croatia,
implemented a version of The vOICe in Matlab as documented in
"Mapiranje slike sa zvukom - Seeing with your ears" (no longer online).
- In 2004, George Loo of LKS Labs in Singapore developed a low-cost hardware implementation for The vOICe, named
E-Eye (Ear-Eye).
- In 2004, Sok Hong Teow implemented a version of The vOICe in Matlab for his B.Sc. thesis at the department
of Electronics and Communications Engineering of Curtin University of Technology, Australia, and titled
"Soundscape: Acoustic Imaging of Sight".
- In 2004, a French company named
Primatop
(Fabrice Pajak) started marketing their MIDI-based VisioPlayer System for audio-visual sensory substitution for the blind.
Some of the hardware options for camera input and visual display seemed based on an Age Tech RF-ZLCD30E wireless camera with
LCD receiver ("VisioMonitor" option) and a ZTV 830G Mini Wireless PC Camera Kit, combined with Philips SBC-HC8441 FM wireless
stereo headphones.
- In 2004,
Wolfgang Fink,
then at NASA JPL and the Keck School of Medicine at USC, was working on a "Digital Object Recognition
Audio-assistant" (DORA) in cooperation with Mark Tarbell, James Weiland and Mark Humayun.
The system is described as "a camera-input/audio-output system that recognizes color, brightness,
and a number of everyday objects to be verbally announced to the visually impaired or blind patient
on demand". It is not known how this differs from what can be done via The vOICe's
talking color identifier and
Mobile OCR for the Blind interface. A poster was presented at ARVO 2004
(W. Fink, M. Tarbell, J. Weiland and M. Humayun,
"DORA:
Digital Object Recognition Audio-Assistant For The Visually Impaired").
- In 2003,
Nikolaos Bourbakis
of Wright State University (ITRI) and Sethuraman Panchanathan
of Arizona State University (CUbiC, ASU) started a project on a camera-based intelligent assistant for the blind called
"Tyflos".
The focus of the Tyflos/iCare project seems to be more on attempts to recognize and give verbal descriptions
of things, similar to what can be done via The vOICe's Mobile OCR for the Blind interface,
while the main focus of The vOICe is on providing the "raw" visual information via soundscapes.
- In October 2003,
Blue Edge Bulgaria (BEB, www.blue-edge.bg)
released a mobile camera phone implementation of The vOICe, The vOICe BEB,
running on the Nokia 3650.
- In June 1997, a Ph.D. project was started by Phil Picton of the
School of Technology & Design, University College Northampton, UK:
Michael Capp was to investigate auditory mappings based on the work of
Adrian O'Hea, who unfortunately died in 1994 shortly after obtaining
his Ph.D. in Electronics:
A. R. O'Hea, ``Optophone Design: Optical-to-Auditory Vision Substitution
for the Blind,'' Ph.D. thesis, The Open University, UK, 1994.
Adrian O'Hea appears to have independently discovered auditory mappings
that are similar to The vOICe mapping (which he called the Cartesian Piano
Transform). He proposed a number of variations, like the use of a polar
coordinate system to obtain a kind of artificial fovea. The focus of
Michael Capp's Ph.D. work later shifted towards the independent development
of stereo vision options for depth mapping,
in combination with "cartoon mapping" for the preservation of visual texture.
In 2000, Michael Capp received his Ph.D. on ``Alternative approaches
to optophonic mappings'' from Leicester University, UK, with Peter Meijer
acting as invited External Examiner.
- Since about 1995, José Luis González Mora and Luis Fernando Rodríguez
Ramos and colleagues have been working on an auditory display for the blind in the
Virtual Acoustic Space
(Espacio Acústico Virtual, or EAV) project
of the Institute of Astrophysics of the Canary Islands (IAC) and the
University of La Laguna, Tenerife, Spain.
- Since Spring 1999, Julian Rohrhuber and Oliver Wittchow of the University
of Hamburg, Germany, have been working on a Nintendo GameBoy version of The vOICe,
which they call the
"nanovoice".
Very interesting and original work!
- On September 16, 1998, the BBC science program Tomorrow's World featured a musical image
to sound mapping devised by
John Cronly-Dillon,
a neurobiologist at the Department of Optometry and Vision Sciences at the University of
Manchester (UMIST), UK. The broadcast showed examples in which the basic characteristics of
transforming shapes into music for hearing images appeared identical to those employed by The vOICe
(and even to those employed by the pianola and optophone): left-to-right scanning, with pitch depending
on elevation, and with a vertical line segment resulting in a chord with all tones sounding
simultaneously, as with the four vertical edges in the MIDI sample
for the box shown on the left. Cronly-Dillon's implementation was a computer program without
live visual input, but he mentioned plans for a future portable system with a camera. According
to an article in The Guardian (UK) of September 22, 2000, it would be called "SmartSight" and have
the looks of a Star Trek visor. ``Backing up claims that this is an innovation from the realms
of science fiction, the new gadget is almost identical to that worn by blind character Geordi
la Forge on Star Trek's "The Next Generation".'' [...] ``The scientists say an early prototype
will be ready for testing before the end of this year. A battery pack would be worn at the waist,
while the "eye" of the device would either be a hand-held video camera, or two cameras fitted
into a visor worn on the head.''
See also the archived website of the associated
SmartSight Limited startup venture with
John Cronly-Dillon, Krishna Persaud and David Stead. Smartsight Limited was registered at
Companies House on July 15, 1999, under company number 03810177.
![Music score for simple shapes](shape.gif)
"Music" score with MIDI sample for circle, triangle and square
A demonstration was given at the
RNIB Techshare 2001
Conference in Birmingham, UK.
- Starting around 1998, Patricia Arno, Christian Capelle, Charles Trullemans,
Anne De Volder, Claude Veraart and other researchers at the Catholic University of Leuven
in Belgium presented an experimental auditory display for the blind named the
PSVA (Prosthesis for Substitution of Vision by Audition, or in French,
Prothèse de Substitution de la Vision par l'Audition).
- In 1998,
Paul Querelle,
then a student at Anglia Polytechnic University in Cambridge England, started working with
Camsight, involving a local support group and a number of blind or partially sighted
local volunteers, to further research in this field. According to his information, areas of
research include: HCI aspects of vision to sound, extracting shape information from images,
extracting depth information from multiple images, locating a mouse pointer using
sound to facilitate menu navigation, use of the inverse Fourier transform with
windowing to construct complex soundscapes, determination of texture in an image
to facilitate OCR and the use of reverberation and other filters to enhance auditory
aesthetics.
- In June 1998, information became available about the work of Harry Reid.
He had invented a portable system which he calls the
Sonic Eye.
A document describing his system is available for download as a zipped MS Word file
harreed.zip, provided here with his permission. It includes
several illustrations clarifying the image to sound mapping concepts that turn out to
be basically the same for The vOICe and the Sonic Eye.
![Music scores for two scenes](harreedy.gif)
The Sonic Eye for walking and reading
``A voice print represents sounds as images. High pitch sounds appear as
marks near the top of the image; low pitch sounds appear as marks near
the bottom of the image. Time is a vertical line moving at a constant speed
from left to right. Thus a voice print can convert any sound into an image.
The sonic eye does the reverse to convert images into sound. The sonic
eye "sees" a thin vertical window wherever it is pointed and makes a sound
combining high pitch for objects near the top of the window and low pitch for
objects near the bottom as the thin window is swept across objects of interest.
The user listens to these sounds and learns about the visual environment.''
[Harry Reid, 1998]
- In Autumn 1997, ThalesScope Limited started beta testing a
device called the
ThalesScope,
which is supposed to convert sight into sound. Other than this, little factual
data is available about it at the time of this writing, and it is also unclear
how it would differ from existing The vOICe technology.
ThalesScope Limited is headed by Isaiah W. Cox, and owned in part by
Thales Resources, and in part by Tom Karnes. Borealis Technical Limited
does the build, development, and test work.
The latest information suggests that this project may have been
discontinued, or at least is suffering serious delays.
- Other related early image-to-sound devices are those that targeted reading
(like the original optophone), such as the Visotoner (using 9 tones) and the
Stereotoner (using 10 tones).
If you know of more projects related to The vOICe, please report.
For this page, "related" means using either the same general image-to-sound
mapping as employed by The vOICe or alternative image-to-sound mappings to be used in
information-rich auditory displays for artificial vision. Related projects in the sense
of targeting visual prostheses for the blind via other combinations of modalities
(e.g., sensory substitution via sonar or radar input, tactile output) are described on
the sensory substitution page.
Note: The vOICe approach was originally published in the
IEEE Transactions on Biomedical Engineering, Vol. 39, No. 2, pp. 112-121, Feb 1992:
P.B.L. Meijer, ``An Experimental System for Auditory Image
Representations.'' This paper was next selected for reprint in the 1993 IMIA
Yearbook of Medical Informatics, pp. 291-300. Awarded U.S. Patent 5097326, on an
"image-audio transformation system", filed July 27, 1990:
``In a device for converting visual images into representative sound
information especially for visually handicapped persons an image
processing unit is provided with a pipelined architecture with a high
level of parallelism. An image is scanned in sequential vertical
scanlines and the acoustical representatives of the scanlines are
produced in real time. Each scanline acoustical representation is
formed by sinusoidal contributions from each pixel in the scanline,
the frequency of the contribution being determined by the position of
the pixel in the scanline and the amplitude of the contribution being
determined by the brightness of the pixel.''
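The mapping described in the patent abstract can be sketched in Python as follows; the frequency range, sweep duration and sample rate chosen here are illustrative placeholders, not the actual parameters of The vOICe:

```python
import numpy as np

def voice_soundscape(image, duration=1.05, sr=22050, f_lo=500.0, f_hi=5000.0):
    """Sketch of the patented mapping: scan an image column by column,
    left to right; each vertical scanline becomes a sum of sinusoids
    whose frequency is set by pixel row (top = high pitch) and whose
    amplitude is set by pixel brightness. Parameters are placeholders."""
    rows, cols = image.shape
    # Exponentially spaced frequencies, highest frequency for the top row.
    freqs = f_hi * (f_lo / f_hi) ** (np.arange(rows) / (rows - 1))
    samples_per_col = int(duration * sr / cols)
    t0 = 0.0
    out = []
    for c in range(cols):
        # Continue global time across columns to keep phases continuous.
        t = t0 + np.arange(samples_per_col) / sr
        # Sinusoidal contribution from every pixel in this scanline.
        col = sum(image[r, c] * np.sin(2 * np.pi * freqs[r] * t)
                  for r in range(rows))
        out.append(col)
        t0 += samples_per_col / sr
    return np.concatenate(out)  # mono waveform, not yet normalized

# A bright vertical bar yields a chord of all 16 tones during its time slot,
# and silence elsewhere in the sweep.
img = np.zeros((16, 8))
img[:, 3] = 1.0
wave = voice_soundscape(img)
```

This illustrates the behavior noted above for vertical edges: all tones of a vertical line segment sound simultaneously as a chord when the scan reaches that column.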
Copyright © 1996 - 2024 Peter B.L. Meijer