Project looks at human eye to sharpen sight of robots and drones

Robots, surveillance cameras and drones could one day detect changes to their environment far more quickly and efficiently, using a vision system based on the way the human eye and brain process information.

The EPSRC-funded Internet of Silicon Retinas (IOSIRE) project, led by researchers from King's College London and also involving University College London and Kingston University, is aiming to develop advanced machine-to-machine communication systems that capture and transmit images from highly efficient vision sensors mimicking the human retina.

Conventional cameras generate an entirely new image for each frame, even though much of the picture is unchanged from the previous frame. This wastes a considerable amount of memory, computing power and time, according to the UCL principal investigator Yiannis Andreopoulos.

“If you are processing an image to analyse what is happening in a scene, you often end up throwing away most of the background information, because you are only interested in particular shapes or objects,” he said.

In contrast, recently developed dynamic vision sensors (DVS) mimic the way the retina works, updating the image only at those points where a movement or change in the scene has occurred. When an object moves within a scene, the change in the light it reflects is detected instantly by the sensor, said Andreopoulos.
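The change-only principle can be illustrated with a short Python sketch. It is a simplification rather than a description of how any particular DVS chip works: it compares two whole frames, whereas a real sensor's pixels respond independently and continuously. The function name `dvs_events`, the use of log intensity and the threshold value are all assumptions made for illustration.

```python
import math

def dvs_events(prev_frame, next_frame, threshold=0.2):
    """Return (row, col, polarity) for each pixel whose log intensity
    changed by more than `threshold` (a hypothetical sensitivity value)."""
    events = []
    for r, (prev_row, next_row) in enumerate(zip(prev_frame, next_frame)):
        for c, (p, n) in enumerate(zip(prev_row, next_row)):
            # Adding 1.0 avoids log(0) for completely dark pixels.
            delta = math.log(n + 1.0) - math.log(p + 1.0)
            if abs(delta) > threshold:
                # Polarity: +1 for brighter, -1 for darker.
                events.append((r, c, 1 if delta > 0 else -1))
    return events

# A static 4x4 background in which two pixels change between frames.
background = [[50] * 4 for _ in range(4)]
moved = [row[:] for row in background]
moved[1][2] = 200  # this pixel gets brighter
moved[2][0] = 10   # this pixel gets darker

events = dvs_events(background, moved)
print(events)
print(f"{len(events)} of 16 pixels reported a change")
```

In this toy scene only two of the 16 pixels produce any output, and an unchanged scene produces none at all, which is where the savings in data and power would come from.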

This significantly increases the speed at which the sensors can produce video frames, resulting in rates of up to 1,000 frames per second, compared with 20 to 30 frames per second for conventional cameras.

“And because it is not recording the background, just any changes in the scene, the power consumption is very low – just 10 to 20 milliwatts compared to up to 200 milliwatts [for a conventional camera],” said Andreopoulos.
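Taken at face value, the quoted figures imply the following ratios. This is only back-of-envelope arithmetic on the numbers in the article; it assumes the 200 milliwatt figure refers to a conventional camera, and because both the DVS frame rate and the conventional power draw are quoted as "up to" values, the results are best read as upper bounds. The variable names are illustrative.

```python
# Figures as quoted in the article.
dvs_fps = 1000
conventional_fps = (20, 30)   # low and high end of the quoted range
dvs_power_mw = (10, 20)       # low and high end of the quoted range
conventional_power_mw = 200   # quoted as "up to" 200 mW

# Smallest and largest ratio implied by each pair of figures.
fps_speedup = (dvs_fps / conventional_fps[1], dvs_fps / conventional_fps[0])
power_saving = (conventional_power_mw / dvs_power_mw[1],
                conventional_power_mw / dvs_power_mw[0])

print(f"Frame rate: {fps_speedup[0]:.0f}x to {fps_speedup[1]:.0f}x higher")
print(f"Power: {power_saving[0]:.0f}x to {power_saving[1]:.0f}x lower")
```

On these numbers the sensor alone would draw roughly 10 to 20 times less power than a conventional camera.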

Basic processing of images produced by the DVS camera could be carried out locally by the device itself, to produce information needed there and then.

But certain information could also be transmitted to a server in the cloud, at which point more advanced processing and analysis could be carried out, said Andreopoulos.

This resembles the way the mammalian eye is thought to capture scene information, and then transmit it to the visual cortex where the information is processed to generate the three-dimensional rendering of the scene that we “see”.

“In a way, this gives us the illusion that we see this 3D super high-resolution world, but in reality there is very little information being captured by the eye, and to a large extent, the rest is ‘rendered’ in the brain,” said Andreopoulos.

The researchers are aiming for a 100-fold decrease in the amount of energy consumed by the system, when compared to conventional designs for pixel-based visual processing and transmission over machine-to-machine networks.

The project involves Thales, Ericsson, neuromorphic technologies specialist iniLabs, Keysight Technologies UK, and semiconductor company MediaTek.