Getting the picture

A surveillance system based on artificial intelligence promises to give clear, unequivocal pictures of those caught on CCTV committing crimes. The system, developed at the University of Portsmouth, improves CCTV image quality by ‘selectively’ enhancing sequences of video footage.

CCTV images seen on television often appear fuzzy or blurred because they are captured on video. ‘Take a CCTV camera that costs £20,000,’ said Dr David Brown from the university’s Institute of Industrial Research. ‘It has excellent optical lenses but once the image goes to video the tape can stretch or it ages and is nowhere near the quality you need.’

When prosecutors present CCTV footage in court they are usually working off a small number of images that militate against the defendant or defendants. These single frames are found in the miles of tape police have at their disposal.

The images are usually of poor quality and have to be enhanced using interpolation, a mathematical method of creating missing data. An image can be increased from 100 to 200 pixels through interpolation. There are many methods of interpolation; one simple way is to generate a new pixel by averaging the values of the two pixels on either side of the one to be created.
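The averaging approach described above can be sketched in a few lines of Python. This is only an illustration on a single row of greyscale values; the function name `upsample_row` is hypothetical and nothing here is taken from the Portsmouth system. Note that inserting one pixel between each neighbouring pair turns n pixels into 2n - 1, so the row is roughly, not exactly, doubled.

```python
def upsample_row(row):
    """Roughly double a row of pixel values by inserting, between each
    neighbouring pair, a new pixel equal to the average of the two."""
    if not row:
        return []
    out = []
    for left, right in zip(row, row[1:]):
        out.append(left)
        out.append((left + right) / 2)  # the "guessed" pixel
    out.append(row[-1])  # the last original pixel has no right-hand neighbour
    return out


row = [10, 20, 40]
print(upsample_row(row))  # [10, 15.0, 20, 30.0, 40]
```

The inserted values are estimates drawn from existing pixels, so no genuinely new detail appears in the enlarged image.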

‘So what happens is you enlarge an image and it goes “blocky” because it is guessing where the pixels are and they are often not there at all. You can’t go to a court with a picture like that,’ said Brown.

One solution is to store the video at higher quality, but this requires more storage capacity and is costly.

Lewis Hibell, a PhD student at the university, has devised a method of selective enhancement to reconstruct the original image. This can be achieved by exploiting the fact that different frames of video contain slightly different pieces of data.

Multiple frames are taken from video footage, all of which are different due to the subject moving slightly or the light source changing. Areas of interest in the picture are then identified and separated from the remainder of the video. These areas are used to generate a single stationary clear image of the subject from the video source.
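To see why several frames can yield a cleaner picture than any single one, consider the simplest possible way of combining them: a per-pixel average. The sketch below is not Hibell's method, which relies on a neural network. It assumes the frames are greyscale, the same size and already lined up on the subject, and the function name `average_frames` is hypothetical.

```python
def average_frames(frames):
    """Combine equally sized, already aligned greyscale frames
    by taking the mean of each pixel position across all frames."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    count = len(frames)
    return [
        [sum(frame[y][x] for frame in frames) / count for x in range(width)]
        for y in range(height)
    ]


# Three noisy 2x2 views of the same (made-up) scene.
frames = [
    [[100, 52], [48, 0]],
    [[96, 50], [52, 4]],
    [[104, 48], [50, 2]],
]
fused = average_frames(frames)
print(fused)  # [[100.0, 50.0], [50.0, 2.0]]
```

Random noise differs from frame to frame and tends to cancel in the average, while the underlying subject does not. A real system would also have to cope with motion between frames, which this sketch ignores.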

‘This is all about selective enhancement,’ said Brown. ‘We put what we want to see in a box and enhance only what is in the box. Take a person walking down the street: we remove them from the scene, enhance them and place them back into the scene. You can then subdivide the picture and do further enhancement, say to a person’s arm.’
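The 'box' idea Brown describes can be illustrated as a crop, enhance and paste-back step. The Python sketch below is a rough illustration under stated assumptions: the image is a list of rows of greyscale values, the box is given as (top, left, bottom, right) with the bottom and right edges exclusive, and the enhancement is a simple contrast stretch standing in for the real process. The names `enhance_region` and `stretch_contrast` are hypothetical.

```python
def enhance_region(image, box, enhance):
    """Return a copy of image with enhance() applied only inside box.
    box is (top, left, bottom, right); bottom and right are exclusive."""
    top, left, bottom, right = box
    patch = [row[left:right] for row in image[top:bottom]]  # cut out the box
    improved = enhance(patch)                                # enhance it alone
    result = [list(row) for row in image]                    # leave the input untouched
    for dy, patch_row in enumerate(improved):
        result[top + dy][left:right] = patch_row             # place it back
    return result


def stretch_contrast(patch):
    """Placeholder enhancement: rescale the patch to the full 0-255 range."""
    values = [v for row in patch for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [list(row) for row in patch]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in patch]


image = [
    [10, 10, 10, 10],
    [10, 100, 140, 10],
    [10, 160, 200, 10],
    [10, 10, 10, 10],
]
result = enhance_region(image, (1, 1, 3, 3), stretch_contrast)
```

Because `enhance_region` takes the enhancement as an argument, the same call can be repeated on a smaller box inside the first, in the spirit of the further subdivision Brown mentions.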

Hibell pointed out that there are already algorithms that can enhance a poor-quality picture, but they tend to be applied to single images and not a sequence. The newly developed system presents good-quality images in a series of frames.

The artificial intelligence/machine-learning aspect of the research is the neural network being used. ‘This is a new neural network I have created known as a Self-Delaying Dynamic Network (SDN). The use of a neural network means that we do not have to specify the way in which pixels or pieces of the image are combined or chosen,’ said Hibell.

He added: ‘The network is shown examples of the problem and examples of answers to that problem. It then learns how to perform the processes required to produce the answer.

‘The use of a neural network makes the system very robust and capable of working on very noisy data. It is this adaptability that will allow the network to cope with different levels of motion between frames and other standard issues for image processing techniques such as lighting level and blur.’

Hibell said it should be possible for a trained network to be written on to a chip, which could then be used at almost any stage in a CCTV system. ‘They could be placed in cameras, but that would then mean having to modify them. The ideal place would be back at the control room,’ added Hibell.

‘The existing camera systems could then remain the same. A device containing the network chip could be fed with frames direct from a camera source or from existing storage/processing equipment in the control room.’

In future the network chip could be embedded directly into the camera operators’ console to allow them to enhance a section they are viewing just by pressing a button or clicking on an icon.

Jason Ford