Sound cue CCTV

Portsmouth University researchers are using sound cues to identify potential crime in progress and trigger CCTV cameras to turn towards the source.

CCTV surveillance has enjoyed much success in identifying perpetrators after a crime has taken place, but it does not always catch a crime in progress. It can also be difficult for a security officer in a control room to keep an eye on several cameras at once.

The three-year project, sponsored by the EPSRC, aims to adapt artificial intelligence (AI) software developed by Neuron Systems, a spin-out founded by Portsmouth University’s Dr James Hui, which currently identifies visual patterns.

Dr David Brown of Portsmouth’s Institute of Industrial Research said: ‘This software identifies salient features in objects, such as if a car has got an aerial up, or a wing mirror in a certain position, or a dent. For the next stage, we want to get the motorised camera to pivot if it hears a type of sound. So in a car park, if someone bashes in a window, it turns to look at them.’

Brown said the system would not identify a specific speech pattern, but would use fuzzy logic to identify a type of noise, such as a crowd or breaking glass. The researchers will take the software developed by Neuron Systems and adapt it to sound cues.

‘We are looking for templates of sound — rise responses, shapes of sound,’ he said. ‘If you close your eyes, you can imagine running your fingers along a profile. In the same way, we can look at the shape of a sound profile, and from the shape, say it is the same shape as breaking glass, for example. So it’s a very fast, real-time method of doing it.’

The project builds on the team’s previous study of waveform shapes. The software will fit an AI template to the waveform and use fuzzy logic if the fit is not exact. For example, different panes of glass breaking will have different waveforms but the same generic shape.
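The idea of matching the generic shape of a sound and scoring an inexact fit with fuzzy logic can be illustrated with a minimal sketch. This is not the team’s actual software; it assumes a simple amplitude-envelope "profile", a normalised-correlation similarity score, and a made-up linear fuzzy membership function, all chosen here purely for illustration.

```python
import numpy as np

def envelope(signal, frame=256):
    """Coarse amplitude envelope: the peak magnitude in each frame.
    This is the 'profile' you could imagine running a finger along."""
    n = len(signal) // frame * frame
    return np.abs(signal[:n]).reshape(-1, frame).max(axis=1)

def shape_similarity(candidate, template):
    """Normalised correlation between two envelope shapes, clipped to 0..1."""
    a = (candidate - candidate.mean()) / (candidate.std() + 1e-9)
    b = (template - template.mean()) / (template.std() + 1e-9)
    m = min(len(a), len(b))
    return float(np.clip(np.dot(a[:m], b[:m]) / m, 0.0, 1.0))

def fuzzy_match(score, lo=0.5, hi=0.9):
    """Toy fuzzy membership: 0 below lo, 1 above hi, linear in between,
    so a near-but-inexact fit still earns a graded degree of match."""
    return float(np.clip((score - lo) / (hi - lo), 0.0, 1.0))

# Toy demo: an idealised 'breaking glass' template (sharp attack, fast
# decay) against two candidates: a noisy decay from a 'different pane'
# with the same generic shape, and a steady hum with the wrong shape.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
glass_template = envelope(np.exp(-8 * t))
smash = envelope(np.exp(-7 * t) * rng.standard_normal(8000))
hum = envelope(np.ones(8000))

print(round(fuzzy_match(shape_similarity(smash, glass_template)), 2))
print(round(fuzzy_match(shape_similarity(hum, glass_template)), 2))
```

The point of the graded score, rather than a yes/no threshold, is exactly the behaviour described above: two different panes of glass produce different waveforms, but both fit the same generic template to a high degree, while a steady sound does not.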

A key challenge for the project is that the system needs to respond to an anomaly in real time, on a scale comparable to human response time, which is about 300 milliseconds. The software will identify a problem and instantly swing the camera in that direction, just as a person would turn their head if they heard a scream.

The software will work alongside CCTV-based human motion analysis that has been developed at the same institute.

‘If the camera is pointing in a direction because an aggressive sound has been identified, the motion software can identify whether a person is punching another or running away from the scene,’ said Brown.

‘In a similar way, it can look at a template of a body shape carrying out an action, so it could spot the difference between reaching out to pick a jumper off the shelf or punching the assistant, or spot someone running away in a football crowd.’

Another problem the project could help overcome is having to search through hours of security video to identify a specific object or action. Instead of an operator watching an eight-hour tape looking for a white van, for example, the software could identify the object quickly, either live or offline. Combined with sound cues, the camera would also be far more likely to have been pointing in the right direction in the first place.

It could also make life easier for camera operators. ‘If you sit and look at a camera in one of these control rooms in a council office and you move it, it’s disorienting,’ said Brown. ‘Panning cameras manually is not as easy as it appears. It’s like looking through a telescope, then swinging it to another location — you don’t know where you’re looking. But if you’re looking at the sound being generated — a car being broken into, a kid shouting — you know you’re looking at an important scene.’

The potential market for software incorporating the algorithm created from the research could include local councils, private security firms, car parks, shopping centres, football stadiums and public transport.

By the end of the three-year project, the team hopes to have generated algorithms that can be incorporated into a commercial software suite, with each generation of algorithms becoming more sophisticated as the project progresses.

‘Because it’s AI, the longer it’s in the software, the more it learns,’ said Brown. ‘The later versions will get cleverer as time goes on, perhaps identifying certain words being said or violent sounds.’

Berenice Baker