Language analysis tool to ascertain age and gender

In recent years, there has been a rapid rise in the number and use of online social networks. These networks pose two significant risks in terms of child exploitation: paedophiles preying on children in chat rooms, and the distribution of child abuse media through online file sharing.

Now, computer scientists at Lancaster University are working on a tool that can work out a person’s age and gender using language analysis techniques. They hope it will eventually help police and law enforcement agencies spot when an adult in a chat room is masquerading as a child as part of the ‘grooming’ process.

For several months, groups of children and teenagers from the Queen Elizabeth School in Kirkby Lonsdale, Cumbria, have been taking part in experiments designed to provide the researchers with exactly the kind of informal web chat they need to help improve the accuracy of their software.

But the 350 students have also unwittingly been taking part in an experiment to find out if they know when they are talking to adults posing as children online.

So far, the results show that even pupils as old as 17 struggled to tell the difference between a child and an adult. Approximately four out of five thought they were chatting to a teenager when, in fact, it was an adult. Girls were better at telling the difference than boys: 22 per cent of the girls guessed correctly, compared with only 16 per cent of the boys.

The computer software did significantly better, correctly working out whether web chat was written by a child or an adult in 47 out of 50 cases – even when the adult was pretending to be a child.
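To put these figures on a common scale, the short Python sketch below converts the software’s 47 out of 50 into a percentage and sets it beside the pupils’ reported success rates. The function and variable names are illustrative assumptions and are not part of the Lancaster software; only the numbers come from the results quoted above.

```python
def accuracy(correct, total):
    """Return the share of correct guesses as a percentage."""
    return 100.0 * correct / total

# Figures reported in the article.
software_pct = accuracy(47, 50)  # software: 47 of 50 chats classified correctly
girls_pct = 22.0                 # girls who guessed correctly (per cent)
boys_pct = 16.0                  # boys who guessed correctly (per cent)

print(f"Software: {software_pct:.0f}%")
print(f"Girls:    {girls_pct:.0f}%")
print(f"Boys:     {boys_pct:.0f}%")
```

On these numbers the software was right in 94 per cent of cases, against 22 and 16 per cent for the pupils.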

Researchers believe that, eventually, the software could be used not only to identify adults posing as children but also to pick up on the ‘stylistic footprints’ of paedophiles and track them as they move around the internet.

Lead researcher Prof Awais Rashid of Lancaster University’s Department of Computing said: ‘Paedophiles often pose as children online and our research indicates that children don’t find it easy to spot an adult pretending to be a child.

‘In our analysis, we found that four out of five children across the school got it wrong. Interestingly, the strategies they use to detect who they are talking to are also the ones that lead them to make wrong decisions – they rely on the subject matter, the use of slang and even something as simple as whether the individual said he or she was an adult or a child. This really highlights the need for a safety net of some sort.

‘We hope to develop an automated system that can pick up on quirks of language particular to a certain age group. These language patterns can help us to expose adults who seek to groom children online by posing as children in chat rooms, for example.’

The work is part of the Isis project, which is funded by the Economic and Social Research Council and the Engineering and Physical Sciences Research Council.