Governments and big business like to indulge in media spin, and that means knowing what is being said about them. But finding out is becoming ever more difficult, with thousands of news outlets, websites and blogs to monitor.
Now a British company is about to launch a software program that can automatically gauge the tone of any electronic document. It can tell whether a newspaper article is reporting a party's policy in a positive or negative light, for instance. Welcome to the automation of PR.
Till now, discovering whether the coverage you are getting is positive, negative or neutral has usually meant hiring a "reputation management" firm. Teams of people will read through everything written about your organisation and report on how favourable the coverage is.
As well as being expensive, this can be a long, slow process, says Nick Jacobi, director of research for Surrey company Corpora Software. "There's a massive information overload."
A single news agency may churn out more than eight articles each hour. That's almost 200 stories a day.
Previous attempts to automate this kind of analysis have used one of two techniques. In the first, called machine learning, a program is trained by being given thousands of articles already determined by a human reader to be positive or negative in tone.
But this approach can lead to mistakes. For example, if a series of the training articles mentions bomb attacks on a mosque in Iraq, the program may incorrectly conclude that other mentions of mosques are also negative.
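The mosque problem can be reproduced with a toy word-counting classifier. The Python sketch below is an illustration only, not Corpora's software or any earlier product: the training headlines, the function names and the smoothing scheme are all assumptions made for the example.

```python
import math
from collections import Counter

def train(labelled_articles):
    """Count how often each word appears in positive and negative articles."""
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in labelled_articles:
        counts[label].update(text.lower().split())
    return counts

def word_score(counts, word):
    """Log-odds that a word signals a positive article (add-one smoothing)."""
    vocab = set(counts["positive"]) | set(counts["negative"])
    pos_total = sum(counts["positive"].values()) + len(vocab)
    neg_total = sum(counts["negative"].values()) + len(vocab)
    p_pos = (counts["positive"][word] + 1) / pos_total
    p_neg = (counts["negative"][word] + 1) / neg_total
    return math.log(p_pos / p_neg)

def classify(counts, text):
    """Label a text by summing the scores of its words."""
    total = sum(word_score(counts, w) for w in text.lower().split())
    return "positive" if total > 0 else "negative"

# Hypothetical training set: "mosque" only ever appears in bad-news stories.
TRAINING = [
    ("bomb attack damages mosque", "negative"),
    ("second bomb attack on mosque", "negative"),
    ("festival delights happy crowds", "positive"),
    ("charity praised for generous work", "positive"),
]

model = train(TRAINING)
print(classify(model, "new mosque opens"))
```

Because "mosque" turns up only in the negative examples, the harmless headline "new mosque opens" is labelled negative, which is exactly the kind of false association described above.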
The alternative is the lexicon approach, in which words are classified as either positive or negative.
But plenty of words can be both. "The plot was unpredictable" and "the steering was unpredictable" differ by just one word. Yet the word "unpredictable" has a positive connotation in the first example and a negative meaning in the second.
And even if that problem is solved, just picking up on positive or negative words can also lead to mistakes, as demonstrated by the sentence: "Everyone told me it was terrible, that I would hate it, but in the end it wasn't at all bad."
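Both weaknesses of the lexicon approach show up in a few lines of Python. This is a rough sketch under stated assumptions: the two word lists are invented for the example, and real lexicons are far larger.

```python
import string

# Hypothetical word lists; each word is forced into exactly one camp.
POSITIVE = {"fantastic", "good", "great"}
NEGATIVE = {"terrible", "hate", "bad", "unpredictable"}

def lexicon_score(text):
    """Positive-word count minus negative-word count, ignoring grammar."""
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives

print(lexicon_score("The plot was unpredictable"))
print(lexicon_score("The steering was unpredictable"))
print(lexicon_score(
    "Everyone told me it was terrible, that I would hate it, "
    "but in the end it wasn't at all bad."
))
```

The plot and the steering receive the same score because "unpredictable" can sit in only one list, and the final sentence scores as strongly negative even though the writer ended up pleased.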
So Corpora has come up with a program called Sentiment, which uses algorithms to tease out grammatical components, such as nouns, verbs and adjectives, and identify the subjects and objects of verbs. It can even analyse pronouns such as "it", "he" and "her" to work out what words or concepts they are referring to.
Having an understanding of grammatical structure makes it possible to filter out words not relevant to the sentiment of the article, Jacobi says. The program does not get it right all the time, he admits, but then neither do people.
Sentiment was developed mainly for Infonic, a Corpora subsidiary that provides clients with online media analysis of websites, chat rooms, bulletin boards and blogs.
Orlando Plunket Greene, of Infonic, says that because the program lists items according to how positive, negative or neutral they are, users can skip straight to the most relevant ones.
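How such a listing might be ordered can be sketched in Python. The article does not say how Sentiment scores or sorts items, so the scale from -1 (very negative) to +1 (very positive), the sample headlines and the function name below are all assumptions for illustration.

```python
def rank_by_strength(items):
    """Sort (title, score) pairs so the strongest sentiment comes first.

    Scores are assumed to run from -1 (very negative) to +1 (very positive),
    with values near 0 meaning neutral.
    """
    return sorted(items, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical scored items.
ITEMS = [
    ("Council meeting minutes published", 0.1),
    ("Minister attacked over rubbish collection", -0.9),
    ("New policy welcomed by business", 0.6),
]

for title, score in rank_by_strength(ITEMS):
    print(f"{score:+.1f}  {title}")
```

Ordering by the size of the score rather than its sign puts strongly critical and strongly favourable items at the top and leaves near-neutral ones at the bottom.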
A person might be able to scan 10 articles an hour, but Sentiment can zip through 10 a second.
Positive or negative?
Tricky phrases that can fool computer programs:
* Looks neutral but is negative: "Why should I bother going to the movie?"
* Looks positive but is negative: "We had a fantastic time. It rained every day and I caught a stomach bug."
* Changes the meaning according to context: "The plot was really unpredictable" v "The steering was really unpredictable".
* Subtle qualifiers: "People haven't had their rubbish collected" v "Only a few people haven't had their rubbish collected".
Program sorts the slanging and singing