All languages skew towards happiness: Study
Washington: All human languages - whether English, Arabic, Russian or Korean - tend to use positive words more frequently than negative ones, a new study has found.
"Put even more simply, humans tend to look on (and talk about) the bright side of life," researchers said.
A team of scientists at the University of Vermont (UVM) and colleagues used a massive data set of many billions of words, based on actual usage.
"We looked at ten languages and in every source we looked at, people use more positive words than negative ones," said UVM mathematician Peter Dodds, who co-led the study.
The study indicates that language itself has a positive outlook and, therefore, "it seems that positive social interaction," Dodds said, is built into its fundamental structure.
Scientists gathered billions of words from around the world using 24 types of sources including books, news outlets, social media, websites, television and movie subtitles and music lyrics.
"We collected roughly 100 billion words written in tweets," said UVM mathematician Chris Danforth, who co-led the research.
From these sources, the team then identified about 10,000 of the most frequently used words in each of 10 languages: English, Spanish, French, German, Portuguese, Korean, Chinese, Russian, Indonesian and Arabic.
Next, they paid native speakers to rate all these frequently used words on a nine-point scale from a deeply frowning face to a broadly smiling one.
From these native speakers, they gathered five million individual human scores of the words. Averaging these, in English for example, "laughter" rated 8.50, "food" 7.44, "truck" 5.48, "the" 4.98, "greed" 3.06 and "terrorist" 1.30.
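As a rough illustration of the averaging step, the Python sketch below takes several individual ratings per word and reduces them to one mean score. The function name average_scores and the individual ratings are hypothetical, invented for this example; they are not the study's data or code.

```python
from statistics import mean


def average_scores(ratings):
    """Return the mean rating per word on the 1-to-9 scale.

    `ratings` maps each word to a list of individual scores.
    Words with no ratings are skipped.
    """
    return {word: mean(scores) for word, scores in ratings.items() if scores}


# Hypothetical individual ratings, invented for illustration only.
ratings = {
    "laughter": [9, 8, 9, 8],
    "the": [5, 5, 5, 5],
    "greed": [3, 2, 4, 3],
}

averages = average_scores(ratings)
for word, score in averages.items():
    print(f"{word}: {score:.2f}")
```

With real data, each word would carry many more ratings, and the averages would be computed separately for each language.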
A Google web crawl of Spanish-language sites had the highest average word happiness, and a search of Chinese books had the lowest. But all 24 sources of words that they analysed skewed above the neutral score of five on their one-to-nine scale, regardless of the language.
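One plausible way to compute an "average word happiness" for a source is to weight each word's score by how often it appears, then compare the result with the neutral score of five. The sketch below assumes that frequency weighting; the article does not spell out the exact formula. The word scores are the English averages quoted above, while the function name average_happiness and the sample sentence are made up for illustration.

```python
from collections import Counter

NEUTRAL = 5.0  # midpoint of the one-to-nine scale

# Average English word scores as quoted in the article.
WORD_SCORES = {
    "laughter": 8.50,
    "food": 7.44,
    "truck": 5.48,
    "the": 4.98,
    "greed": 3.06,
    "terrorist": 1.30,
}


def average_happiness(tokens, scores=WORD_SCORES):
    """Frequency-weighted mean score of the tokens that have a score.

    Unscored tokens are ignored. Returns None if no token is scored.
    """
    counts = Counter(t.lower() for t in tokens if t.lower() in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[word] * n for word, n in counts.items()) / total


# Made-up sample text, not taken from any of the study's sources.
sample = "the laughter and the food on the truck".split()
score = average_happiness(sample)
print(f"average happiness: {score:.2f} (neutral is {NEUTRAL})")
```

Because frequent words such as "the" sit close to neutral, a source's average moves above five only when positive words outnumber or outweigh negative ones.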
In every language, neutral words like "the" scored just where you would expect: in the middle, near five.
And when the team translated words between languages and then back again, they found that "the estimated emotional content of words is consistent between languages."