Internet Geographer

Article Quality in English Wikipedia

Expanding on the maps of Wikipedia quality (i.e. the maps of South Asia and the Swahili version of Wikipedia) posted earlier on this blog, I want to offer a visualisation of all geotagged articles across the planet, shaded according to the number of words in each article. In the map below, yellow dots represent the location of relatively short articles (such as the “Jericho Tavern”) in the English version of Wikipedia, while red dots indicate the location of relatively long articles (for instance, “Penzance”). A high-res version is also available here (I highly recommend downloading it and exploring it in some detail).


Interesting patterns emerge: the average word count of articles in the US is 750, while many European countries have lower means, e.g. Italy (550), Germany (439), Spain (397), France (260), and Poland (233). It is also noteworthy that a few European countries have means closer to that of the US: articles in the UK and Ireland average 687 and 749 words respectively. The immediate conclusion is that it is easier for editors in English-speaking countries (all of which tend to have high averages) to expand articles than it is for editors in countries where English isn’t the native language.

But the native language of a country clearly isn’t the only factor at play. Excluding small islands and city states, the countries with the highest average word counts for their articles are Iraq, with an average of 1091 words in its 538 articles; the Philippines, with an average of 1085 words in 2736 articles; and North Korea, with an average of 947 words in its 292 articles.

That’s right: out of a list of over 200 countries, North Korea has one of the highest average word counts for its Wikipedia articles!

At the bottom end of the scale we have Azerbaijan (159 words on average), Estonia (209), and Kenya (223).
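For readers who want to reproduce this kind of ranking, the country averages quoted above boil down to a grouped mean. The snippet below is only a minimal sketch, not the method used to build the map: the function name `mean_words_by_country`, the `min_articles` cut-off (a rough stand-in for leaving out very small places) and the toy records are all my own assumptions for illustration.

```python
from collections import defaultdict


def mean_words_by_country(articles, min_articles=1):
    """Return {country: (mean word count, number of articles)}.

    `articles` is an iterable of (country, word_count) pairs.
    Countries with fewer than `min_articles` articles are left out.
    """
    totals = defaultdict(lambda: [0, 0])  # country -> [sum of words, article count]
    for country, word_count in articles:
        totals[country][0] += word_count
        totals[country][1] += 1
    return {
        country: (word_sum / n, n)
        for country, (word_sum, n) in totals.items()
        if n >= min_articles
    }


# Toy records invented for illustration; these are NOT the data behind the map.
toy_articles = [
    ("A", 100), ("A", 300),
    ("B", 1200),
    ("C", 50), ("C", 150), ("C", 100),
]

# Rank countries from the highest to the lowest average word count.
ranked = sorted(
    mean_words_by_country(toy_articles, min_articles=2).items(),
    key=lambda item: item[1][0],
    reverse=True,
)
for country, (mean_words, n) in ranked:
    print(f"{country}: {mean_words:.0f} words on average over {n} articles")
```

Reporting the number of articles next to each mean matters: an average over a few hundred articles (as for North Korea) is far more sensitive to a handful of very long entries than one over several thousand.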

The results tell us that there are apparently a lot of stub articles written about Azerbaijan, Estonia and Kenya (e.g. the Bukhungu stadium), whereas there are very few stubs about places like Iraq and North Korea: a finding that makes a lot of sense. It must be very hard for English-speaking editors to create articles (even stubs) about things like small stadiums in provincial towns in North Korea and Iraq, but uploading this sort of information about the equivalent type of place in Estonia or Kenya is far less of a problem.

There are clearly a lot of (locally specific) factors at play here that will explain some of the patterns we are seeing, and we are looking at how a range of metrics (e.g. literacy, computer access etc.) correlate with these data. In the meantime, any thoughts are welcome in the comments field below.

More regional maps will also be up on the blog soon…