Although there have been many studies on Wikipedia, little attention has been given to the limits to its growth. As Wikipedia expands, it is possible that new concepts are added without having corresponding articles, or that the number of new concepts grows more slowly than the number of articles. In the first case, Wikipedia's coverage will deteriorate, as its articles will be drowned in an increasing number of undefined concepts. In the second case, Wikipedia's growth may stall. A new study, which my colleague Panagiotis Louridas and I published in the August 2008 issue of Communications of the ACM, the flagship magazine of the Association for Computing Machinery, shows that Wikipedia sits comfortably between the two extremes.
We studied the entire Wikipedia corpus, 485 Gbytes of data, adding up to 1.9 million pages and 28.2 million revisions. Using a suite of tools we developed, we showed that the ratio of undefined to defined concepts in Wikipedia has been stable over time. Furthermore, we found that articles are added to Wikipedia in a collaborative fashion: Wikipedians often add a new article when they encounter a missing entry. Finally, we established that Wikipedia grows in a manner similar to that witnessed in a number of different areas: new articles link to the most popular existing articles. This pattern of growth, called preferential attachment, has been used to explain the number of species per genus, the internet, the World Wide Web, scientific citations, and collaboration networks between people, among others. It is the first time preferential attachment has been studied live in a structure of this size.
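To make the idea of preferential attachment concrete, here is a minimal toy simulation. It is only a sketch of the general mechanism, not the tools or the model used in the study: the function name, the two-article starting graph, the `links_per_article` parameter, and the "incoming links plus one" weighting are all assumptions chosen for illustration.

```python
import random


def grow_by_preferential_attachment(n_articles, links_per_article=2, seed=0):
    """Grow a toy article graph and return each article's incoming-link count.

    Hypothetical illustration: every new article links to existing articles
    chosen with probability proportional to (incoming links + 1).
    """
    rng = random.Random(seed)
    in_links = [0] * n_articles

    # Assumed starting point: two articles that link to each other.
    in_links[0] = 1
    in_links[1] = 1

    # Each article appears in this list once as a baseline plus once per
    # incoming link, so a uniform draw is weighted by popularity.
    targets = [0, 0, 1, 1]

    for new in range(2, n_articles):
        chosen = set()
        # Pick distinct existing articles to link to.
        while len(chosen) < min(links_per_article, new):
            chosen.add(rng.choice(targets))
        for target in chosen:
            in_links[target] += 1
            targets.append(target)
        # The new article becomes a possible target with baseline weight 1.
        targets.append(new)

    return in_links


if __name__ == "__main__":
    counts = grow_by_preferential_attachment(1000)
    print("most linked article has", max(counts), "incoming links")
    print("average incoming links:", sum(counts) / len(counts))
```

Running the script prints the highest and the average number of incoming links, which shows how far the most popular articles pull ahead of the typical one under this rule.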
The article establishes that we can expect Wikipedia to grow and remain usable. As for how long this process may continue, we close the article by citing Jorge Luis Borges's 1946 short story "On Exactitude in Science". The wise men of the empire undertake to create a complete map of it; upon finishing, they realise that the map is so big it coincides with the empire itself.
Last modified: Wednesday, July 30, 2008 7:35 pm
Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.