Saturday, August 28, 2010

Otourly's blog entry translated to English:

A circular irrigation water distributor in Taketa, Oita Prefecture, Japan. Water distribution at Taketa in Japan - public domain - Picture by Tsutsui Mizuki - And, curiously, it makes me think of the Wikimedia Commons logo...
When we talk about Wikimedia Commons with Wikipedians, particularly those who are not native English speakers, they say that Wikimedia Commons is good but too English-centric, so they use it only for the essentials. The site's search tool is also so English-oriented that it is really difficult for a non-English speaker to find what they are looking for. In short, a search query in English on Commons will generally be more precise and return more matches than the equivalent query in French.

The real problem is not English, because there must be a common language to link all the projects; the problem is the categories.

To explain: the categories on Commons are very complicated for someone who is not at ease with English. Often this person will add only one category (categorization being one of the main aims of the media collection) or even none to the files they upload, leaving the confirmed contributors to clean up according to their own level of English. And I can assure you that it takes time, and sometimes these contributors would rather take photos, upload them, write descriptions, or revert some vandalism... I sometimes wonder how many confirmed contributors devote less than 5% of their edits to fixing categorization...

Often my uploads determine which categories I will fill, correct and, in a word, improve. Meanwhile, I follow a tree view of the Commons categories thanks to a small piece of JavaScript created by [[User:Chphe]] called [[SuiviCat.js]]; it has a French name and, as a consequence, is not much used on this anglophone project. I still remember when I requested the adaptation of SuiviCat for Commons...

In fact, if we look outside the Wikimedia Foundation, I mean at other media websites (such as Flickr), the category system is often replaced by a tag system. Tags are useful, quick, simple... but they tend toward disorder, particularly with homonyms. That is the wrong path, so back to our beloved categories.

In fact, to solve the problem for good, categories should ideally be 100% multilingual. But this requires sysops who are multilingual too. Otherwise vandalism in Hindi could certainly be more difficult to detect... At the moment it seems to be an unattainable utopia, but fully multilingual categories would have a big advantage: they could reconcile the different contributors of the Wikipedias, Wiktionaries, Wikibooks, Wikinews...

Wikimedia Commons would become a truly international project, and search would be more usable by everybody. Across all the Wikipedias, every contributor would understand the categories and subjects of the media... Besides, it is often the categories alone that best describe a file, for the good reason that not everyone is named Otourly, and as a consequence not everyone links to Wikipedia in their Commons descriptions. Worse, sometimes there is only a minimal description, which does not describe the file very well.

But it is not the fault of the English language, I must admit... If we look at this file, for example: [[File:Church_of_the_Nativity_of_the_Theotokos_(Gora_Pnevits)_05.jpg]], only the title and the category are understandable to someone who has only a few notions of English. There is no link to Wikipedia in the description that could help us know exactly what it is.

But reconciling Wikipedia and Commons is possible. The [[Projet Monuments historiques]] is a good example: who better than French people to take photographs, sort them, geolocate them (or just geolocate the categories) and offer France's historical monuments to everybody? It is true that these files on Wikimedia Commons are often described only in French, but they are mostly well categorized.

