
Wikimania 2011: Are Internet Sources Reliable?





On August 4-6 I participated in Wikimania 2011 in Haifa! The participants were Wikipedians from many countries, including Wikipedia's founder Jimmy Wales and all but one of the Wikimedia Foundation's directors.
The organizing team, headed by Tomer Ashur, Wikimedia Israel's chairman, and Deror Lin, Wikimania General Manager, did an exceptionally good job in both the administrative and the professional aspects.


Wikipedia is evolving and changing. It is not exactly the same community it was when I described it in one of my Web 2.0 for Dummies posts in 2008. However, the challenges I described a few years ago in my post titled Wikipedia: The Good the Bad and the Ugly have not vanished.


Wikipedia case
The evolution and challenges of Wikipedia and its community are very similar to the challenges facing other Web 2.0 projects and communities, including Social Networking Services. However, there are unique aspects as well.


Not all Internet Sources were created Equal
There is a growing tendency in Wikipedia to ask for Sources supporting the facts, or so-called facts, mentioned in Wikipedia articles.


If no supporting sources are cited, a big box stating that sources are missing is inserted under the article's title. The rationale is to avoid wrong information being inserted, deliberately or by mistake.


There are cultural differences between the different language editions of Wikipedia. The English Wikipedia is the most demanding about supporting evidence (Sources).


I doubt that relying on Web Sources is good enough.
My doubts are based on a well-known fact about Web Sources: the Reliability of the immense amount of information on the Web is highly variable. Some sources are trustworthy; others are just nonsense.


So what is the Value Proposition of five, ten or fifteen sources if none of them is Reliable?


What is the Added Value of Reliable but very superficial sources, such as some of the articles published in Electronic Newspapers?


The real problem is that in any Web-related task the user has to sort the sources and assign each one a level of Reliability. It is not an easy task.
Many people are not able to perform this task properly if the subject matter is outside their expertise.


This observation is valid for Wikipedians as well as for non-Wikipedians.


To illustrate this issue I wrote a new article in the Hebrew Wikipedia on an important Information Technology technical subject.


I was able to write a good article without using any source.


How was it possible?
My professional experience includes actual selections and implementations of technical products addressing this subject.


In order to do my job I read technical material and attended Vendors' Presentations. I also read Analysts' Research Notes.
However, that was a few years ago, and I no longer had access to the White Papers and Research Notes I had read.


Finally, I decided to add only one Web-based source. It was an article written in English by a University Professor who is considered a leading expert, or Guru, in this area.
A few years ago I read an impressive book he wrote on the same subject.


I also read the article on the same subject in the English Wikipedia. It included about ten sources. However, none of them was a valid source (as Wikipedia articles are collaborative work, it may by now be a higher-quality article that includes valid sources).


I read all the sources cited in the English Wikipedia and decided that none of them was good enough to cite.


The Bottom Line: a not-so-good article in the English Wikipedia, based on multiple untrusted sources and edited by someone with no expertise in the subject and no ability to assess the Reliability of sources.
A much better article in the Hebrew Wikipedia, based on expertise, experience and a single trusted source.


Is there a better way?
I am sure that it is possible to partially formalize an algorithm for assigning a Reliability score to Web sources.


In my opinion, it is also possible to automate the algorithm with dedicated software.
The automation will probably be partial, and the heuristic algorithm will surely not be 100% precise.
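
To make the idea of a partially formalized heuristic more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the WebSource fields, the weights and the example URLs are all hypothetical, chosen by hand and not calibrated on any real data. A real algorithm would need far more signals and human review.

```python
from dataclasses import dataclass


@dataclass
class WebSource:
    """Hypothetical description of a Web source; every field is an assumption."""
    url: str
    author_is_known_expert: bool
    peer_reviewed: bool
    cites_its_own_sources: bool
    superficial: bool


# Illustrative weights in points (they sum to 100); hand-picked, not calibrated.
WEIGHTS = {
    "author_is_known_expert": 40,
    "peer_reviewed": 30,
    "cites_its_own_sources": 20,
    "not_superficial": 10,
}


def reliability_score(source: WebSource) -> int:
    """Return a heuristic Reliability score between 0 and 100."""
    score = 0
    if source.author_is_known_expert:
        score += WEIGHTS["author_is_known_expert"]
    if source.peer_reviewed:
        score += WEIGHTS["peer_reviewed"]
    if source.cites_its_own_sources:
        score += WEIGHTS["cites_its_own_sources"]
    if not source.superficial:
        score += WEIGHTS["not_superficial"]
    return score


# Two made-up sources at opposite ends of the scale.
expert_article = WebSource(
    "https://example.org/expert-article", True, True, True, False
)
superficial_article = WebSource(
    "https://example.org/short-news-item", False, False, False, True
)

for source in (expert_article, superficial_article):
    print(source.url, reliability_score(source))
```

The hard part is not the arithmetic but deciding the field values, which is exactly the judgment that requires subject matter expertise. That is why the automation can only be partial.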


Rating Academic sources is possible, and rating Search Results by the PageRank algorithm and other algorithms is also possible. So why not automate a similar task, such as rating sources for Wikipedia articles (and maybe generalize it to other contexts that rely on Web Sources)?
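
For readers who have not seen how a link-based rating works, the sketch below shows a simplified PageRank computed by repeated iteration over a tiny made-up link graph. It is a textbook-style simplification, not the algorithm any search engine actually runs. The page names, the damping factor of 0.85 and the number of iterations are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Made-up graph: sources a, b and d all link to c, and c links back to a.
example_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)

for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

In this toy graph the page that many others link to ends up with the highest rank. Popularity is not the same as Reliability, so a score like this could only be one input among several when rating sources.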


I guess that, in addition, a Bayesian Probability algorithm could be deployed to improve the rating of sources.
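
One possible reading of the Bayesian idea, sketched under assumptions: treat the Reliability of a source as an unknown probability, start from a neutral prior, and update it each time a claim from that source is checked and found correct or wrong. This is a standard Beta-Bernoulli update. The function name, the prior values and the counts below are hypothetical.

```python
def updated_reliability(prior_correct, prior_wrong, checked_correct, checked_wrong):
    """Posterior mean of a Beta prior after observing checked claims.

    prior_correct and prior_wrong are the Beta prior parameters (1 and 1 give
    a neutral prior); checked_correct and checked_wrong count the claims from
    the source that were verified and found correct or wrong.
    """
    a = prior_correct + checked_correct
    b = prior_wrong + checked_wrong
    return a / (a + b)


# With no checked claims the estimate stays at the neutral prior.
print(updated_reliability(1, 1, 0, 0))

# Eight checked claims, all correct, raise the estimate.
print(updated_reliability(1, 1, 8, 0))

# Eight checked claims, all wrong, lower it.
print(updated_reliability(1, 1, 0, 8))
```

The appeal of this approach is that the rating improves gradually as evidence accumulates, instead of depending on a single one-time judgment.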





