Sunday, October 30, 2011

Vendors Survival: Will Google Survive until 2021? - Part 2

Searching the Web 
The Search Engine is Google's most strategic product. Some of its other products use it as a component of the services they provide. 
A limitation or failure of it, as described in Why we desperately need a New (and Better) Google, cited in part 1 of this post, could negatively affect Google's market position.


This is a major challenge facing Google, because Google has no control over some of the factors limiting the Search Engine's effectiveness.


Challenges in Searching the Web
The following bullets describe major Web Data problems, which could affect Web Search:

  • The data is growing exponentially
The amount of data is growing fast. A Search Engine has to handle a larger number of pages and will probably return more and more pages in the results list of each search operation. 


Implications: Users will not read all the entries in a search results list. They will focus on the first entries. The order of Search Engine results should therefore be accurate, so that users can identify the most relevant and important pages.


  • No data Cleanup Procedures
Think of PC Files, Server Files, e-mail messages, Userids and Userid related information, as well as other IT entities. 


The common denominator of all of them is the need to manage them and to discard items which are not in use and probably will not be used in the future. For example, you have to avoid Virtualization Sprawl by deleting unused Virtual Machines in a Cloud, and you have to delete temporary files on a PC or a Server. Spam and old or unimportant e-mail messages are deleted. Other e-mail messages are archived. Userids and their related information are deleted, or at least should be deleted, when the employee or client leaves. 


Deletion and other Management Operations are done automatically or manually, but in all cases there is an owner or an administrator who devises and carries out a policy.
No one manages the Web. No one cleans up Web pages except the owner of the page, who usually has no motivation to delete them.  


Implications: A high growth rate coupled with no management results in too many Web pages, many of them irrelevant. 

  • Data Reliability 
Some pages are reliable and others are not. It is not an easy task to identify the reliable and valuable pages. For information read a previous post: 


Wikimania 2011: Are Internet Sources Reliable?

Search Engines should be able to assess the reliability of Web pages, which is just as difficult.


Implications: Reliable Web Pages should appear at the top of Search Results; otherwise users will read mostly unreliable sources.
  
  • Multi data types
The data is not text only. It also includes Videos, Pictures, Voice and other types. 
Non-text data is a lot larger in size than text data.   


Implications: Searching non-text pages is not as easy as searching text. The Search Engine has to support new search types, e.g. search of Images or Videos, or even multi-type search, e.g. text and images sharing a common Keyword.    


Summary
The Web's expansion, whether you call it Entropy or exponential data growth, is a major challenge for applications manipulating Web Data, including Search Engines.
Semantic Web, Web 3.0 and Big Data are attempts to address this problem or to minimize its effects.


The other main challenge is rating the Reliability and Value of Web Pages. PageRank is an example of an algorithm for addressing this challenge. 
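To make the idea behind PageRank more concrete, here is a minimal sketch of the textbook version of the algorithm: a page is rated by the links pointing to it, and links from highly rated pages count for more. This is not Google's production code. The function name pagerank, the damping value of 0.85, the fixed number of iterations and the tiny four page Web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    # Collect every page, including pages that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the same rank for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small base rank regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A hypothetical Web of four pages: "c" is linked to by all the others,
# while nobody links to "d".
web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this small example page "c" ends up with the highest rank and page "d", which nobody links to, with the lowest. It also shows why such a rating can be manipulated: adding artificial links to a page raises its rank.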


Google's Unique Challenges
In addition to coping with these general challenges, Google has to cope with unique challenges due to its position as the Search market leader, with approximately 80% market share.


These challenges go beyond click fraud, i.e. using an automated or human method to click on advertisements many times. They stem from attempts to fool Google's Search Engine algorithms in order to move a Web page to the beginning of the search results.


The following list describes three frequent mechanisms for fooling Google's Search Engine, or at least attempting to fool it:  


1. Payment for an automated service which is supposed to place a Web page in one of the first entries of Google's Search results list. 
The paid service will artificially access the Web page as many times as possible, preferably from highly ranked Web sites. This method could improve the page's PageRank rating and place it in a higher position than it deserves in Google searches.


2. Adding unrelated popular labels to a Web Page's labels list.
This technique could make a Web Page appear in Search operations unrelated to its content. It may also position it higher than it should be in the Search results list.   

3. Cutting and Pasting full Web Pages or parts of Web Pages from other Web sites.
For example, by copying the content of a Wikipedia article. 
The page design may look perfect and the content could be reliable, but it attracts readers to a Web Site whose other pages are neither reliable nor well designed. Google's Search algorithm could rate the page higher than it should.
Google's Action: The Company recently announced that it will pay more for clicks on advertisements in sites that have original content.  
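As an illustration of how copied content (the third mechanism above) could be detected in principle, here is a minimal sketch that compares two texts by their overlapping word sequences, a common textbook approach known as shingling with Jaccard similarity. It is an assumption made for illustration and not a description of how Google actually identifies original content. The function names, the shingle size of 3 words and the sample sentences are all hypothetical.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        # Texts shorter than one shingle are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample texts: 'copied' reuses most of 'original'.
original = "the quick brown fox jumps over the lazy dog near the river bank"
copied = "the quick brown fox jumps over the lazy dog near the old bridge"
unrelated = "search engines rank pages by analysing the links between them"

sim_copy = jaccard(shingles(original), shingles(copied))
sim_other = jaccard(shingles(original), shingles(unrelated))

print("original vs copied:   ", round(sim_copy, 2))
print("original vs unrelated:", round(sim_other, 2))
```

A high similarity between a page and an older, well known page suggests that the content was copied, while unrelated pages score close to zero. The threshold above which a page should be treated as a copy is a policy decision and is not shown here.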

Conclusions
Google's survival depends mainly upon two related domains: Web Search and its effective advertisement based business model.


The Web Search Engine is not as good as it was a few years ago, due to the exponential growth of Web Data and fundamental changes in the data's characteristics. In addition to these factors, because Google dominates this market, people develop various mechanisms for fooling Google's Search Engine. 


Google's challenge is to adapt and evolve its Search Engine. Google may need to find new creative ways to do so because nobody controls Web Data, so the quantity and characteristics of the data could change significantly in the future. 


As far as the Business model is concerned, it would be very difficult to replace it with an entirely different model. However, the probability that an advertisement based business model will not be viable is low.


If the advertisement based business model continues to be viable, Google's position will depend upon using it effectively in the Search Page, in conjunction with creating additional income sources from advertisements related to new services and Business Lines such as Android and YouTube TV.