Webology, Volume 5, Number 2, June, 2008


Search Engines and Power: A Politics of Online (Mis-) Information


Elad Segev
Research Institute for Law, Politics and Justice, Keele University, UK
Email: e.segev (at) keele.ac.uk

Received March 18, 2008; Accepted June 25, 2008


Abstract

Media and communications have always been employed by dominant actors and have played a crucial role in framing our knowledge and constructing certain orders. This paper examines the politics of search engines, suggesting that they have increasingly become "authoritative" and popular information agents used by individuals, groups and governments to attain their positions and shape the information order. Following the short evolution of search engines from small companies to global media corporations that commodify online information and control advertising spaces, this study draws attention to some of their important political, social, cultural and economic implications. These are indicated through their expanding operation and control over private and public informational spaces, as well as through the structural bias of the information they attempt to organize. In particular, it is shown that search engines are highly biased toward commercial and popular US-based content, supporting US-centric priorities and agendas. Consequently, it is suggested that together with their important role in "organizing the world's information", search engines reinforce certain inequalities and understandings of the world.

Keywords

Search engines; Information inequalities; Misinformation; Structural biases; Politics of online information; Google Earth



Introduction

Realizing the potential of the Internet as a source of knowledge, and through it of social, political and economic advantage, many scholars have attempted to identify the factors that lead to certain inequalities and to the emergence of the so-called "digital divide" (Anderson, Bikson, Law, & Mitchell, 1995; Holderness, 1998; Norris, 2001; DiMaggio et al., 2001; Hargittai, 2003; Lizie, Stewart, & Avila, 2004; Barzilai-Nahon, 2006). The main concern of this paper is to examine the role of search engines in constructing information inequalities and in framing our knowledge in specific ways. It is argued that popular search engines and Internet companies like Google, MSN and Yahoo have increasingly become dominant information agents (Hopkins, 2007) and hubs that link various actors. They have gained worldwide popularity by producing, organizing, distributing, customizing and manipulating online information. These companies have acquired a strategic position particularly because they address the insatiable need of the information society - the need for immediate and relevant information. Governments, organizations, companies and individuals increasingly depend on obtaining and producing immediate and relevant online information to fulfil their everyday tasks. Thus, both information providers and information retrievers often turn to search engines in order to support their functions and gain or retain various advantages.

It has been suggested that 40 percent of the visits to websites in the US are search-specific visits (Hitwise, 2004). Another source shows that search engines are one of the major ways of reaching websites (Hopkins, 2007). This paper focuses mainly on Google, the leading search engine and one of the most visited Internet properties nowadays, with more than 500 million unique users1 every month all over the world (Comscore, 2007). It is suggested that the tremendous popularity of Google and similar search engines (hereafter: information agents) elevates them to the status of "authority sites" (Hurlbert, 2004), with the power to channel the information flow and influence our knowledge in specific ways.

Search engines in general and Google in particular empower online users and information producers to retrieve more relevant information and reach greater audiences, but they can also increase the gap between online users based on education, information skills and financial resources. From the bottom-up view, the politics of online information is the ability of individuals to exercise information skills and utilize search engine interfaces in order to find relevant information or to be found online (Introna & Nissenbaum, 2000). From the top-down view, it is the operation of the corporations and organizations that own "authority sites", which provide, organize and customize online information, and their interactions with governments and states. Together these dominant actors shape our information society, and thus their strategies and tactics require investigation.

This raises the question of technological and information inequalities. Although it may be presumed that individuals nowadays have much more power and freedom to obtain and produce diverse views online, various examples in this paper will demonstrate a more complex picture. The variety of online services introduced by information agents to enhance customization (e.g. Google News, Google Scholar, Google Maps or Google Toolbar) and to provide users with web-spaces for producing their own content (e.g. Gmail, YouTube, Picasa and web-log services) creates dependency and progressively "locks" users into specific websites. Moreover, the ability of information agents to collect and store personal data and extend their customization power potentially threatens the privacy of their users. This is to suggest that together with greater empowerment and control for their users, information agents also increase their own ability to control the information flow and its uses.

In order to examine the politics of search engines, this paper follows their short evolution from small companies to worldwide media oligopolies, examines their different practices and operations and analyzes the content they produce. By doing so, it also identifies some of the important actors (i.e. states and organizations) of the information society, placing search engines in a central position within this global network with major political, social, cultural and economic advantages. It is argued and indicated that the dominant American search engines tend to commodify online information and intensify the asymmetry of information flow worldwide, supporting the growth of mainstream, commercial and very often US-centric information. It is therefore suggested that together with their important role in organizing the Web, search engines reinforce certain inequalities and understandings of the world.

The Rise of Internet Search Engines

The history of information retrieval goes back to the invention of writing. Although retrieving information from papyrus scrolls was not always efficient, the Greeks and the Romans developed various methods, such as tables of contents, hierarchical chapters and alphabetization systems, which were used, for example, in the famous library of Alexandria (Skydsgaard, 1968). The term "index" was used in this context to refer to the little slip attached to the papyrus scroll, on which the title and sometimes the name of the author were written. Later the term was extended to refer to a list of titles. As long as information was stored in papyrus scrolls, it was difficult to indicate the exact position of content within a scroll. Only with the advent of paper and then printing, when information came to be stored in identical book copies, did it become possible to add page numbers to the index, thus greatly facilitating information search (Wellisch, 1991).

During the 1950s the technical potential to store information in digital media encouraged scientists to develop automatic mechanisms of search and information retrieval based on the earlier librarian model. In this model, a user who has some initial information need fashions a request as a query, and the system returns a list of relevant documents (Ramana, 2004). One of the greatest challenges of this model is that users must reduce their information needs to a search query, and search mechanisms are supposed to "guess" the users' requirements from this search query. Automatic search systems are required to build an index that links search queries with relevant documents, and therefore to extract, summarize, classify and eventually visualize content in friendly interfaces. At any point in this complex process, it is possible that users will translate their information needs into inadequate search queries, or that the system will omit relevant documents from its index or its search results.
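To make the gap between information needs and search queries concrete, the following is a minimal sketch of this query-index-documents model. The toy documents, the whitespace tokenization and the exact-match scoring are illustrative assumptions, not a description of any real retrieval system.

```python
# A minimal sketch of the query-index-documents model described above.
# The documents and matching rule are illustrative assumptions.
from collections import defaultdict

documents = {
    1: "the politics of online information",
    2: "search engines and information retrieval",
    3: "online maps and aerial imagery",
}

# Build an inverted index: each term maps to the documents containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Return documents matching every term of the query.

    The gap the paragraph describes is visible here: the user's
    information need must first be reduced to a handful of query
    terms, and any document missing from the index can never appear.
    """
    results = set(documents)
    for term in query.lower().split():
        results &= index.get(term, set())
    return [documents[d] for d in sorted(results)]

print(search("online information"))  # -> ['the politics of online information']
```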

The first search engine, Archie, was developed in 1990, and was based on downloading the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, thus creating a searchable database of filenames. A year later Gopher was introduced, enabling online users to search within the content of plain text files. In 1993 various crawl-based search engines, also known as "robots", "spiders" or "crawlers", were developed (e.g. Wandex, WebCrawler and later commercial ones such as Lycos, Excite, Infoseek, Inktomi, Northern Light and AltaVista). The basic principle of these search engines, which still applies today, is to follow hyperlinks from one website to another and retrieve their content, building an index that connects keywords or search queries with URLs (Battelle, 2005).
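The crawl-and-index principle can be sketched in a few lines. The miniature in-memory "web" below is an illustrative assumption standing in for real HTTP fetching and HTML parsing; what the paragraph describes is the logic of following hyperlinks and mapping words to URLs.

```python
# A minimal sketch of a crawler: follow hyperlinks breadth-first and
# map each word to the URLs on which it appears. The fake pages are
# illustrative assumptions.
from collections import defaultdict, deque

# Each fake page has some text and a list of outgoing links.
web = {
    "http://a.example": ("search engines index the web", ["http://b.example"]),
    "http://b.example": ("the deep web is not indexed", ["http://a.example", "http://c.example"]),
    "http://c.example": ("dynamic pages without links", []),
}

def crawl(seed):
    index = defaultdict(set)
    seen, frontier = set(), deque([seed])
    while frontier:
        url = frontier.popleft()
        if url in seen:
            continue
        seen.add(url)
        text, links = web[url]
        for word in text.split():
            index[word].add(url)
        frontier.extend(links)  # follow hyperlinks to new pages
    return index

index = crawl("http://a.example")
print(sorted(index["web"]))
# A page that no crawled page links to never enters the frontier --
# exactly how unlinked "deep web" pages stay invisible to crawlers.
```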

Unlike human-mediated information search (e.g. librarian assistance), in automatic search systems the potential of users to find relevant and valuable information depends almost entirely on their information skills, and even more so on the classification and indexing algorithms. Search engine companies, which develop the code and are therefore responsible for the operation of the search mechanisms, are constantly required to enhance their indexing mechanisms, increase their information coverage and gather more personal information about their users in order to customize their search results.

Accessing the Deep Web

Search engines, however, cover only part of the World Wide Web. Computer scientists have defined online information that cannot be accessed through search engines as the "deep web". For example, Google (with the greatest search power among popular search engines) can search billions of documents, yet this is estimated to be a small part of the entire Web2. Since search engines build their index by following hyperlinks, they can usually reach only static webpages that are linked from other webpages. Their crawlers cannot reach dynamic database webpages, which have no incoming links and can be generated only by search queries within the websites themselves.

Most of the deep web consists of searchable databases such as e-commerce websites, library catalogues, phonebooks, law databases, and so forth. Another kind of content that technically cannot be included in search engines is password-protected websites (e.g. e-journals or premium content that require subscription). Finally, there is a lot of information on the Web in PDF, Flash, Microsoft Word, PowerPoint or image formats, rather than in HTML, the basic Web language. Some search engines choose to exclude this kind of information because of the complexity of indexing it.

One of the main problems in indexing the deep web is the fact that the Web is constantly growing. Even as search engines enhance their ability to index the deep web, many more new webpages are added to the World Wide Web. Since the Web is currently growing much faster than search engine indices (Barabási, 2002), the deep web is constantly growing as well. While search engines can cover only a small part of the entire Web, they tend to emphasize the quality (rather than the quantity) of search results and the ability to customize online information. This trend makes search engines very useful in finding popular information. However, search engines are a less appropriate tool when it comes to finding very specific, relatively new or esoteric information, which may be found only in the deep web.

Together with the growing depth of the Web, new search engines and searching tools emerge. The idea of these new search engines is to integrate valuable and content-rich databases and to provide cross-search services. Complete Planet, for example, is a special search engine that performs cross-searches in more than 70,000 database websites. Similarly, there are a number of domain-specific "vertical" search engines, such as GlobalSpec, which crawls only engineering websites and databases to provide comprehensive catalogue-based information about engineering parts (Battelle, 2005). At the same time, the popular "horizontal" search engines (i.e. Google, Yahoo and MSN) constantly develop their technologies to include more searchable databases and to integrate into their search results more clusters of unlinked webpages. Thus, Google has launched Google Scholar, Google News, Froogle and parcel and patent tracking, which can perform searches within specific databases. Similarly, in March 2004 Yahoo launched its Content Acquisition Program, integrating into its search results more valuable content from the databases of non-commercial sites such as National Public Radio or UCLA's Cuneiform Digital Library Initiative (Sherman, 2004).

The deep web may increase the gap between online users, since reaching information without the help of search engines often requires extra financial resources and information skills. Users have to be aware of the existence of particular websites so that they can reach them directly, pay membership fees for password-restricted information (e.g. e-journals), use specific search engines that harvest the deep web (e.g. Complete Planet or GlobalSpec), or have an extensive knowledge of code and the Internet infrastructure (e.g. hackers). However, indexing the deep web does not necessarily mean equal opportunities for all online users. Even if the deep web eventually becomes part of the search results, the strict and discriminating rules of search engines will still apply (i.e. page-ranking mechanisms, paid promotion deals and greater dependence).

In order to enhance their search engine ranking and achieve greater audience reach, website owners increasingly have to give up their privacy and independence, providing the search engine corporations with greater power and control. In February 2006, for example, Google launched Google Desktop 3, software that enables users to search for personal files and e-mails on their own computers, integrating them with the Web's search results. A couple of weeks later a leading US digital rights campaign group warned against using this software, since it posed certain risks to privacy. Some features in the software enable Google to store private files, e-mails, chats and search histories on its servers. Marissa Mayer, Google's vice president of search products and user experience, admitted the problem: "We think this will be a very useful tool, but you will have to give up some of your privacy. For many of us, that trade off will make a lot of sense" (BBC News, 2006a). Freedom of information is often in conflict with privacy, and the trade-off or balance between the two is a major challenge for the information society. On the one hand, greater access to online information may provide opportunities and advantages to a variety of actors. On the other hand, it may pose a threat to privacy and therefore also to the broadly conceived security of individuals.

Standing between governments and individuals, search engines are often used by governments to track down unlawful users and to limit their freedom of information. The Chinese Government, for example, used e-mails obtained from Yahoo to catch and imprison a local journalist (Nystedt & McCarthy, 2006). Similarly, the Brazilian and Indian police used Google to identify online users who posted illegal information on its social network service, Orkut, advocating violence, crime or human rights violations (Fox News, 2006; Chowdhury, 2007).

These examples suggest that freedom and privacy of information are constantly contested. May (2002) suggested that many individuals tend to give up their privacy in order to gain greater economic advantages (through companies' customization of services and products) and better security (through government surveillance). One example is the ability of search engine companies to keep the search histories of their users. This information may be beneficial for users, who get more customized and relevant search results (as with Google Personalized Search). Yet it is even more beneficial for the search engine companies, which can learn a great deal about their users' habits, preferences and interests. With the continuous developments in information and communication technologies, companies can gather a great volume of personal information about the consumption habits of their customers (through their consumption history and credit records) and even about their movements in space (through the records of mobile phone movements). Although the increasing capacity of companies to monitor personal life can pose a threat to privacy, it is still relatively unrecognized and often perceived as unproblematic (May, 2002), since the benefits of improved services seem to outweigh the risks.

Search Engine Competition

There is constant competition between search engines to control the online information market by covering more parts of the deep web. In 2005 Yahoo made a small part of the deep web searchable by launching Yahoo Subscription, contracting with some popular e-journals like the Wall Street Journal Online and IEEE to provide premium content to its subscribers for a fixed rate (Sherman, 2004). Similarly, Google has developed relations with publishers, aggregators and libraries and launched Google Scholar, which enables searching for online scholarly literature. In order to read many of the articles listed in Google Scholar results, online users are required to enter a password or subscribe to the publisher's services. Nevertheless, online abstracts and databases that were previously part of the deep web are now indexed and thus gradually become available to all, offering a glimpse of what is available to subscribers. Google Books, a project of scanning library books and making parts of them searchable and available online, exhibits this creeping expansion of information spaces even further.

As part of the expansion race, Google is currently looking to capitalize on the fast-growing online population in the Middle East and North Africa. It is setting up offices, hiring staff and translating its interfaces (Google News, Google Maps, Google Scholar, Gmail, etc.) into Arabic (Wallis, 2006). Thus, in June 2006 Google launched an Arabic version of Google News. The ability to search within Arabic news sources is another step in the quest to make the deep web accessible and searchable. However, indexing the deep web with Google News may further polarize the information society. By pulling information from various Arabic news sources,3 Google News may undermine the significance of national and regional news sources, intensify more popular political trends, and further marginalize alternative views in the Arab world.

An interesting initiative was taken by the French and German Governments to develop the next generation of search engines. The project, called Quaero, was conceived in April 2005 to challenge the dominance of American search engines in general and Google in particular, aiming to increase the European role in the production and distribution of online information. As the former French President, Jacques Chirac, put it: "We must take up the challenge posed by the American giants, Google and Yahoo. For that, we will launch a European search engine, Quaero" (O'Brien, 2006).

One of the main ideas behind this half-billion-euro project (Chrisafis, 2006) is to widen search technology to include multimedia search. It is meant to use techniques for recognising, transcribing, indexing and translating audiovisual documents in several languages. Eventually, online users will be able to find patterns, shapes and colours of images, as well as words and sounds from songs or movies. Moreover, Quaero claims to be able to provide greater control over copyrights, intellectual property and cultural heritage. The Quaero project suggests that governments and political leaders have realized the tremendous impact of search engines on cultural, social and economic matters. In his speech Chirac argued that "culture is not merchandise and it cannot be left to the blind forces of the market. We must staunchly defend the world's diversity of cultures against the looming threat of uniformity" (Litterick, 2005).

The project was criticized as being very small in terms of budget and technological capacity compared with similar projects of Microsoft or Google. Some search experts called it "a blatant case of misguided and unnecessary nationalism" and warned that by the time Quaero is developed the market will have moved on (Chrisafis, 2006). Finally, there were fundamental disagreements between France and Germany regarding the use and technical aspects of the search engine. While France favoured a multimedia search engine, Germany favoured a text-based one. As a result, many German engineers refused to be associated with what they thought was becoming an anti-Google project rather than an innovative and independent one. In December 2006, the German Government withdrew from the project to focus on its own domestic search engine, called Theseus (O'Brien & Crampton, 2007). Similar projects for multimedia search are currently being developed by other governments (e.g. the Norwegian Pharos) and backed by funds from the EU, which has recognized the increasing importance of search engines.

Nonetheless, economies of scale are a very important factor in the growth and success of search engines. As with other initiatives taken by international organizations such as UNESCO, ITU and UCLG to tackle information inequality and prevent US dominance (which had limited success and were often overshadowed by commercial forces), it has been suggested that any new type of search engine will either be acquired by one of the bigger American hubs or become commercialized in order to support its competitive position. Consequently, it will not be able to sustain information equality and a diversity of alternative views, and will mainly represent the views of the richer and more popular nodes.

Regulation and Manipulation of Online Information

Google provides global and local online information in 112 languages. Together with technical adjustments, international agreements helped Google expand its information services all over the world. The global reach of Google and the open access to information that it provides have often been at odds with national policies. Thus, global information inequalities can also result from Google's need to comply with local laws and practices. One of the better-known conflicts has been that between Google and the Chinese Government. It started rather contingently in 2002, just before the 16th National Congress of the Communist Party of China (CCP), when Jiang Mianheng, the son of the former President Jiang Zemin, visited the No. 502 Research Institute of the Ministry of Information Industry (MII), attending a demonstration of the second-generation broadband Internet and its high-speed search facilities. In order to please him, an engineer typed the name of his father, "Jiang Zemin", into the Google search engine, and surprisingly one of the first results was titled "Evil Jiang Zemin". Jiang Mianheng immediately ordered the blocking of Google's website in China (Tianliang, 2005).

However, two weeks later on 13th September 2002, due to public pressure from the thirty million Chinese Google users at that time, China ended the ban and Google was available again in China. Yet users reported broken links on Google's results when they searched for sensitive information such as that on Tibet, Taiwan, President Jiang Zemin, Falun Gong, and the Tiananmen Square revolts. The Chinese Government has never publicly announced its intention to ban Google and other popular search engines (such as AltaVista), or to re-open access to some of them (Deans, 2002). Similarly, Yahoo China excludes search results of anti-governmental nature and even information on the democratic system. A person who is located in China or uses Chinese versions of popular search engines will still find it difficult to retrieve information about the Tiananmen Square revolt or the Falun Gong.

Censorship and government control are not restricted to non-Western states. For example, in January 2006 the US Government required the biggest American search engines to provide a list of all search queries conducted over a one-week period, as well as a random list of one million websites that appeared in their search results. The US Government claimed it needed this list in order to revive the 1998 Child Online Protection Act (COPA). Yahoo, America Online and Microsoft immediately complied with the request. Only Google initially refused to provide the data, claiming it would violate the privacy of its users and reveal company trade secrets (Walker, 2006; CBS News, 2006). Three months later, in March 2006, a federal judge ordered Google to turn over some of the records demanded by the government, which it did (Schmidt, 2006).

In some cases, search engines may themselves function as summary courts, administering Internet justice. A punishment of exclusion from the network, that is, removal from the list of results, occurs if website owners break local laws or try to hack and manipulate the search mechanism. Since laws differ from country to country, Google employs a team of international lawyers who can advise on matters of human rights or copyright abuse on a case-by-case basis. The punishment is enforced locally; thus, for example, according to a report from the Berkman Center at Harvard University, Google France and Google Germany exclude websites that are anti-Semitic, pro-Nazi or related to white supremacy (McCullagh, 2002). Although such websites are not listed in the search results of Google France and Google Germany, they are still listed on the main domain, Google.com, and on every other international domain of Google, which means that, in this case, national laws are only partly observed.

The complex and often contradictory practices of search engines indicate that online laws and their enforcement are not always clear and are mostly subject to local rather than global sovereignty. However, the global reach of the Internet and its ability to bypass national boundaries have led to the emergence of global norms (e.g. the assignment of domain names by one authorized organization), and increasingly also trans-national social standards (e.g. freedom of expression, data protection, digital signatures and even some laws against online crime in the EU) (Perri 6, 2002; Van Dijk, 2005).

While online crimes are still subject to national laws, the new dynamics of the Internet have brought about new forms of cyber-crime. Knowing the basic principles of Google's page ranking, some website owners attempt to artificially increase the number of links to their websites in order to appear first in the search results. Walker (2002) suggests that web-links have become a common unit for estimating online popularity and value, and therefore many try to manipulate search results and increase their website value through link exchanges with other websites. Some have developed "link farms", that is, networks of websites without any content at all but with a massive number of links. Website owners can purchase links from a link farm owner, thus increasing their ranking on Google. This trade is also known as the "black market for links".
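A minimal sketch of link-based ranking illustrates why link farms can work. This is a generic power-iteration computation in the spirit of PageRank, not Google's actual algorithm; the four-page graph, the damping factor and the iteration count are illustrative assumptions.

```python
# A minimal power-iteration sketch of link-based ranking (PageRank-style).
# The link graph and parameters are illustrative assumptions.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "FARM": ["A"],  # a "link farm" page pointing at A to inflate A's score
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            if not outgoing:
                continue
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new[target] += share  # each page passes rank to its link targets
        rank = new
    return rank

# Every extra page that links to "A" adds to A's score -- which is
# precisely the effect a purchased link-farm link is meant to produce.
for page, score in sorted(pagerank(links).items(), key=lambda x: -x[1]):
    print(f"{page}: {score:.3f}")
```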

Other manipulations of search results, with political overtones, are also exercised. A Google search for the queries "miserable failure" and "failure" returned until recently4 the personal webpage of George W. Bush. Similarly, a Google search for the query "liar" returned the personal webpage of Tony Blair. This manipulation is possible because Google counts how often a site is linked to, and with which words (also known as "anchor text"). Hence, information-skilled users and those who produce online content through web-logs and forums can group together to manipulate search results, a phenomenon also known as a "Google Bomb" (BBC News, 2003). Empirical tests indicate that achieving a Google Bomb effect does not require a large number of websites, but rather a small number of dedicated online users (Bar-Ilan, 2007; Battelle, 2005).

The technique of Google Bombing, that is, manipulating search results by deliberately cross-linking certain words to certain websites, was widely used in the 2006 election campaign in the USA. Many Republican candidates became targets of Google Bombing campaigns, so that searching for their names on Google yielded negative campaigning and criticism among the top results (Zeller, 2006). Still, politicians and political parties also exploit search engines, associating their websites with certain keywords and investing in search engine optimization. Using Google's advertising program, AdWords, politicians can either promote their party or fight against their competitors, so that a search for their competitors will lead to negative advertising.

Although these practices of diverting search results are not yet illegal in any state, Google is aware of such attempts to compromise the "integrity" of its page ranking system. Once it discovers a link farm website, it permanently excludes the farm and its clients from the search results. Hence, while there is no punishment for link "prostitution" at the national level, a global regulation system is developing online, with its own mechanisms of reward (i.e. inclusion or promotion) and punishment (i.e. exclusion or marginalization).

Apart from diverting search results, hackers and cyber-criminals have developed a further mechanism, known as "click fraud", to divert revenues from search engines. Under the Cost-Per-Click (CPC) model, search engines are paid whenever someone clicks on a sponsored link. With the introduction of AdSense, Google provided website owners with the ability to further distribute advertisements through their own websites as resellers. Consequently, click fraud occurs when a person or an automated script imitates the legitimate action of clicking on an advertisement in order to generate illegitimate revenues or to harm competitors. Click fraud is considered a felony in some jurisdictions and has therefore been prosecuted by the competent national authorities.
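As a rough illustration of how such abuse might be detected, the sketch below flags IP addresses that click the same advertisement unusually often within a short window. The log format, the threshold and the window size are illustrative assumptions, not a description of any real ad network's defences.

```python
# A minimal click-fraud heuristic: flag IPs with too many clicks on the
# same ad per time window. Log entries and thresholds are illustrative.
from collections import Counter

# (timestamp_in_seconds, ip_address, ad_id) for each billed click
clicks = [
    (0, "10.0.0.1", "ad-42"),
    (5, "10.0.0.1", "ad-42"),
    (9, "10.0.0.1", "ad-42"),
    (12, "10.0.0.1", "ad-42"),
    (30, "10.0.0.2", "ad-42"),
]

def suspicious_ips(clicks, window=60, threshold=3):
    """Flag IPs that click the same ad more than `threshold` times per window."""
    counts = Counter()
    for ts, ip, ad in clicks:
        counts[(ip, ad, ts // window)] += 1
    return {ip for (ip, ad, bucket), n in counts.items() if n > threshold}

print(suspicious_ips(clicks))  # -> {'10.0.0.1'}
```

Real detection systems combine many more signals (session behaviour, conversion rates, device fingerprints); the point here is only that automated, repetitive clicking leaves a statistical trace.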

The Politics of Online Mapping

The excitement that the advent of Google Earth has brought to online users worldwide has been matched by the anxiety of national governments over the open online exposure of military and security installations. A careful examination of these online tools, however, reveals that some governments have considerably more control than others over the censorship and distribution of sensitive information. This is particularly true in the case of Google Earth and Google Maps, since most high-resolution images are purchased from US-based companies that are subject to US law. Some US allies may also benefit from this dominance over the distribution of online information. US law requires, for example, that certain images of Israel shot by American-licensed commercial satellites be made available only at a limited resolution (Kalman, 2007; Hafner & Rai, 2005). One example among many is the US Naval Observatory, which is obscured on Google Maps for national security reasons (see Figure 1).

Figure 1 - The US Naval Observatory, obscured on Google Maps (accessed June 2006)

While this reflects self-censorship by aerial imaging companies as a result of national laws, the free availability and growing popularity of these images on Google Earth and Google Maps have persuaded governments to contact Google directly and work with it to censor certain "sensitive" images. Although these commercial images had been available for purchase from aerial imaging companies for many years, with the introduction of Google Earth and Google Maps they became more freely and easily accessible, and have thus attracted much more attention and use. An interesting example in this context is that of the aerial images of British military installations in Iraq. Until January 2007, Google Earth displayed relatively recent imagery of the British headquarters in Basra. However, after an alarming report in the Daily Telegraph (Harding, 2007a) about terrorists using Google Earth, there was an imagery "update". The "current" Google Earth image of the same place in Basra is actually older, preceding the Iraq invasion and showing no military installations. Figure 2 displays this change as published on Ogle Earth (Geens, 2007).

Figure 2 - The British headquarters in Basra, before and after Google's imagery update (source: Ogle Earth, accessed January 2007)

The left-hand image, which was available on Google Earth until January 2007 and revealed the entire British headquarters in Basra, is likely to have been taken in late 2004 or 2005, while the updated image on the right is from 2002. This discrepancy shows that Google replaced newer imagery with older imagery at the request of the Coalition forces in Basra. Similarly, it has been reported that Google blotted out various sensitive installations, including the Trident nuclear submarine pens in Faslane, Scotland and the eavesdropping base at GCHQ, Cheltenham, at the request of the British Government in order "to hinder terrorist attacks" (Harding, 2007b).

While Google Earth technology is used by insurgents to plan attacks on military installations in Iraq, and by Western Governments to censor those same targets, a report from BBC News (North, 2007) indicated that Google Earth was also used to help people survive sectarian violence in Baghdad. Users established websites and integrated Google Earth's detailed imagery of Baghdad in order to plan escape routes from the violence of Shia vigilante police forces in local neighbourhoods. Thus Google Earth can serve as a crucial means of achieving political goals, or, in this particular example, as a means of immediate survival.

Following Google's cooperation with the American and British Governments, the Indian Government made a similar attempt to modify the information on Google Earth. In February 2007, during a meeting between officials from the Indian Ministry of Science and Technology and Google Earth representatives, it was decided to camouflage certain military and scientific installations identified by the Indian Government (Deshpande, 2007). As it was felt that lower-resolution or blurred images might attract further unwanted attention to sensitive locations, Google agreed to creatively distort buildings by adding structures and carefully masking certain aspects of the facilities without attracting the attention of users. It is therefore clear that Google Earth also engages in deliberate misinformation about the world when it is "convinced" of the need to do so.

Nonetheless, the fact that Google cooperates with certain governments and obscures sensitive installations does not always prevent the online community from searching for discrepancies and reconstructing high-resolution images from uncensored versions of the same places (which were previously available on Google or are still available in many other online map services). Hence, it is suggested that behind the so-called "transparent" services there are various political, economic and increasingly informational forces that continuously shape and reshape the representation of the world.

The Quest for Alternative Search

Together with the five big American corporations (i.e. Google, Yahoo, MSN, AOL and Ask.com) that control more than 90 percent of the search engine market (Nielsen//NetRatings, 2006), there are many5 small search engines that provide alternative means of information retrieval. The long tail theory, which shows empirically that profit lies in the economics of abundance (Anderson, 2006), stresses the accumulative power of the products and services that are in less demand. Applied to the search engine market, it suggests that although the many small search engines are not as popular and influential as the five big ones, together they may still provide a large volume of information to a substantial number of users. Moreover, in order to survive they are required to offer innovative and useful alternatives for information retrieval that cannot be found in the bigger search engines. Consequently, they may develop important and useful technologies that could later rival the conventional search technologies, or be acquired by one of the popular search engines and become a dominant search method. Hence, it is important to study the various alternatives that are currently available on the margins of the search engine market.

Knight (2007) reviewed some of the recent developments in alternative search engines. He showed that some search engines, such as Ms. Dewey, attempt to challenge the "clean" and simple interface of Google by providing a visually appealing persona that interacts with users and conducts their searches. An interesting approach in the field of information retrieval is the application of natural language processing (NLP) technologies. It aims to "humanize" the interface and the interaction between users and search engines, enabling users to ask questions and get answers in their own language. Ask Jeeves was one of the first commercial search engines to implement artificial intelligence technologies for recognizing questions in natural language. Hakia is a more recent search engine that makes use of semantic analysis and categorizes information by meanings and subjects. Other search engines (e.g. ChaCha and the former Google Answers) introduced human-mediated search, where online users can ask questions in their own language and other users help them retrieve the answers.

Another interesting direction of development is the visual presentation of search results. While conventional search engines display results in a long one-dimensional list, some search engines (e.g. KartOO and Quintura) display a two- or three-dimensional map of interconnected websites. They cluster search results into topics, enabling online users to navigate through the map in search of the most relevant answers. For example, a search in KartOO for the word "empire" provides a two-dimensional map that clusters websites related to politics, education, shopping and entertainment, enabling users to "zoom in" and focus on their desired content.
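A minimal sketch of such topic clustering might group results by term overlap, as below. The toy results, the Jaccard similarity measure and the threshold are illustrative assumptions; systems like KartOO use far richer features to build their maps.

```python
# A minimal sketch of clustering search results by term overlap, in the
# spirit of map-style interfaces such as KartOO. All data are illustrative.
results = {
    "empire-history.example": "roman empire history politics",
    "empire-state.example": "empire state building tourism",
    "empire-shop.example": "empire costumes shopping store",
    "rome-politics.example": "roman politics senate history",
}

def jaccard(a, b):
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b)

def cluster(results, threshold=0.2):
    """Greedily group results whose term overlap exceeds the threshold."""
    clusters = []
    for url, text in results.items():
        for group in clusters:
            if any(jaccard(text, results[other]) >= threshold for other in group):
                group.append(url)
                break
        else:
            clusters.append([url])
    return clusters

for group in cluster(results):
    print(group)
# The history/politics pages land in one cluster, tourism and shopping in
# others -- the kind of topical grouping a map interface would display.
```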

Some alternative search engines elaborate on the multi-dimensional display of search results, also adding recommendations of related information. When users search for certain artists, music or films, these search engines (e.g. Music Map and Live Plasma) automatically recommend other related titles that the users might never have heard of but would most likely enjoy. This mechanism, which was popularized by Amazon6 and is increasingly used by e-commerce websites, has important implications for the diversity of content and the growing long tail of information production and consumption.
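A minimal item-based collaborative-filtering sketch, in the spirit of the "people who liked X also liked Y" mechanism mentioned above (see also endnote 6), could rank artists by how often they co-occur in users' listening sets. The listening data and the simple co-occurrence scoring are illustrative assumptions.

```python
# A minimal item-based collaborative-filtering sketch.
# The listening data are illustrative assumptions.
from collections import Counter

listening = {
    "user1": {"Radiohead", "Portishead", "Massive Attack"},
    "user2": {"Radiohead", "Portishead"},
    "user3": {"Radiohead", "Massive Attack", "Bjork"},
    "user4": {"Bjork", "Portishead"},
}

def recommend(artist, listening, top=3):
    """Rank other artists by how often they co-occur with `artist`."""
    co_counts = Counter()
    for artists in listening.values():
        if artist in artists:
            co_counts.update(artists - {artist})
    return [a for a, _ in co_counts.most_common(top)]

print(recommend("Radiohead", listening))
# -> ['Portishead', 'Massive Attack', 'Bjork'] (ties may appear in another order)
```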

Finally, some alternative search engines (e.g. Dogpile, Zuulu and GoshMe), also known as meta-search engines, have developed the capacity to aggregate search results from all the popular search engines and from some specific vertical search engines, attempting to cover the largest possible share of the deep web. One should note, however, that since they rely on the search mechanisms of the popular search engines, they also contribute to their traffic and do not fundamentally change the information order or the processes of information retrieval.
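Rank aggregation of this kind can be sketched with a simple reciprocal-rank count, as below. The engine result lists are illustrative assumptions; real meta-search engines also deduplicate results, normalize scores and rewrite queries for each underlying engine.

```python
# A minimal meta-search sketch: merge ranked lists from several engines
# by giving each URL reciprocal-rank credit. All data are illustrative.
from collections import defaultdict

engine_results = {
    "engineA": ["x.example", "y.example", "z.example"],
    "engineB": ["y.example", "x.example"],
    "engineC": ["y.example", "w.example", "x.example"],
}

def aggregate(engine_results):
    scores = defaultdict(float)
    for results in engine_results.values():
        for position, url in enumerate(results):
            scores[url] += 1.0 / (position + 1)  # reciprocal-rank credit
        # URLs an engine did not return simply receive no credit from it
    return sorted(scores, key=scores.get, reverse=True)

print(aggregate(engine_results))  # -> ['y.example', 'x.example', ...]
```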

Apart from commercial search engines, there are hundreds of open source search engines (e.g. Nutch) developed by the online community (Battelle, 2003). These applications are not owned by private commercial companies, and thus enable anyone to use, modify and even profit from them as long as they contribute to the project. The main difference between commercial and open source search engines is that the latter are transparent; in other words, online users can inspect their indexing and page-ranking mechanisms. Another important difference is that there are more skilled programmers who can potentially contribute to the development of an open source search engine than are employed by any commercial search engine. Wikia Search is a recent example of a non-commercial search engine promoted and supported by Wikia Inc. The four principles that currently guide its development, and that exemplify the search for alternative solutions, are transparency (i.e. openness of the system and its algorithms), community (i.e. everyone can contribute and take part in the evolving enterprise), quality (i.e. improvement of search results based on relevancy rather than commercial considerations) and privacy (i.e. not storing any identifiable data about users and their information preferences).7 Nonetheless, it has been argued that economies of scale play an important role in the search engine market. As long as indexing technologies require a massive amount of storage,8 there is an obvious advantage to the commercial model of search engines. Only if a new technology enables private computers to serve voluntarily as a shared resource for global storage through the Internet will open source search engines be in a position to challenge commercial ones.

Although there is an interesting and diverse market of alternative search engines, statistics indicate that the long tail of search engines is getting smaller; that is, the most popular search engines continuously increase their share, while alternative search engines receive fewer visitors (Hitwise, 2006). Moreover, the popularity of alternative search engines and their traffic ultimately depend on their ranking in the bigger search engines, and particularly in Google. Mowshowitz and Kawaguchi (2002) have summarized the possible implications of the search engine oligopoly:

The only real way to counter the ill effects of search engine bias on the ever expanding web is to make sure a number of alternative search engines are available. Elimination of competition in the search engine business is just as problematic for a democratic society as consolidation in the news media. Both search engine companies and news media firms act as intermediaries between information sources and information seekers. (Ibid: 60)

As with other commercial media channels, on the Internet there is a tendency to increase the diversity of content, while decreasing the diversity of forms and standards. While the variety of search queries increases, the number of search engines that control the online information market decreases. While the long tail of search queries becomes a significant economic factor, the long tail of search engines gradually atrophies.

The Future of Search Engines

In order to expand and cover more parts of the deep web, search engines introduced specific channels like Google News, Froogle and Google Scholar, which integrate database websites and gather various "authoritative" information sources into one interface, enabling online users to compare sources and choose the most appropriate product or service (in terms of relevancy, popularity and price). This trend, which turns contemporary search engines into "supermarkets" of information, makes it easier for users to "shop" for information in one place. Yet it also increases the competition between companies over their shrinking place on the "shelf", where the more popular and commercial companies always win.

As part of the race to cover more of the deep web, the next generation of search engines will focus on analyzing and incorporating non-text-based content in general, and multimedia content in particular, into their indices. As previously mentioned, the French government is currently engaged in developing a new search engine, Quaero, which aims to recognize and index audiovisual files (O'Brien, 2006). While contemporary search engines analyze the textual content of webpages, the search engines of the future will be required to use techniques that transcribe, automatically analyze and translate the content of image, sound and video files, maintaining a large multilingual index and enabling efficient retrieval of relevant information.

One of the implications of multimedia search is that it will allow different pictures to be associated with search queries in different languages. Thus, for example, search queries such as "beauty" or "terror" will return images from various countries,9 and not only images from English-language websites. To that end, the analysis and indexing of image and sound files may bring more information to more users regardless of their native language. This does not necessarily imply that multimedia search will challenge the dominance of English and Western culture. The principles of popularization and commercialization will still play a significant role, prioritizing more popular information, most of which still originates in economic and cultural hubs such as the USA. However, multimedia search will open a new frontier for companies to compete for the attention of each user. Together with the growing popularity of video- and image-sharing services (e.g. YouTube10 and Flickr), multimedia search is increasingly becoming indispensable for the information society. By analyzing the actual content of multimedia files, search engines will enhance their influence and control over online information uses, and both widen and deepen the ability of users to retrieve information.

Apart from expanding the quantity of their coverage, search engines are constantly concerned with increasing the quality of their search results. It is expected that, in response to link farms and other manipulations of search results, search engines will further elaborate their page-ranking mechanisms. They will enhance their ability to evaluate the popularity of webpages not only by the number of back-links, but also by the number of visits to websites from their search results. The more people click on a search result on Google, the more popular and "important" a website may be (Osinga, 2004). At the same time, search engines continue to develop various other techniques for evaluating the actual popularity and traffic of a website. Google Toolbar, for example, enables Google to analyze which websites are visited by its users, and thus to estimate their popularity on an individual basis.
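In its simplest possible form, the combined signal described here might look like the sketch below. The weights and the logarithmic scaling are illustrative assumptions rather than any known search engine's formula; the point is only that link counts and observed clicks can be blended into one popularity score.

```python
# A minimal sketch of combining link-based popularity with observed
# clicks on search results. Weights and log-scaling are illustrative.
import math

pages = {
    "a.example": {"backlinks": 1200, "result_clicks": 90},
    "b.example": {"backlinks": 300, "result_clicks": 4000},
    "c.example": {"backlinks": 5, "result_clicks": 10},
}

def popularity(stats, link_weight=0.6, click_weight=0.4):
    # log-scale both signals so a few huge values do not dominate
    return (link_weight * math.log1p(stats["backlinks"])
            + click_weight * math.log1p(stats["result_clicks"]))

for url in sorted(pages, key=lambda u: popularity(pages[u]), reverse=True):
    print(url, round(popularity(pages[url]), 2))
```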

Ultimately, the increasing popularity and significance of search engines as influential and "authoritative" information agents also mean that more website owners are required to consider the search algorithms and modify their websites' content and design accordingly.11 Gradually, search engines will set up global standards of content production, design and database structuring. Website owners who wish to increase their presence and reach more audiences will have to comply with these standards, which will in turn enhance their dependence on search engines.12

Conclusion

This paper has followed the evolution of search engines, looking at their operations and politics, which do not always provide equal opportunities for everyone. It was shown that search engines continuously include, exclude and channel online information to promote their commercial interests and objectives. Page-ranking mechanisms and promotion deals, access to and distribution of private as well as sensitive national information, and the growing dependence of users on these services are all part of the online politics of search engines, governments and individuals that shapes the dynamic information order.

The case study of Google Maps and Google Earth revealed that even behind seemingly "neutral" maps and aerial images there are certain commercial considerations and biases. Since most aerial imaging companies are American, the US Government and its allies have the greatest ability to control and censor certain images, while many other governments are gradually losing such control. Thus, the open information society becomes a bogus term, one that is in practice reshaped and redefined by dominant political and economic actors. The fact that many trans-national information agents are American provides the USA with major military, political and international advantages, as it plays a dominant role in framing what is to be cartographically known.

Far from being exclusively a question of access, information inequalities emerge on the Internet, a most complex, diverse and elaborate multimedia environment, which provides new opportunities for various actors to generate and consume information and to increase their presence and dominance. Its enormous potential for the capitalist market was quickly discovered. As a result, one operating system (Windows), one browser (Internet Explorer) and one search engine (Google) have increasingly become the dominant standards in obtaining and maintaining a certain information order. This paper has argued and indicated that beyond their beneficial mission to "organize the World's information and make it universally accessible and useful", popular search engines also reproduce and intensify inequalities and sustain the US-centric information order.

Acknowledgement

I would like to convey my gratitude and appreciation to Professor Costas Constantinou for his useful comments, suggestions and guidance, and to Keele University for supporting this research. Special thanks to Marion Lupu, John Tresman and Regula Miesch for their careful proofreading of the manuscript.

Endnotes

1. The term "unique users" refers to the number of different individuals that used Google in a specific period. Marketers and website owners track the number of unique users in any given period by registering their IP addresses, browser ID, and so forth (SearchCIO, 2006).
2. Bergman (2001) estimated the deep web to be 550 times larger than the searchable web. More recent studies, however, questioned his methods and measurements, suggesting that his size estimates are far too high (Lewandowski, 2007). Still, although there is no common agreement regarding the accurate size of the deep web, most researchers indicate that search engines today can cover only a minor part of the deep web (He et al., 2007).
3. In June 2006 the Google News website claimed to integrate 150 Arabic news sources, while some news agencies claimed that Google News actually integrates 500 Arabic news sources (Chapman, 2006).
4. The lifecycle of a Google Bomb varies. The search queries "failure" and "miserable failure" linking to the personal page of George W. Bush appeared in December 2003 (BBC News, 2003) and still worked in October 2006. By the end of 2006 Google had made massive changes to its index to tackle this problem.
5. Some meta-search engines such as GoshMe indicate that there are more than half a million search engines on the Web.
6. This process is also known as collaborative filtering, where companies can compare the buying patterns of their customers and recommend relevant products (Broersma, 2001).
7. See also: http://search.wikia.com/wiki/Search_Wikia (accessed 12 December 2007).
8. The Google Grid, for example, stores its index on more than 450,000 servers (Markoff & Hansell, 2006).
9. A Google Image search for the word "beauty" in English, Spanish, Hebrew and Turkish, conducted in January 2007, revealed mostly images of white females. This trend can be explained by the fact that many search results were associated with popular and commercial websites of trans-national cosmetic companies. The increasing role of search engines in framing our knowledge may therefore reinforce commercially biased understandings of the world.
10. It was estimated that in October 2006 YouTube had 72 million unique users (BBC News, 2006b).
11. The growing popularity of SEO (Search Engine Optimization) companies supports this trend.
12. It is interesting to view in this context Google's official announcement that it is going into competition with Microsoft over the software application market with its "Google Apps" (Reuters, 2007).


Bibliographic information of this paper for citing:

Segev, Elad (2008). "Search engines and power: A politics of online (mis-) information." Webology, 5(2), Article 54. Available at: http://www.webology.org/2008/v5n2/a54.html


Copyright © 2008, Elad Segev.