Webology, Volume 2, Number 2, August, 2005

Search Engines and Resource Discovery on the Web:
Is Dublin Core an Impact Factor?


Mehdi Safari
Encyclopedia Islamica Foundation, Tehran, Iran

Received June 5, 2005; Accepted August 13, 2005


Abstract

This study evaluates the effectiveness of the Dublin Core metadata elements on the retrieval of web pages in a suite of six search engines, AlltheWeb, AltaVista, Google, Excite, Lycos, and WebCrawler. The effectiveness of four elements, including title, creator, subject and contributor, that concentrate on resource discovery was experimentally evaluated. Searches were made of the keywords extracted from web pages of the Iranian International Journal of Science, before and after metadata implementation. In each search, the ranking of the first specific reference to the exact web page was recorded. The comparison of results and statistical analysis did not reveal a significant difference between control and experimental groups in the retrieval ranks of the web pages.

Keywords

Metadata, Dublin Core, Resource discovery, World Wide Web, Search engines

Introduction

Granted that the current World Wide Web contains a tremendous amount of information provided by millions of users all over the world, it must be admitted that discovering relevant resources is not easy. The Web has enabled users to publish information electronically and make it accessible to millions of people relatively easily, but as the quantity of information grows, the ability of those people to find relevant materials has decreased dramatically, and the task can be compared to looking for "a needle in the haystack."

To solve the problem of discovering web resources, search engines have been developed that can provide users with a large body of results with a single click. While the value of these tools should not be underestimated, they have many shortcomings as information retrieval systems. For all their power to provide access to an enormous array of information, it has been shown that they find it difficult to cope with the explosion of web resources (Bharat & Broder, 1998; Lawrence & Giles, 1998; Bar-Ilan, 1998/99; Lawrence & Giles, 1999) and, because of their low coverage, cannot be considered perfect tools. Their performance volatility (Rousseau, 1998/99; Snyder & Rosenbaum, 1999), fluctuations and changes in the result sets over time (Peterson, 1997; Bar-Ilan, 1998/99; Rousseau, 1999; Mettrop & Nieuwenhuysen, 2001) and generally low retrieval effectiveness (Gordon & Pathak, 1999) are major shortcomings that are well documented in the literature. The deficiencies of search engines in retrieving relevant resources seem to originate mostly from their strategy of indexing the Web. They indiscriminately harvest whatever they can find and then index that content selectively; coupled with the enormous mass of web resources, this results in overly large retrieval sets with low relevancy. It has become clear that without a more rigorous indexing strategy enforced through some level of meta control, the effectiveness and efficiency of search engines in resource discovery will deteriorate. Since the content of information resources does not itself carry the information needed for it to be indexed effectively, some kind of descriptive information that imposes pre-defined meaning on web content is essential.

The metadata movement for resource discovery on the Web

The highly dynamic nature of web resources (Lawrence & Giles, 1999), both in size and content, as well as their unique characteristics (Heery, 1996), have posed many challenges for using traditional procedures of resource organization and discovery, such as cataloging rules, in the Web environment. The challenges of deploying cataloging rules for digital resources (Beacom, 2000; Huthwaite, 2001; Lagoze, 2000; Weiss and Carstens, 2001) have led to "metadata" being favored as the best means of describing and discovering resources on the Web.

Metadata is a heavily loaded term for which many definitions have been offered. It, in general, may be defined as structured data about data (Burnett, Ng & Park, 1999, p.1212). More specifically, it is a structured set of elements that describes the information resource for the purpose of identification, discovery and use of information (Lee-Smeltzer, 2000, p.206). To encompass the main perspectives on metadata and accurately reflect the current status of its studies, Burnett et al. (1999) define metadata as "data that characterizes source data, describes their relationships, and supports the discovery and effective use of source data" (p. 1212).

Metadata is a recent coinage, though not a recent concept. The above definitions of metadata are usually followed by the observation that libraries have been producing, standardizing and maintaining metadata for a long time: descriptive data such as standard bibliographic information and indexing and cataloging information are all structured data that describe the attributes and contents of an information resource to facilitate its discovery and use, and hence are metadata. However, while the concept of metadata is a familiar one for information professionals, in today's jargon this data is considered to "[be] structured so that it can become machine-understandable as well as machine-readable […] and has largely been identified with issues of Internet resource discovery" (Day, 1999). As Milstead and Feldman (1999) point out, this term "… is generally applied to electronic resources (though it doesn't have to be) and refers to "data" in the broadest sense--datasets, textual information, graphics, music, and anything else that is likely to appear electronically. While the concept includes indexing and cataloging information (information for "resource discovery" in Webspeak), it can go far beyond conventional document representations, such as MARC records."

Today, metadata activities are unprecedented. Because of the exponential growth of information resources on the Web, these activities have expanded beyond the traditional library environment to deal with the problem of effective resource description and discovery. The accelerated growth of the literature on metadata and the rapid decline in the use of the word "cataloging" (Ercegovac, 1999), as well as the emergence of several metadata standards with different levels of richness and complexity from different communities (Heery, 1996; Dempsey & Heery, 1997; Burnett, Ng & Park, 1999), reveal the unprecedented movement towards metadata for resource discovery on the Web.

Dublin Core Metadata Initiative: a simple metadata for the Web

Within the diverse resource discovery activities of the mid-1990s, ranging from unstructured indexing of full-text resources by search engines to richly structured data such as Machine Readable Cataloging (MARC) and Text Encoding Initiative (TEI) records, the Dublin Core metadata standard arose as a means to mediate between these extremes. It originated from a workshop sponsored by the Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA) in March 1995 to "form an international consensus on the semantics of a simple description record for networked resources" (Weibel, Iannella & Cathro, 1997). It was believed that resource discovery is the most pressing need that metadata can satisfy (Weibel et al., 1995). Therefore, only descriptive data elements required to support resource discovery were considered, and data elements covering other characteristics of the resource, such as terms and conditions, archival status, and other types of metadata, were not included (Dempsey & Weibel, 1996).

The primary deliverable from the OCLC/NCSA workshop was a set of thirteen metadata elements, named the Dublin Core Metadata Element Set (or Dublin Core, for short) by the workshop participants. The Dublin Core was proposed as the minimum number of metadata elements required to facilitate resource discovery in a networked environment such as the Internet (Weibel et al., 1995); by the third workshop, the number of elements had increased to 15 (Weibel & Miller, 1997). The element set includes Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, and Rights (Dublin Core, 1999).

The functions of the Dublin Core elements can be categorized into four classes, according to the four elementary uses of the bibliographic data. The IFLA statement on the purpose of bibliographic records identifies four 'generic tasks' the users perform and these records should support (IFLA Study Group on the Functional Requirements for Bibliographic Records, 1998): To find, to identify, to select, and to acquire or obtain the resource. Dublin Core metadata elements support these four generic tasks as follows (Dublin Core Metadata and the Cataloging Rules, 1998):

The implementation of Dublin Core elements on the Web requires a formal syntax. In 1996, a consensus concerning embedding metadata in HTML was reached at the W3C Distributed Indexing and Searching Workshop (Dempsey & Weibel, 1996). Because of the changes in HTML as well as a general need for greater formalization of the syntax, an Internet Draft authored by John Kunze (1999) was released after the debates at the sixth Dublin Core workshop, which explains how to encode Dublin Core elements in HTML. Current implementation of Dublin Core on the Web is often based on metadata embedded in HTML metatags.

Metadata Effectiveness: problem statement

With any metadata schema, there is a question of effectiveness. Does metadata provide a basis for increased effectiveness of retrieval by search engines? While many studies have evaluated search engines from different points of view, few have tested the effectiveness of metadata for resource discovery by search engines. Turner and Brackbill (1998) studied how embedded metadata (HTML metatags) affects the retrieval of web pages. The use of the keywords metatag was shown to cause a significant improvement in the retrievability of a web page. However, another type of metatag (the description metatag) exhibited no improvement in retrieval. Henshaw and Valauskas (2001) studied the effectiveness of Dublin Core metadata, together with HTML keywords and description metatags, in enhancing information retrieval in a suite of specific search engines. Their results suggested that metadata did not play a significant role in increasing the likelihood of a web page being indexed or highly ranked by search engines.

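This study aims to examine the following questions related to the use of four Dublin Core metadata elements, which are likely to be primary search categories for resource discovery:

1. Do Dublin Core elements improve the retrieval rank of a web page?
2. Is retrieval performance of the major search engines improved after embedding metadata into the web pages?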

Methodology

The web pages tested in this study are a group of articles published on the home page of the Iranian International Journal of Science (freely available at: http://www.fos.ut.ac.ir/~journal/iijs.html). At the time of this research, the journal had published 16 articles online (see Appendix A). The articles were submitted to the major search engines (see Table 1). Among these search engines, AOL Search, HotBot and Iwon failed to index the submitted articles, while AlltheWeb, AltaVista, Google, Lycos, MSN Search, Excite and WebCrawler indexed them. Table 1 shows the results of searching for the titles in the search engines. The presence of an article in the database of a search engine is indicated by "+".

As Table 1 indicates, MSN Search indexed only 4 articles and was excluded from the study. The largest set of articles indexed by the largest number of search engines therefore consists of 10 articles and 6 search engines. These articles are marked with "*". Table 2 shows the search engines that indexed the articles and are tested in this study.

Table 1. The presence of articles in the database of Internet search engines
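Article | AlltheWeb | AltaVista | AOL Search | Google | HotBot | Iwon | Lycos | MSN Search | Excite | WebCrawler
1 | - | + | - | + | - | - | - | + | - | +
2 | - | + | - | + | - | - | - | - | - | +
3 | - | + | - | - | - | - | - | - | - | +
4 | - | + | - | + | - | - | - | + | - | +
5* | + | + | - | + | - | - | + | - | + | +
6 | - | + | - | + | - | - | - | - | + | +
7* | + | + | - | + | - | - | + | - | + | +
8* | + | + | - | + | - | - | + | + | + | +
9* | + | + | - | + | - | - | + | - | + | +
10* | + | + | - | + | - | - | + | + | + | +
11* | + | + | - | + | - | - | + | - | + | +
12* | + | + | - | + | - | - | + | - | + | +
13* | + | + | - | + | - | - | + | - | + | +
14* | + | + | - | + | - | - | + | - | + | +
15* | + | + | - | + | - | - | + | - | + | +
16 | + | + | - | + | - | - | + | - | - | -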
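
The selection described above can be checked mechanically. The following is a minimal sketch in Python that transcribes Table 1 and counts, for each engine, how many articles it indexed, then lists the articles indexed by all six retained engines. The names `TABLE_1`, `indexed_counts` and `articles_indexed_by` are illustrative and not part of the study; the marks are copied from Table 1.

```python
# Column order follows Table 1; "+" means the engine had indexed the article.
ENGINES = ["AlltheWeb", "AltaVista", "AOL Search", "Google", "HotBot",
           "Iwon", "Lycos", "MSN Search", "Excite", "WebCrawler"]

# One string of ten marks per article, transcribed from Table 1.
TABLE_1 = {
    1: "-+-+---+-+", 2: "-+-+-----+", 3: "-+-------+", 4: "-+-+---+-+",
    5: "++-+--+-++", 6: "-+-+----++", 7: "++-+--+-++", 8: "++-+--++++",
    9: "++-+--+-++", 10: "++-+--++++", 11: "++-+--+-++", 12: "++-+--+-++",
    13: "++-+--+-++", 14: "++-+--+-++", 15: "++-+--+-++", 16: "++-+--+---",
}


def indexed_counts(table):
    """Count how many articles each engine indexed."""
    counts = {engine: 0 for engine in ENGINES}
    for marks in table.values():
        for engine, mark in zip(ENGINES, marks):
            if mark == "+":
                counts[engine] += 1
    return counts


def articles_indexed_by(table, engines):
    """Return the articles that every engine in `engines` indexed."""
    columns = [ENGINES.index(engine) for engine in engines]
    return [article for article, marks in table.items()
            if all(marks[c] == "+" for c in columns)]


counts = indexed_counts(TABLE_1)
retained = ["AlltheWeb", "AltaVista", "Excite", "Google", "Lycos", "WebCrawler"]
selected = articles_indexed_by(TABLE_1, retained)
print(counts["MSN Search"], selected)
```

Run against the table as transcribed, this yields 4 articles for MSN Search and the ten starred articles (5 and 7 to 15) for the six retained engines.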

Table 2. Search engines tested in this study
Search Engine URL
AlltheWeb www.alltheweb.com
AltaVista www.altavista.com
Excite www.excite.com
Google www.google.com
Lycos www.lycos.com
WebCrawler www.webcrawler.com

In the next step, keywords were extracted from the articles according to their corresponding Dublin Core elements. All of the keywords in the titles were extracted as the value of the element "title"; the keywords provided by the authors in each article were taken as the value of the element "subject"; the first creator of each article was taken as the value of the element "creator" and the other creators (if any) as the value of the element "contributor". For articles in which the subject and title keywords overlapped completely (all of the keywords assigned as subject keywords by the authors also appeared in the title), a keyword was extracted from the abstract as the value of the subject element, so that the title and subject elements shared no keywords. In total, 82 keywords were extracted from the articles.

The keywords were searched using the simple search interface of each search engine, taking into account the way each engine handles queries. Phrases were enclosed in double quotes so that the entire phrase was searched rather than each word separately. The retrieval rank of a web page in a search engine's results list was used to measure the performance of the web page: the higher the retrieval rank (that is, the closer to position 1), the better its performance, and vice versa.

Because the use of 200 results was reported as a retrieval threshold at the first Text Retrieval Conference (Turner & Brackbill, 1998, p.264), the first 200 results of each search were examined as an arbitrary cutoff point, and the ranking of the first specific reference to the exact web page within those 200 hits was recorded. If a search could have retrieved a page but the page was not among the top 200 results, the keyword was given a rank of 201 for that search. Rankings, therefore, ranged from 1 (highest) to 201 (not retrieved).
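
The ranking rule can be sketched as a small Python function. This is an illustration only: the function name `first_rank`, the representation of a results list as a list of URLs, and the use of exact URL matching are assumptions, since the study recorded ranks by inspecting the results pages.

```python
def first_rank(result_urls, page_url, cutoff=200):
    """Return the 1-based rank of the first hit for page_url.

    Only the first `cutoff` results are examined; a page that does not
    appear among them receives rank cutoff + 1 (201 with the default).
    """
    for position, url in enumerate(result_urls[:cutoff], start=1):
        if url == page_url:  # assumption: exact URL match identifies the page
            return position
    return cutoff + 1


# Hypothetical results list: the target page is the third hit.
results = ["http://example.org/a", "http://example.org/b", "http://example.org/target"]
print(first_rank(results, "http://example.org/target"))   # 3
print(first_rank(results, "http://example.org/missing"))  # 201
```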

In the next step, the web pages were randomly divided into two groups: an experimental group and a control group. This resulted in a total of 43 keywords in the experimental group and 39 keywords in the control group. The metadata elements were embedded into the web pages of the experimental group through HTML metatags. Figure 1 shows the metadata elements of one of the web pages.
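
A random division of this kind can be sketched in Python as follows. The function name `split_pages`, the fixed seed, and the equal split into two halves are assumptions made for illustration; the study states only that the pages were divided randomly.

```python
import random


def split_pages(page_ids, seed=None):
    """Randomly divide pages into an experimental and a control group.

    Assumption: the two groups are of (nearly) equal size.
    """
    rng = random.Random(seed)  # a seed makes the split reproducible
    shuffled = list(page_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


# The ten starred articles from Table 1.
pages = [5, 7, 8, 9, 10, 11, 12, 13, 14, 15]
experimental, control = split_pages(pages, seed=2005)
print(experimental, control)
```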

To confirm that the search engines had revisited the pages, a numeric character (the digit 1) was added to the beginning of the title of each page in the HTML title tag. Since search engines take the title of each page from this tag and show it to the user in the list of retrieved results, the appearance of the changed title (a title beginning with the digit 1) was taken as an indicator that the search engine had revisited the page through its continuous crawling and refreshing. It took 4 months for the search engines to revisit all pages; Google was the first and AltaVista the last to do so. From the perspective of a content provider, and given the highly dynamic nature of the Web, this suggests that Internet search engines should revisit web pages and update their databases more quickly.

Figure 1. Dublin Core elements in HTML metatags
<title>1 Theoretical and Experimental Investigation on Back-scattered Low Energy Gamma Radiation from Different Metals</title>
<META NAME="DC.Title" CONTENT="Theoretical and Experimental Investigation on Back-scattered Low Energy Gamma Radiation from Different Metals">
<META NAME="DC.Creator" CONTENT="A. Pazirandeh">
<META NAME="DC.Subject" CONTENT="Compton scattering, Rayleigh scattering, Double scattering, Albedo spectrum, Coherent and incoherent scattering, photo-electric effect">
<META NAME="DC.Contributor" CONTENT="N. Sobhkhiz">
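
Metatags in the style of Figure 1 can be generated rather than typed by hand. The sketch below uses Python's standard `html.escape` to protect quotation marks and ampersands in the values; the function name `dublin_core_meta` is hypothetical, and the output format simply mirrors the figure rather than any particular encoding specification.

```python
from html import escape


def dublin_core_meta(elements):
    """Build one <META> line per (element name, value) pair, as in Figure 1."""
    lines = []
    for name, value in elements:
        # quote=True also escapes double quotes so the attribute stays well-formed
        content = escape(value, quote=True)
        lines.append('<META NAME="DC.{}" CONTENT="{}">'.format(name, content))
    return "\n".join(lines)


print(dublin_core_meta([
    ("Creator", "A. Pazirandeh"),
    ("Contributor", "N. Sobhkhiz"),
]))
```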

The web pages were monitored to ensure that they did not change during the study; the only change was the metadata implementation in the experimental group. Once the search engines had revisited the pages, the searches were repeated with the same search terms and in exactly the same fashion, and the results were recorded for comparison. Appendices B and C give the first (R1) and second (R2) rankings of the keywords in the experimental and control groups. To determine the difference between the first and second ranks, the second rank of each keyword was subtracted from its first rank (R3 = R1 - R2); the differences are shown in the R3 column. Negative numbers indicate losing ground and positive numbers indicate gaining ground in the ranking of keywords.
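
The entries in Appendix B are consistent with R3 = R1 - R2: for example, a keyword ranked 40 before and 36 after metadata implementation has R3 = +4. A minimal Python sketch of this sign convention, with an illustrative function name, is:

```python
def rank_change(r1, r2):
    """R3 as used in Appendix B: positive means the page moved up the list."""
    return r1 - r2


# (R1, R2) pairs taken from Appendix B.
for r1, r2 in [(40, 36), (1, 4), (201, 1), (201, 201)]:
    print(r1, r2, "{:+d}".format(rank_change(r1, r2)))
```

Because a rank of 201 stands for "not retrieved", a change from 201 to 1 appears as +200, and a keyword that was never retrieved in either round contributes 0.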

Analysis of Findings

To determine whether the metadata elements affected the retrieval rank of the web pages, the rank differences (R3) of the keywords in the experimental and control groups were compared using the Mann-Whitney U test. The U statistic "is used to test the significance of differences in central tendency between independent groups when the scores are ranks or when ranks have been substituted for the original scores" (Willemsen, 1974, p.193). The test compares two sets of ranked scores and requires data of at least ordinal level. Since the results provided by search engines are considered ordinal level data (Turner & Brackbill, 1998, p.265), the Mann-Whitney U test can be used to determine the significance of the differences of ranks between the two independent groups in this study. The test was run with SPSS (Statistical Package for Social Sciences) software.
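
For readers without SPSS, the test can be sketched in plain Python. The code below computes the U statistic with average ranks for ties and a two-tailed P value from the normal approximation with tie correction and no continuity correction. It is an illustration of the technique under those assumptions, not a reproduction of the SPSS procedure, and its output may differ slightly from SPSS. The function names and the sample data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U, z, two-tailed p) using the tie-corrected normal approximation."""
    n1, n2 = len(sample_a), len(sample_b)
    combined = list(sample_a) + list(sample_b)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)  # report the smaller of the two U values

    n = n1 + n2
    tie_sizes = {}
    for value in combined:
        tie_sizes[value] = tie_sizes.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_sizes.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every score is identical: no evidence of a difference
        return u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return u, z, p


# Illustrative R3 values only; these are not the study's data.
experimental = [4, 0, -3, 21, 0, 68]
control = [0, -5, 2, 0, -20, 1]
print(mann_whitney_u(experimental, control))
```

Because U is taken as the smaller of the two possible values, z is never positive, which matches the sign of the Z values reported in the tables that follow.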

To answer the question of whether Dublin Core elements improve the retrieval rank of a web page, the R3 values of the keywords in the experimental and control groups were compared to determine whether the two groups differ significantly in the change of retrieval ranks before and after the metadata implementation. Tables 3 and 4 present the descriptive statistics and the Mann-Whitney U test statistics of the comparison.

Table 3. Descriptive Statistics of experimental and control groups
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 258 -187 200 9.736 52.74
Control 234 -200 200 9.974 51.05

Table 4. Mann-Whitney U Test Statistics of experimental and control groups
Mann-Whitney U 29437.000
Z -.530
Asymp. Sig. (2-tailed) .596

The significance level adopted for all tests is .01. That is, a difference of ranks between the control and experimental groups is treated as unlikely to be due to chance only if the reported P value is at or below .01 (1 per cent): if P ≤ .01, we can conclude that the difference between the two groups is statistically significant. As Table 4 indicates, the P value (P=.596) is greater than .01 (.596>.01). The R3 means of the web pages with metadata elements and those without metadata elements are 9.736 and 9.974 respectively (Table 3). Therefore, there is no statistically significant difference between the experimental and control groups with respect to their retrieval rank improvement. In other words, using Dublin Core elements (Title, Subject, Creator and Contributor) did not affect the retrieval rank of the web pages.

Is retrieval performance of the major search engines improved after embedding metadata into the web pages? To answer the second question of this study, each search engine was considered separately. As search engines use different algorithms for indexing and ranking the web pages, to determine the significance of the differences of ranks between experimental and control groups, the U test was run for each search engine.

Tables 5 and 6 show the statistical results for AlltheWeb. From Table 6, the difference between the two groups is not significant (P=.410 [>0.01]). In other words, there is no statistically significant difference between the experimental and control groups, and therefore the retrieval performance of AlltheWeb did not improve after metadata implementation. The R3 means for the web pages with metadata and those without metadata are -2.55 and -2.84 respectively (see Table 5). The web pages with metadata, therefore, did not achieve significantly better rankings than the web pages without metadata in AlltheWeb.

Table 5. Descriptive Statistics for AlltheWeb
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 43 -94 86 -2.55 27.44
Control 39 -94 74 -2.84 21.21

Table 6. Mann-Whitney U Test Statistics for AlltheWeb
Mann-Whitney U 785.500
Z -.825
Asymp. Sig. (2-tailed) .410

The statistical results for AltaVista are presented in Tables 7 and 8. The difference between the pages in the experimental and control groups with respect to their retrieval rankings is not statistically significant (P=.519 [>0.01]). This suggests that the rankings of the two groups do not differ substantially in central tendency. The R3 means for the experimental group and the control group are -2.51 and -7.10 respectively. Therefore, we cannot assume that the metadata implementation has enhanced the retrievability and ranking of web pages in AltaVista.

Table 7. Descriptive Statistics for AltaVista
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 43 -39 2 -2.51 7.31
Control 39 -198 2 -7.10 31.97

Table 8. Mann-Whitney U Test Statistics for AltaVista
Mann-Whitney U 776.50
Z -.644
Asymp. Sig. (2-tailed) .519

Tables 9 and 10 present the statistical results for Excite. The difference between the web pages in the experimental group (R3 Mean=27) and those in the control group (R3 Mean=32.38) is not statistically significant (P=.729 [>0.01]). This suggests that the rankings of the two groups do not differ substantially in central tendency. The web pages with metadata elements, therefore, did not achieve better performance than those without metadata elements with respect to their retrieval ranking in Excite.

Table 9. Descriptive Statistics for Excite
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 43 -187 200 27 85.04
Control 39 -9 200 32.38 69.26

Table 10. Mann-Whitney U Test Statistics for Excite
Mann-Whitney U 806
Z -.346
Asymp. Sig. (2-tailed) .729

Tables 11 and 12 show the statistical results for Google. From Table 12, the difference between the rankings of the web pages in the experimental group and the control group is not statistically significant (P=.188 [>0.01]). The R3 means for the web pages with metadata elements and those without metadata elements are 12.25 and 7.64 respectively. Therefore, we cannot assume that the metadata elements have caused better performance for the web pages with respect to their retrieval ranks in Google.

Table 11. Descriptive Statistics for Google
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 43 -90 189 12.25 32.71
Control 39 -21 159 7.64 29.33

Table 12. Mann-Whitney U Test Statistics for Google
Mann-Whitney U 707.50
Z -1.31
Asymp. Sig. (2-tailed) .188

The statistical results for Lycos are presented in Tables 13 and 14. The difference between the web pages in the experimental group (R3 Mean=1.69) and the control group (R3 Mean=-2.25) with respect to their retrieval rankings is not statistically significant (P=.434 [>0.01]). This suggests that the rankings of the two groups do not differ substantially in central tendency. Therefore, we cannot assume that the metadata implementation has enhanced the retrievability and ranking of web pages in Lycos.

Table 13. Descriptive Statistics for Lycos
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 43 -70 84 1.69 22.05
Control 39 -92 75 -2.25 20.81

Table 14. Mann-Whitney U Test Statistics for Lycos
Mann-Whitney U 762
Z -.783
Asymp. Sig. (2-tailed) .434

Tables 15 and 16 show the statistical results for WebCrawler. The difference of rankings between the web pages in the experimental group (R3 Mean=22.53) and the control group (R3 Mean=32.02) is not statistically significant (P=.547 [>0.01]). Metadata elements, therefore, have not caused better performance for the web pages with respect to their ranking and retrievability in WebCrawler.

Table 15. Descriptive Statistics for WebCrawler
Groups N Minimum Maximum R3 Mean Std. Deviation
Experimental 43 -187 200 22.53 81.08
Control 39 -200 200 32.02 82.31

Table 16. Mann-Whitney U Test Statistics for WebCrawler
Mann-Whitney U 781.50
Z -.602
Asymp. Sig. (2-tailed) .547

Table 17 summarizes the P values for the six search engines. The P value for every search engine is greater than .01, which indicates no statistically significant difference between the ranks of the pages with Dublin Core elements and the pages without Dublin Core elements. In other words, it is unreasonable to assume that the use of four Dublin Core metadata elements has led to an improvement in the retrieval and ranking of the web pages by the six search engines: AlltheWeb, AltaVista, Excite, Google, Lycos and WebCrawler.

Table 17. Mann-Whitney U Test Results for Search Engines
Search engine Significance level (P)
AlltheWeb .410
AltaVista .519
Excite .729
Google .188
Lycos .434
WebCrawler .547
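
The decision rule applied to Table 17 can be written out as a short Python check. The dictionary simply transcribes the P values from the table; the variable names are illustrative.

```python
ALPHA = 0.01  # significance level adopted in this study

# Two-tailed P values transcribed from Table 17.
P_VALUES = {
    "AlltheWeb": 0.410,
    "AltaVista": 0.519,
    "Excite": 0.729,
    "Google": 0.188,
    "Lycos": 0.434,
    "WebCrawler": 0.547,
}

# A difference counts as significant only when P is at or below ALPHA.
significant = [engine for engine, p in P_VALUES.items() if p <= ALPHA]
print(significant)  # an empty list: no engine shows a significant difference
```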

Conclusion

The current strategy of search engines, which indiscriminately harvest whatever they can find and then perform full-text indexing on that content, is unsustainable for resource discovery and generally results in low relevancy in retrieval. Therefore, providing descriptive data to impose some level of meta control on the content is necessary for the current Web to be more effective and efficient. One of the best solutions to web resource discovery is to embed descriptive metadata in web pages for harvesting by web indexing services. This has led to the development and maintenance of descriptive metadata schemas. Among the current metadata standards, Dublin Core has the potential to be adopted as an international standard for resource description and discovery on the Web and as a lingua franca for metadata.

The goal of this study was to determine whether Dublin Core implementation could improve web resource discovery via search engines. The effectiveness of four Dublin Core elements that concentrate on resource discovery was evaluated: Title, Subject, Creator, and Contributor. Two questions were considered: Do Dublin Core elements improve the retrieval rank of a web page? Is retrieval performance of the major search engines improved after embedding metadata into the web pages? To answer them, the articles published online by the Iranian International Journal of Science in the form of HTML pages (16 articles at the time of study) were used as test web pages and were submitted to 10 major search engines. Six of the search engines had each indexed the same 10 articles, and this combination was used for testing. Keywords extracted from the indexed web pages were searched in the 6 search engines and their retrieval ranks were recorded for later comparison. The web pages were divided into experimental and control groups, and the metadata elements were embedded into the web pages of the experimental group. After the search engines had revisited the pages through their continuous crawling and refreshing, the searches were repeated in exactly the same way and the results were recorded. The Mann-Whitney U test was employed to compare the results and examine the two questions.

Based on the statistical analysis discussed in the previous section, and regarding the first question of the present study, it was found that using Dublin Core elements did not improve the retrieval rank of the web pages. Mann-Whitney U test comparisons of the rankings of pages with metadata elements versus those without metadata elements did not reveal a statistically significant difference at the .01 level. The lack of a significant difference between the two groups of web pages indicates that the four Dublin Core elements did not affect the retrievability and ranking of the web pages and consequently are not an impact factor for resource discovery on the current Web. To answer the second question of the study, the retrieval performance of 6 search engines (AlltheWeb, AltaVista, Google, Lycos, Excite and WebCrawler) before and after metadata implementation was examined. The statistical analysis revealed that the difference of ranks between the pages with metadata and those without metadata was not significant in any search engine, and thus the retrieval performance of none of the search engines improved after metadata use. This suggests that Dublin Core, although a well-known metadata schema, is not widely used by search engine designers and that their spiders do not consider its elements when ranking web pages.

Resource discovery is impossible without resource description, and adequate resource description assures effective discovery (Dillon, 2001). It is believed that the greatest potential for improving resource discovery on the Web lies in the use of metadata. Undoubtedly, there is value in the current search engines as the main resource discovery tools on the Web, which operate without the aid of descriptive metadata. However, for them to be more effective and efficient, metadata has to matter, and they have to move beyond full-text indexing of the Web. Creating metadata schemas for web resources is essential, but not sufficient. For a metadata schema to be an impact factor in resource discovery, it has to be widely accepted and deployed in a systematic way, both by content providers and by web indexing services. As Lynch (2001, p.14) asks, if web indexing services do not use metadata, who will go to the expense and trouble of creating and maintaining it?

Acknowledgment

The author would like to express his special gratitude to Prof. Abbas Horri for his encouragement and assistance. The author would also like to thank Mr. Keyvan Salehi, who helped with the statistical analysis, and Mr. Darush Alimohammadi for his ongoing and useful debates on the subject.

References




Appendix A: The articles' titles of the Iranian International Journal of Science
No. Title
1 Some Production Comparisons of Two Celluloytic Fungi
2 Third Virial Coefficient and Compressibility Factors for Dense Spherical Gases Using the HFD-C Potential
3 Digenetic Studies, a Key to Reveal the Timing of Oil Migration: an Example from the Tirrawarra Sandston Reservoir, Southern Cooper Basin, Australia
4 Optimal Control of an Inhomogeneous Problem by Using Measure Theory
5 Probe Diagnostics of Confined Plasma Produced by 13.56 MHz R.F Plasma Source
6 Occurrence and Distribution of Aquatic saprolegniaceae in Northwest and South of Tehran
7 Effect of Pectic Acid and b-Glucan on Prolactin Secretion by Ovine Pituitary Explants
8 Deformational Behavior of Quartz and Feldspar in Quartzites within Shear Zones in the Adelaide Hills Area, South Australia
9 Construction of some Join Spaces Boolean Algebras
10 Characterization of Certain Infinitely Divisible Distributions
11 Theoretical and Experimental Investigation on Back-Scattered Low Energy Gamma Radiation from Different Metals
12 Cytogenetic Biomonitoring of Workers Occupationally Exposed to Aromatic Solvents
13 Notes on the Distribution, Climate and Flora of the Oil Field Areas, South-West of Iran
14 Isotopic Signature of the Diagenetic Fluids and Cement in the Tortachilla Limestone, South Australia
15 Correlating Marine Palynomorph Variations with Sequence Boundaries of Upper Jurassic Sediments in a Basin of Northern Switzerland
16 On Approximately Convex Functions

Appendix B: The rankings of the keywords in the experimental group
Each search engine cell lists R1 R2 R3.
Element | Keyword | AlltheWeb | AltaVista | Excite | Google | Lycos | WebCrawler
Title | Energy Gamma Radiation | 40 36 +4 | 29 29 0 | 201 201 0 | 178 136 +42 | 40 36 +4 | 201 201 0
Title | Metals | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Pectic Acid | 1 4 -3 | 6 7 -1 | 1 1 0 | 7 6 +1 | 1 4 -3 | 1 1 0
Title | b-Glucan | 5 9 -4 | 26 30 -4 | 15 30 -15 | 74 53 +21 | 5 9 -4 | 15 29 -14
Title | Prolactin Secretion | 3 23 -20 | 10 10 0 | 14 201 -187 | 160 92 +68 | 3 23 -20 | 14 201 -187
Title | Ovine Pituitary Explants | 2 1 +1 | 1 1 0 | 1 1 0 | 1 1 0 | 2 1 +1 | 1 1 0
Title | Deformational Behavior | 1 1 0 | 8 6 +2 | 201 1 +200 | 1 1 0 | 1 1 0 | 201 1 +200
Title | Quartz | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Feldspar | 201 201 0 | 201 201 0 | 201 201 0 | 201 153 +48 | 201 201 0 | 201 201 0
Title | Quartzites | 96 82 +14 | 162 201 -39 | 201 55 +146 | 201 12 +189 | 97 82 +15 | 201 57 +144
Title | Shear Zones | 201 201 0 | 165 187 -22 | 201 40 +161 | 201 201 0 | 201 201 0 | 201 30 +171
Title | Adelaide Hills Area | 101 15 +86 | 1 6 -5 | 201 39 +162 | 201 201 0 | 99 15 +84 | 201 45 +156
Title | South Australia | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Probe Diagnostics | 102 175 -73 | 11 10 +1 | 201 201 0 | 88 54 +34 | 127 175 +48 | 201 201 0
Title | Confined Plasma | 201 201 0 | 23 26 -3 | 201 201 0 | 166 103 +63 | 201 201 0 | 201 201 0
Title | R.F Plasma Source | 1 1 0 | 1 1 0 | 201 7 +194 | 59 44 +15 | 1 1 0 | 201 8 0
Title | Isotopic Signature | 100 194 -94 | 201 201 0 | 201 201 0 | 201 201 0 | 100 144 -44 | 201 201 0
Title | Diagenetic fluids | 69 10 +59 | 201 201 0 | 22 201 -179 | 55 29 +26 | 70 10 +60 | 22 201 -179
Title | Cement | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Tortachilla Limestone | 7 1 +6 | 201 201 0 | 1 1 0 | 4 3 +1 | 7 1 +6 | 1 1 0
Title | South Australia | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Compton scattering | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Reyleig scattering | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Double scattering | 82 160 -78 | 22 27 -5 | 201 57 +144 | 49 29 +20 | 83 153 -70 | 201 58 +143
Subject | Albedo spectrum | 12 7 +5 | 1 1 0 | 201 1 +200 | 10 5 +5 | 12 9 +3 | 201 1 +200
Subject | Coherent and incoherent scattering | 54 62 -8 | 18 26 -8 | 201 55 +146 | 39 38 +1 | 54 58 -4 | 201 55 +146
Subject | Photo-electric effect | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Plant extracts | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Intracrystalline deformation | 4 3 +1 | 3 3 0 | 6 10 -4 | 14 11 +3 | 4 3 +1 | 6 10 -4
Subject | R.F Plasma reactor | 3 8 -5 | 4 4 0 | 201 15 +186 | 27 25 +2 | 4 8 -4 | 201 15 +186
Subject | Stable isotopes | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Meteoric cement | 1 1 0 | 1 3 -2 | 5 2 +3 | 5 6 +1 | 1 1 0 | 5 2 +3
Subject | Diagenesis | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Isotopic composition | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Creator | A. Pazirandeh | 2 2 0 | 201 201 0 | 3 9 -6 | 201 201 0 | 2 2 0 | 3 9 -6
Creator | Houri Sepehri | 1 1 0 | 1 2 -1 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Creator | Ali Yassaghi | 2 3 -1 | 1 1 0 | 2 2 0 | 4 7 -3 | 2 3 -1 | 2 2 0
Creator | M. Khorassani | 1 2 -1 | 2 2 0 | 9 1 +8 | 1 1 0 | 1 2 -1 | 9 1 +8
Creator | Hossain Rahimpour-Bonab | 5 6 -1 | 4 5 -1 | 3 2 +1 | 2 2 0 | 5 6 -1 | 3 2 +1
Contributor | N. Sobhkhiz | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Contributor | Roya Zoraghi | 2 2 0 | 4 5 -1 | 3 2 +1 | 4 5 -1 | 2 2 0 | 2 1 +1
Contributor | Ali Haeri Rouhani | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Contributor | Yvonne Bone | 49 47 +2 | 7 26 -19 | 201 201 0 | 30 39 -9 | 50 47 +3 | 201 201 0

R1: The first retrieval rank
R2: The second retrieval rank
R3: The difference achieved (R1 minus R2)
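
As a minimal sketch of how the three values in each cell relate, the following Python assumes that R3 is simply R1 minus R2, so that a positive value means the page ranked higher in the second search. This agrees with entries such as "Pectic Acid" in AlltheWeb (ranks 1 and 4, difference -3). The function names are hypothetical, and the reading of 201 as "page not found within the ranks examined" is an assumption of this sketch, not something stated in the appendices.

```python
from typing import List, Tuple

# Assumption: 201 stands for "page not found within the ranks examined".
NOT_FOUND = 201


def rank_difference(r1: int, r2: int) -> int:
    """Return R3 = R1 - R2; positive means a better (lower) rank in the second search."""
    return r1 - r2


def format_cell(r1: int, r2: int) -> str:
    """Render one table cell as 'R1 R2 R3', with an explicit sign on non-zero differences."""
    diff = rank_difference(r1, r2)
    signed = "0" if diff == 0 else f"{diff:+d}"
    return f"{r1} {r2} {signed}"


def summarise(pairs: List[Tuple[int, int]]) -> Tuple[int, int, int]:
    """Count (improved, declined, unchanged) over a list of (R1, R2) pairs."""
    improved = sum(1 for r1, r2 in pairs if rank_difference(r1, r2) > 0)
    declined = sum(1 for r1, r2 in pairs if rank_difference(r1, r2) < 0)
    unchanged = len(pairs) - improved - declined
    return improved, declined, unchanged


if __name__ == "__main__":
    # Three illustrative (R1, R2) pairs: one improved, one declined, one never found.
    sample = [(40, 36), (1, 4), (NOT_FOUND, NOT_FOUND)]
    for r1, r2 in sample:
        print(format_cell(r1, r2))
    print(summarise(sample))
```

A tally like the one returned by `summarise` is only a descriptive summary; it does not replace the statistical comparison of the control and experimental groups reported in the paper.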

Appendix C: The rankings of the keywords in the control group
Each search engine cell lists R1 R2 R3.
Element | Keyword | AlltheWeb | AltaVista | Excite | Google | Lycos | WebCrawler
Title | Boolean Algebras | 201 201 0 | 74 76 -2 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Join Spaces | 1 1 0 | 5 5 0 | 201 8 +193 | 16 12 +4 | 2 1 +1 | 201 8 +193
Title | Infinitely Divisible Distributions | 28 21 +7 | 32 35 -3 | 201 38 +163 | 32 28 +4 | 28 21 +7 | 201 30 +171
Title | Cytogenetic Biomonitoring | 25 36 -11 | 2 5 -3 | 201 201 0 | 12 21 -9 | 25 32 -7 | 201 201 0
Title | Aromatic solvents | 201 201 0 | 22 20 +2 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Marine Palynomorph | 22 20 +2 | 2 1 +1 | 25 7 +18 | 3 4 -1 | 22 17 +5 | 25 7 +18
Title | Sequence boundaries | 201 201 0 | 40 38 +2 | 201 201 0 | 129 82 +47 | 201 201 0 | 201 201 0
Title | Upper Jurassic Sediments | 6 56 -50 | 3 201 -198 | 201 201 0 | 18 16 +2 | 6 52 -46 | 201 201 0
Title | Northern Switzerland | 107 201 -94 | 201 201 0 | 201 201 0 | 46 30 +16 | 109 201 -92 | 201 24 +177
Title | Climate | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Flora | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Title | Oil Field Areas | 81 7 +74 | 1 1 0 | 201 29 +172 | 28 10 +18 | 82 7 +75 | 201 29 +172
Title | South-West of Iran | 66 73 -7 | 9 8 +1 | 201 37 +164 | 201 42 +159 | 66 73 -7 | 201 38 +163
Subject | Hypergroup | 30 48 -18 | 10 8 +2 | 201 201 0 | 21 19 +2 | 32 48 -16 | 201 201 0
Subject | Algebraic hyperstructure | 1 1 0 | 1 1 0 | 2 1 +1 | 1 1 0 | 1 1 0 | 2 1 +1
Subject | Infinite divisibility | 201 201 0 | 70 75 -5 | 201 201 0 | 42 55 -13 | 201 201 0 | 1 201 -200
Subject | Strictly stable distributions | 2 2 0 | 1 1 0 | 201 2 +199 | 3 4 -1 | 2 2 0 | 201 2 +199
Subject | Cauchy distribution | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Normal distribution | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Characteristics function | 201 201 0 | 105 120 -15 | 201 201 0 | 199 123 +76 | 201 201 0 | 201 201 0
Subject | Unimodality | 201 201 0 | 48 79 -31 | 201 201 0 | 125 146 -21 | 201 201 0 | 201 201 0
Subject | Chromosomal Aberrations | 201 201 0 | 183 201 -18 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Lymphocytes | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Occupational Exposure | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Organic Solvents | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Dinoflagellates | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Rhodano-Swabian basin | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Subject | Floristic composition | 201 201 0 | 119 123 -4 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | Saharo-Sindian region | 3 4 -1 | 1 1 0 | 201 1 +200 | 2 1 +1 | 3 2 +1 | 201 1 +200
Subject | Plant geography | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0 | 201 201 0
Subject | SW. Iran | 47 53 -6 | 17 15 +2 | 201 50 +151 | 66 47 +19 | 47 51 -4 | 201 48 +153
Subject | Khuzistan | 23 22 +1 | 21 25 -4 | 201 201 0 | 34 40 -6 | 23 22 +1 | 201 201 0
Creator | Ali Reza Ashrafi | 5 6 -1 | 7 11 -4 | 7 16 -9 | 10 11 -1 | 5 6 -1 | 7 16 -9
Creator | M. Hossein Alamatsaz | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Creator | Hossein Mozdarani | 6 9 -3 | 5 5 0 | 4 5 -1 | 8 6 +2 | 6 9 -3 | 4 5 -1
Creator | Ebrahim Ghasemi-Nejad | 7 9 -3 | 3 2 +1 | 7 5 +2 | 2 2 0 | 7 8 -1 | 7 5 +2
Creator | Ebrahim Alaie | 1 1 0 | 1 2 -1 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Creator | Shirazeh Arghami | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0 | 1 1 0
Creator | A. Ghahreman | 1 2 -1 | 3 3 0 | 12 2 +10 | 3 3 0 | 1 2 -1 | 12 2 +10

R1: The first retrieval rank
R2: The second retrieval rank
R3: The difference achieved (R1 minus R2)


Bibliographic information of this paper for citing:

Safari, M. (2005). "Search Engines and Resource Discovery on the Web: Is Dublin Core an Impact Factor?" Webology, 2 (2), Article 13. Available at: http://www.webology.org/2005/v2n2/a13.html

Copyright © 2005, Mehdi Safari.