College and Research Libraries


Analysis of Retrieval Performance 
in Four Cross-Disciplinary Databases: 
Article1st, Faxon Finder, UnCover, 
and a Locally Mounted Database 
Scott Stebelman 

As an increasing number of cross-disciplinary databases become accessible over 
the Internet, librarians are presented with the dilemma of which to choose to 
support patron research. Several factors, such as cost, retrospective coverage, 
and document delivery, are usually considered in making a decision. However, 
one key factor, citation retrieval performance, is often overlooked because 
comparative data have been unavailable. A study of four cross-disciplinary 
databases was undertaken to provide those data. In addition to citation fre-
quency distribution, two other variables were examined: percentage of unique 
periodicals cited per search and relevancy of citations to stated search topic. An 
analysis of the data is provided, with its implication for database selection. 

The advent of commercial cross-disciplinary databases that can be
searched on the Internet has been welcomed and enthusiastically
promoted by librarians. These databases
are seen as important supplements
to traditional printed indexes, to special-
ized CD-ROM databases, and to other 
more expensive commercial systems, such 
as those produced by DIALOG and BRS. 
In some cases searches are free, while in 
others the library (or user) either pays an 
annual subscription fee or a fee for each 
search statement. If databases are ac-
cessed over the Internet, telecommuni-
cation charges are negligible. 

Several articles have been written
about the merits of, and user reactions to, one
system versus another, but no article has
been published that compares citation
retrieval rates for the different systems.1
This factor, however, is important to 
many researchers, who often need to lo-
cate as much literature as possible ger-
mane to their topics. 

To make such an assessment, three
popular and extensively marketed databases
were chosen: Article1st (a database
on OCLC FirstSearch), UnCover,
and Faxon Finder. A locally mounted
consortium database, called GENL, was
also included in the assessment. This database
comprises six Wilson databases:
Readers' Guide to Periodical
Literature, Business Periodicals Index, 
Humanities Index, Social Sciences In-
dex, General Science Index, and Index to 
Legal Periodical Literature. 

Scott Stebelman is a Humanities/Social Sciences Librarian at the Gelman Library, George Washington 
University, Washington, DC 20052; e-mail: scottlib@gwuvm.gwu.edu. The author wishes to thank the 
following librarians who assisted in the citation relevancy assessment: Daniel Barthell, Shmuel Ben-Gad, 
Deborah Bezanson, W. Chris Filstrup, Elizabeth Harter, Rebecca Jackson, James Kaser, Patricia Kelley, James
Kelly, Caroline Long, and Virginia MacEwen. Of course any error in data analysis is attributable to the author. 



METHODOLOGY 

Thirty subjects spanning a variety of 
disciplines were searched. These sub-
jects were chosen because they have 
been discussed frequently in the media, 
or because they have been common re-
search topics for students and faculty at 
the author's institution. The searches 
were conducted during a five-day period
in January 1994. Because GENL and
UnCover include references predating
1990, but Article1st and Faxon Finder do
not, the search period was restricted to 
1990-93. UnCover does not index book, 
motion picture, or music reviews, so 
these also were excluded. Newspaper 
articles and duplicate citations appear-
ing in the same search were also left out. 
To be consistent, searches were entered 
in the same manner for all databases; 
this provided advantages for Article1st, 
UnCover, and Faxon Finder, which auto-
matically "and" terms in bound phrases. 
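
The exclusion rules described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria (1990-93 only; no book, motion picture, or music reviews; no newspaper articles; no duplicates within a search). The record fields (title, journal, year, kind), the category labels, and the sample records are hypothetical placeholders, not the export format of any of the four databases.

```python
# Hypothetical category labels for the excluded material.
EXCLUDED_KINDS = {"book review", "film review", "music review", "newspaper"}


def filter_citations(records):
    """Apply the study's stated exclusion rules to one search's results."""
    seen = set()
    kept = []
    for rec in records:
        # Restrict the search period to 1990-93.
        if not 1990 <= rec["year"] <= 1993:
            continue
        # Drop reviews and newspaper articles.
        if rec["kind"] in EXCLUDED_KINDS:
            continue
        # Drop duplicate citations appearing in the same search.
        key = (rec["title"].lower(), rec["journal"].lower(), rec["year"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Invented placeholder records, for illustration only.
sample = [
    {"title": "Placeholder A", "journal": "Journal X", "year": 1992, "kind": "article"},
    {"title": "Placeholder A", "journal": "Journal X", "year": 1992, "kind": "article"},
    {"title": "Placeholder B", "journal": "Journal X", "year": 1989, "kind": "article"},
    {"title": "Placeholder R", "journal": "Journal Y", "year": 1991, "kind": "book review"},
    {"title": "Placeholder N", "journal": "Daily Z", "year": 1993, "kind": "newspaper"},
    {"title": "Placeholder C", "journal": "Journal Y", "year": 1990, "kind": "article"},
]

kept_titles = [rec["title"] for rec in filter_citations(sample)]
print(kept_titles)
```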

RESULTS 

Table 1 indicates the citation frequencies
for the thirty searches. Figure 1 illustrates
the differences. The range in citation
frequency distribution is considerable,
with the best performing database outpacing
the worst by a ratio of 3.3 to 1.

To place the citation counts in perspec-
tive, it is necessary to indicate the approxi-
mate numbers of periodicals indexed by 
each database at the time searches were 
conducted.2 

Article1st 8,500 
UnCover 14,000 
Faxon Finder 10,000 
GENL 2,200 
Hence, even though Article1st retrieved
only 57 percent of the number
retrieved by UnCover, it indexes only 61
percent as many journals as the latter. In
some cases the disparity in retrieval
count can be largely explained by the
disparity in numbers of journals indexed
by each database; however, this
correlation breaks down when GENL's
figures are examined. The number of periodicals
it indexes represents 6 percent
of the total, yet it retrieved 45 percent of
the total number of citations. Its nearest
rival, UnCover, which indexes 40 percent
of the total journals, retrieved only
24 percent of the total citations. The explanation
for GENL's superior performance
probably lies within its subject
indexing, a feature lacking in the other
three databases, and the frequency with
which abstracts are included with citations.3
This last feature is also included
in Article1st, and to a lesser degree in
UnCover, but is totally absent in Faxon
Finder. It should be noted that keyword
searches in Article1st omit the journal
title field; this would reduce its retrieval
capacity vis-a-vis the other databases.

FIGURE 1
Citation Frequencies
[Bar chart of total citations retrieved by Article1st, Faxon Finder, UnCover, and GENL; vertical axis from 0 to 800.]
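
The percentages quoted in this section can be recomputed from the Table 1 totals and the journal counts listed earlier. The short Python sketch below does so; the variable and function names are illustrative only, and the figures are the ones reported in the article.

```python
# Total citations (Table 1) and approximate journals indexed (from the text).
citations = {"Article1st": 219, "Faxon Finder": 293, "UnCover": 386, "GENL": 724}
journals = {"Article1st": 8500, "Faxon Finder": 10000, "UnCover": 14000, "GENL": 2200}


def share(counts, name):
    """Return one database's percentage of the four-database total."""
    return 100 * counts[name] / sum(counts.values())


def relative(counts, name, baseline):
    """Return one database's count as a percentage of another's."""
    return 100 * counts[name] / counts[baseline]


for name in citations:
    print(f"{name:13s} journals {share(journals, name):5.1f}%  "
          f"citations {share(citations, name):5.1f}%")

print(f"Article1st vs. UnCover: "
      f"{relative(citations, 'Article1st', 'UnCover'):.0f}% of citations, "
      f"{relative(journals, 'Article1st', 'UnCover'):.0f}% of journals")
```

Placing the two shares side by side makes the GENL anomaly easy to see: a small share of the journals indexed alongside the largest share of the citations retrieved.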

A chi-square analysis was made to 
determine whether the statistical differ-
ences among the databases were signifi-
cant. When GENL is included in the 
analysis, p < .01, df = 3. Because GENL is 
a unique composite database, reflecting 
the idiosyncratic choices of a local con-
sortium, and because its subject index-
ing provides it an intrinsic advantage 
over the other databases, a separate chi-
square analysis was made which omit-
ted GENL. A significance level of .05 was 
established, but the differences among 
the three databases did not meet this 
level. The null hypothesis, that the frequency
distributions are attributable to
chance, cannot be rejected.
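
The article does not state how the expected frequencies for the chi-square analysis were derived. The sketch below shows one possible reading, a goodness-of-fit test in which each database's expected citation count is proportional to the number of journals it indexes. That proportionality is an assumption made here for illustration and should not be taken as the author's procedure. The critical values are the standard tabled ones for the chi-square distribution.

```python
def chi_square_gof(observed, weights):
    """Chi-square goodness-of-fit statistic.

    Expected counts are the observed total distributed in proportion to
    `weights` (here, journals indexed; an assumption, see the lead-in).
    """
    total = sum(observed)
    weight_total = sum(weights)
    expected = [total * w / weight_total for w in weights]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Order: Article1st, Faxon Finder, UnCover, GENL.
observed = [219, 293, 386, 724]
journals = [8500, 10000, 14000, 2200]

CRITICAL_01_DF3 = 11.345  # chi-square critical value, alpha = .01, df = 3
CRITICAL_05_DF2 = 5.991   # chi-square critical value, alpha = .05, df = 2

chi2_all = chi_square_gof(observed, journals)
chi2_three = chi_square_gof(observed[:3], journals[:3])  # GENL omitted

print(f"with GENL:    chi2 = {chi2_all:.1f}, exceeds .01 critical value: "
      f"{chi2_all > CRITICAL_01_DF3}")
print(f"without GENL: chi2 = {chi2_three:.2f}, exceeds .05 critical value: "
      f"{chi2_three > CRITICAL_05_DF2}")
```

Under this assumption the statistic exceeds the .01 critical value when GENL is included and falls below the .05 critical value when it is omitted, which agrees with the outcome reported above. That agreement does not establish that the same expected frequencies were used in the study.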

UNIQUE PERIODICAL 
CITATION COUNTS 

In addition to citation frequency counts, 
unique periodical citation counts are 


564 College & Research Libraries November 1994 

TABLE 1
CITATION COUNT BY SEARCH TOPIC

Subject                                 Article1st  Faxon  UnCover  GENL

Abortion and sex education                       1      0        1    13
AIDS and Asia                                    6      6       11    19
Arms control and China                           3      5        4    18
Art and psychoanalysis                           3      6        5    23
Autobiography and women                          3     11        1    31
Capital punishment and juveniles                 1      1        0     5
Copyright and piracy                             9      6        8    56
Epic poetry                                      4      4        9    27
Fellini                                         14     18       10    12
France and terrorism                             0      1        4     7
Frank Lloyd Wright                              29     44       52    40
Free trade, protectionism, and Mexico            0      1        1    19
Gene therapy and ethics                          6      6        3    11
Humor and 19th century                           0      0        3     2
Hypertext and literature                         0      0        3     2
Ishmael Reed                                     1      6        3     8
Islam and fundamentalism                         5      5        3    40
Jackson Pollock                                  7     12        7    16
Leadership training                             48     25       92    19
New historicism                                 15     35       57    15
Nuclear plants and Russia                        2      0        2     0
Ontological argument                             8     12       14     8
Ozone layer and Antarctica                       1      3        1    13
Poetry and San Francisco                         0      2        2     2
Pornography and the First Amendment              0      2        5    21
Poverty and health                              40     63       54   181
Rap music and violence                           2      1        2    22
Suicide and drugs                                7      6        8    56
Transcendentalism                                3     11       16    32
Vietnam War fiction                              1      1        5     6
Total                                          219    293      386   724

provided. Unique periodical citations are
defined as the number of individual periodicals
cited in a given search; for example,
on the topic of "Copyright and
Piracy," Article1st retrieved nine citations,
seven of which were to different
periodicals. Unique periodical citation
count is viewed as an important factor in
the assessment for two reasons. First, if a
database cited more magazines than its competitors,
it would have an advantage (magazines
are published more frequently
than journals). Second, a popular, and
sometimes incorrect, inference is that a
database that cites a high number of references
on a subject is manifestly superior.
This may not be the case if the majority
of citations come from a few
sources. Conversely, a database that has
a lower citation frequency count nonetheless
may be a valuable resource to
scholars because it retrieves citations
from a greater variety of publications.
Table 2 displays the data. Chi-square
analysis established the differences to
be significant at p < .01. The anomaly
previously mentioned is disclosed in
these statistics. Although GENL had the
highest citation frequency and the highest
number of journals cited, it had the


TABLE 2
UNIQUE PERIODICAL CITATIONS

                  Total     Unique
Database          Cites     Journals Cited     %

Article1st          219                189    86
Faxon Finder        293                239    82
UnCover             386                302    78
GENL                724                467    65

TABLE 3
CITATION RELEVANCY FIGURES

                  Total Number     Total Number     %
Database          of Cites         Relevant         Relevant

Article1st                 219              136           62
Faxon Finder               293              199           68
UnCover                    386              215           56
GENL                       724              327           45

lowest percentage of unique periodicals 
within its searches. That means its sub-
ject descriptors are retrieving more cita-
tions but to fewer individual titles. 
Moreover, those databases retrieving
the fewest citations (Article1st
and Faxon Finder) have the highest ratio
of unique periodicals to total citations
retrieved. Finally, if GENL is omitted in 
the database comparison, an inverse cor-
relation exists between the number of 
journals indexed by a database and the 
percentage of unique periodicals cited. 
Article1st, indexing 8,500 journals, has 
the highest percentage, while UnCover, 
indexing 14,000, has the lowest. What 
these data suggest is that the numbers of 
journals covered do not necessarily pre-
dict unique journal citation strength. 
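
The Table 2 percentages, and the inverse pattern among the three commercial databases, can be checked with a short Python sketch. The figures come from Table 2 and the journal counts given earlier; the names used are illustrative.

```python
# (database, total cites, unique journals cited, journals indexed)
rows = [
    ("Article1st", 219, 189, 8500),
    ("Faxon Finder", 293, 239, 10000),
    ("UnCover", 386, 302, 14000),
    ("GENL", 724, 467, 2200),
]


def unique_pct(total_cites, unique_journals):
    """Unique journals cited as a percentage of total citations."""
    return 100 * unique_journals / total_cites


pct = {name: unique_pct(total, unique) for name, total, unique, _ in rows}

# Leave GENL out and order the remaining databases by journals indexed.
three = sorted((r for r in rows if r[0] != "GENL"), key=lambda r: r[3])
pcts_by_size = [pct[r[0]] for r in three]

# True when the percentage falls as the number of journals indexed rises.
inverse = all(a > b for a, b in zip(pcts_by_size, pcts_by_size[1:]))

for name, value in pct.items():
    print(f"{name:13s} {value:.0f}%")
print("inverse pattern among the three commercial databases:", inverse)
```

With only three databases in the comparison, the ordering is suggestive at most; the sketch simply confirms that the reported figures are consistent with the pattern described.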

One might argue that this analysis is 
beside the point, given that the raw fig-
ures indicate that the databases indexing 
the highest number of journals retrieved 
the highest number of unique periodical 
citations. However, defining database 
superiority is not so simple: if Database 
A, which indexes 10,000 journals, re-
trieves 200 unique journal citations, and 
Database B, which indexes 7,000 jour-
nals, retrieves 180 unique journal cita-
tions, can one necessarily assume that 
Database A outperformed Database B? 

Libraries may be more impressed with 
Database A for the sheer number of 
unique journals cited, but in terms of 
measuring the inherent tenacity of a 
database's retrieval performance (ex-
pressed as a ratio between the number 
of unique journals covered and the num-
ber of unique periodical citations re-
trieved), a cogent case could be made for 
Database B. In spite of this perspective,
however, the highest raw numbers will 
probably be compelling to most users, 
whose need for journal variety is often a 
paramount consideration. 
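
The Database A and Database B comparison can be worked through numerically. The sketch below expresses the ratio as unique journals cited per 1,000 journals indexed. That scaling is a choice made here for readability, since the text specifies only that the measure is a ratio between the two quantities.

```python
def unique_per_thousand(journals_indexed, unique_cited):
    """Unique journals cited per 1,000 journals indexed."""
    return 1000 * unique_cited / journals_indexed


# Hypothetical figures from the example in the text.
rate_a = unique_per_thousand(10_000, 200)
rate_b = unique_per_thousand(7_000, 180)

print(f"Database A: {rate_a:.1f} unique journals cited per 1,000 indexed")
print(f"Database B: {rate_b:.1f} unique journals cited per 1,000 indexed")
```

Database A wins on the raw count (200 against 180), while Database B wins on the ratio, which is the tension the paragraph describes.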

Another statistic judged to be useful 
was the number of unique journals cited 
in one database search and not cited by 
the other databases. This was thought 
valuable because libraries and users 
might want to know how rich a particu-
lar database might be in covering peri-
odicals not indexed by its competitors. 
If all four databases had the same 
searchable fields, such an analysis could
be undertaken; unfortunately, the search 
field discrepancies already noted pre-
cluded this. 

RELEVANCE 

Although a database might be suc-
cessful in retrieving high numbers of ci-
tations on a topic, it was uncertain how 
many of these were relevant. To make 
such a determination, twelve subject 
specialists at George Washington Uni-
versity evaluated the searches most 
closely congruent with their subject re-
sponsibilities. For example, the subject 
specialist in biology assessed citations 
retrieved from the "Gene Therapy and 
Ethics" search, and the subject specialist 
for art assessed those retrieved from the 
"Art and Psychoanalysis" search. As 
previously mentioned, those databases
including subject descriptors and abstracts
have an advantage over those databases
that do not. To control for these
differences, subject specialists were in-
structed to base their decisions exclu-
sively on the citation and to ignore 
additional fields. Judgments of rele-
vancy were determined by only one cri-
terion: Was the subject of the citation 
germane to the search topic? 

It must be stressed that because of the 
large number of search topics and cita-
tions retrieved, interscorer reliability 
was not established for the data. Given 
the inherently subjective nature of these 
judgments, the results must be viewed 
as suggestive rather than conclusive. 

Table 3 displays the data. An analysis of
variance established the differences
to be significant at p < .01. The
pattern of data parallels that of table 2.
GENL, highest in retrieval frequencies,
is also highest in the number of relevant
articles retrieved. However, the inferences
that can be drawn from these numbers
are equivocal: while a large number
of relevant articles were retrieved, this
number, as a ratio of the total number
of citations retrieved, was lowest among
the four databases. This suggests that in 
spite of the subject descriptors, some 
other field in the database is producing 
false drops. It might be assumed it is the 
abstract field: abstracts can generate 
higher numbers of irrelevant citations, 
because key words within an abstract 
may be separated by any number of sentences.4
However, Article1st has the second-highest
relevancy rate, yet it also
includes abstracts. More research is needed 
to explain this negative correlation. 
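
The relevancy percentages in Table 3 correspond to what the retrieval literature calls precision: relevant citations divided by total citations retrieved. The following Python sketch recomputes them from the Table 3 figures and contrasts the two rankings discussed above. Names are illustrative.

```python
# (total citations retrieved, citations judged relevant), from Table 3
table3 = {
    "Article1st": (219, 136),
    "Faxon Finder": (293, 199),
    "UnCover": (386, 215),
    "GENL": (724, 327),
}

# Precision: percentage of retrieved citations judged relevant.
precision = {name: 100 * rel / total for name, (total, rel) in table3.items()}

# Rank once by precision and once by the raw number of relevant citations.
by_precision = sorted(table3, key=lambda n: precision[n], reverse=True)
by_relevant = sorted(table3, key=lambda n: table3[n][1], reverse=True)

for name in by_precision:
    print(f"{name:13s} relevant {table3[name][1]:3d}  precision {precision[name]:.0f}%")
```

GENL sits first when databases are ranked by the number of relevant citations and last when they are ranked by precision, which is the equivocal result noted in the text.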

CONCLUSION 

The results of this study demonstrate
that a database that includes subject descriptors
and a large number of abstracts
can retrieve more citations
than a database that restricts searches to
the basic citation fields. Particularly
noteworthy was the fact that GENL outperformed
its competitors on this measure,
even though it indexed far fewer
periodicals, and that the citations it retrieved
yielded the highest number of
unique periodicals. Although GENL retrieved
the highest number of citations,
the percentage of its citations that were
judged relevant to the search topics was
lowest among the databases. Ironically,
those features that provided GENL with
a retrieval advantage (descriptors and
abstracts) may also have reduced precision.
Statistical analysis of multiple performance
measures reveals that none of
these databases is clearly superior.

In determining which database would 
be most advantageous for its patrons, 
libraries will probably consider other 
factors in addition to retrieval perform-
ance. For example, a database that yields 
a lower number of citations than its 
competitors might be more user-
friendly, and this factor may be weighed 
more heavily than others in making a 
final decision. A database might also be 
offered as part of a package by the ven-
dor that includes auxiliary databases 
critical to one's clientele. Cost, of course, 
is another factor: a database that can be 
searched freely over the Internet, such 
as UnCover, and that indexes unique 
periodicals not covered by the other
services, will be inherently attractive. 
Finally, the document delivery fea-
tures of a service will probably be an 
important criterion for selection: 
those services that provide a multitude 
of suppliers, or that allow orders to be 
transmitted directly to the interlibrary 
borrowing unit, will be more competitive 
than those that do not. 

As this study indicates, the identifica-
tion of a superior database is not always 
an easy process. Performance has many 
measures, yet statistics can play a large 
part in determining which database is an 
appropriate institutional choice. 


REFERENCES AND NOTES 

1. See Candace R. Benefiel and Steven Smith, "FirstSearch: A Survey of End-Users," OCLC
Micro 7 (Dec. 1991): 16-18; Katherine Fuller McKenzie, "FirstSearch in Virginia Libraries,"
Virginia Librarian 39 (Apr./June 1993): 21-23; Susan M. Riehm, "A First Look at FirstSearch,"
Online 16 (May 1992): 42-53; and Karen R. Snure, "The FirstSearch Experience at the Ohio
State University," Library Hi-Tech 9, no. 4 (1991): 25-52.

2. The GENL figure was derived by adding the periodicals listed in the Wilson paper editions. 
Figures for the other databases were given at the time of database logon or were indicated 
in the vendor's literature. 

3. A useful review of the strengths and weaknesses of free text versus controlled vocabulary 
searches can be found in C. P. R. Dubois, "Free Text vs. Controlled Vocabulary: A Reassess-
ment," Online Review 11, no. 4 (1987): 243-53. 

4. The increased recall and false drops that occur when the abstract field is searched have been
noted by Carol Tenopir, "Searching by Controlled Vocabulary or Free Text?," Library Journal 
112 (Nov. 15, 1987): 58. 
