An Analytical Index to the Internet: Dreams of Utopia

Carol Casey

Carol Casey is Head of the Catalog Department in Dupre Library at the University of Louisiana at Lafayette; e-mail: casey@mindancer.com.

All the current methods of accessing Internet resources fall short when it comes to locating discrete pieces of information and digital objects. One solution to this problem is to create analytical indexes to the Internet. This article explores the need for analytical indexes by looking at current Internet access, traditional bibliographic control, and Web site design. A discussion of some of the issues and problems concerning the development of these indexes, including the design and the resource selection process, emphasizes the impossibility of a comprehensive analytical index to the Internet. The creation of small, focused indexes may be the best solution for accessing specific types of digital information.

   Perfect bibliographic control would imply a complete record of the existence and location of every book, every document, every article, even every written thought. The probabilities of ever reaching such a utopia are remote.1

In 1967, Robert B. Downs and Frances B. Jenkins had a vision for information access that is admittedly impossible for a physical collection, but pocket utopias are possible in the digital environment in the form of analytical indexes. Unlike physical materials, digital resources possess properties that make detailed analysis, and the ability to access the exact location of these details, possible. But, as visitors to the World Wide Web discover, it is a frustrating task to find specific pieces of information or digital objects, despite teasing links that suggest they are only a Web site away. The fact is, “the information is at our fingertips; we simply lack the ability to get it when and where we need it.”2

More often than not, “a search for specific data can degenerate into an afternoon of dead ends, blind leads, and false drops, if not outright misinformation.”3 Staff members at Nashville State Tech Library lament that “even when our focus is fairly narrow and we choose our URLs carefully, we probably identify only one site worthy of inclusion for every 10 or more sites visited.”4 Add to this long page downloads, large images, and generally unnecessary plug-ins and special scripts, and finding information on the Web becomes too time-consuming and inefficient. Too often, the searcher gives up and goes to the library stacks, taking two steps at a time. Frequent visitors to the Web conclude that accessing digital information is not “as convenient for random access as a person using a book; turning back and forth between two pages for comparison (once one has found the two pages) is as fast as one cares to have it, and no command structure and windowing capability stands in the way.”5

Just as a library of uncataloged, haphazardly shelved books is of little use when searching for a specific book, the Web will never be a true research tool and resource until a means of directly accessing discrete pieces of information and digital objects is developed. It is clear that “the information highway will not magically sort itself out.”6 One solution is to rethink the way the Internet, especially the Web, is currently accessed.
Although many directions can be pursued, this discussion focuses on the concept of an analytical index to Internet resources. A closer look at the current means of access and approaches to bibliographic control is necessary to put the concept of an analytical index into perspective. The most prevalent means of access, if a researcher does not have the address for a specific resource, consists of keyword search engines, hierarchical subject indexes, special indexes, and compilations of links. Each method of access has a place in the digital environment, but none has the focus of even the simplest online library catalog to consistently lead a researcher to relevant resources.

Keyword Search Engines

Online helpers such as the keyword search engine AltaVista (http://www.altavista.com) are little more than word-spewing spiders that crawl through Web sites, weighing the content according to numerical rather than logical measurements. They are “undiscriminating, likely to rank a press release as highly as an investigative news report, and to include everything on a subject no matter how dim or deluded.”7 Limited to reading ASCII text, search engines cannot detect the text found in graphic formats. If sites with graphic-based text do not contain alternative text-only versions or substantial metadata statements, they are all but inaccessible via keyword search engines.

Search engine results are often like book index entries without page numbers and with a couple of digital twists. The keyword is on the page, but where it is and in what context cannot always be determined. It may be tucked into a metadata statement, visible only when viewing the page source. The statement may carry a summary for the entire site, not just the page accessed by the search engine, which increases the chances that the page may not contain the sought-after information. Sometimes the keyword is no longer on the page because search engines only sporadically revisit sites already indexed in the engine’s database. If the content of a page is changed or the page is removed from the Web, it may be weeks or months before it is re-indexed. Meanwhile, the engine’s database contains all the data for the page compiled by the robot on its last visit to the site.

Keyword search engines are analytical in the worst sense of the concept. They pull individual pages within a Web site out of context. Sometimes the context is completely lost because there are no links to the rest of the site. This happens when Web site designers approach the layout as if the site will always be accessed through the home page first and then rely on visitors to use the browser back button to return to previously viewed pages. Even librarians, focused on a single purpose for a site, occasionally overlook the importance of providing navigational links on each page of the site. Referring to setting up a legislative history Web page for an introductory class on using government documents, Elaine Hoffman stated: “By clicking on a particular title they would automatically link the user to that separate page. By pressing the left arrow key the user could return to the main Web page. In the future I plan to add ‘RETURN’ buttons to make the functions more apparent.”8 Disembodied Web pages are similar to random pages ripped from a book and then offered to a reader.
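To make the limitation concrete, consider a minimal Python sketch of how an indexing robot “sees” a page. The markup and word lists below are invented for illustration, and no actual engine works exactly this way; the principle, however, is the same: visible ASCII text and metadata statements are harvested, while words rendered inside a graphic are invisible unless alternative text is supplied.

```python
# A toy model of a search engine robot: it harvests visible text and
# metadata statements but cannot read words drawn inside a graphic.
from html.parser import HTMLParser

class RobotView(HTMLParser):
    """Collects the words a crawler can actually 'see' on a page."""
    def __init__(self):
        super().__init__()
        self.words = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Metadata keywords are indexed even though a visitor never sees them.
        if tag == "meta" and attrs.get("name") in ("keywords", "description"):
            self.words.extend(attrs.get("content", "").lower().split())
        # Text inside an image is recoverable only through its alt text.
        if tag == "img" and attrs.get("alt"):
            self.words.extend(attrs["alt"].lower().split())

    def handle_data(self, data):
        self.words.extend(data.lower().split())

page = """<html><head>
<meta name="keywords" content="gray whales migration">
</head><body>
<img src="banner.gif">
<p>Sightings reported weekly.</p>
</body></html>"""  # suppose the banner graphic reads "Gray Whale Watch"

robot = RobotView()
robot.feed(page)
print("gray" in robot.words)    # True: found only in the metadata statement
print("watch" in robot.words)   # False: locked inside the graphic
```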
Grasping the “world wide” part of the World Wide Web is a slow process. Unless mounted on an intranet, there is no such thing as a strictly local Web site. No matter the purpose of the site, Web designers should work on the premise that search engines are going to crawl through it and pull pages out of context for anyone in the world making the search. Another barrier to finding specific information when using keyword search engines is that “search engine developers have devised different, albeit at times only slightly varied, syntaxes for entering queries.”9 Keywords must be entered in specific ways to get the best-structured search results, and each engine has its own set of search parameters. Unfortunately, patrons “want the information now; they do not want to learn or read anything—they just want to sit down and use the system,”10 so they usually do not make the effort to learn how to use each search engine properly. Although keyword search engines are capable of bringing the researcher closer to a relevant Internet resource, at best, searching with them is like wandering through a subject area in the library stacks and thumbing through the most promising books. At worst, it is like sorting through the torn-out pages of books.

Hierarchical Subject Indexes

There is little doubt that “it takes human intelligence to compensate for the search engine’s limitations and turn technology into a functional research tool.”11 Indexes such as Yahoo! (http://www.yahoo.com) are subject guides to Internet resources presented in searchable hierarchical lists. The idea is that the searcher “works down into the hierarchies beginning at the top with the broadest category, and proceeds through the subcategories until arriving, it is hoped, at the goal—the resource that will deliver the sought-after information.”12 More important, the resources are manually selected and analyzed. But there is still a question of quality control: “since the people creating many of these subject indexes lack the librarians’ training in evaluation and description, materials may be included simply because they fall into the right subject category.”13 Unlike keyword search engines, these indexes cannot come close to global coverage of the Internet, so they try to present resources of as high a quality as possible.

If a keyword search engine database is comparable to a huge roadside junk shop, subject indexes are online equivalents of comprehensive bookstores. Subject indexes offer something for everyone, but the final decision on the quality and usefulness of the information resource is the responsibility of the visitor.

Yahoo! probably is the best-known and most emulated hierarchical subject index. The information about each resource in the Yahoo! database is collected and cataloged by humans rather than machines and is presented in a neat, uniform display, echoing bibliographic records. Just as a keyword search of an online library catalog reaches the different parts of a bibliographic record, a keyword search of the Yahoo! database encompasses the Web site title, the prepared commentary for the title, a list of keywords, and the URL. Unlike keyword search engine results, subject index hit lists point to the whole Web site rather than a page from it. Because of this, subject indexes suffer from the same lack of accessibility depth found in many library online catalogs. In libraries, the main unit of bibliographic currency is the package containing the intellectual work or works, and this concept is transferred to Internet resources.
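The parallel with a bibliographic record can be sketched in a few lines of Python. The field names and the sample record here are invented and are not Yahoo!’s actual schema; the point is only that a keyword search ranges over every field of the record, yet the hit it returns is the site’s top-level URL, with nothing said about the pages, images, or sounds inside.

```python
# An illustrative subject-index record, loosely echoing a bibliographic
# record; a keyword match anywhere in the record returns the whole site.
from dataclasses import dataclass, field

@dataclass
class SiteRecord:
    title: str
    commentary: str                 # prepared by a human cataloger
    keywords: list = field(default_factory=list)
    url: str = ""

    def matches(self, term: str) -> bool:
        searchable = " ".join([self.title, self.commentary, self.url] + self.keywords)
        return term.lower() in searchable.lower()

index = [
    SiteRecord(title="Gray Whale Watch",
               commentary="Weekly migration sightings along the Pacific coast.",
               keywords=["whales", "marine mammals"],
               url="http://example.org/whales/"),
]

# The hit list points to the whole Web site, not to any page within it.
print([rec.url for rec in index if rec.matches("migration")])
```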
Just as a collection of books arranged by subject is an improvement over a warehouse of unsorted books, subject indexes are a better choice than keyword search engines for quickly locating a Web site on a specific subject.

Special Indexes and Compilations of Links

Some special indexes, many compiled by librarians, offer Internet resources in an organized format, usually by subject. What sets these apart from an index such as Yahoo! is that the content of the resources is evaluated for quality and accuracy. An effort is made to choose Internet resources following the same criteria used to select other materials for the library collection because “collection development is just as important for electronic resources as for print—perhaps even more so, because of the difficulty of locating resources on the Internet or the Web.”14 Because they are not commercially driven, most of these indexes are known primarily to the intellectual community they serve and to a percentage of visitors who discover them while searching the Internet. A well-established, value-added index compiled by librarians is The Internet Scout Project (http://www.scout.cs.wisc.edu/scout/report/). This ambitious project tries to provide “Internet users, particularly those in higher education, with current, selective, and well-annotated pointers to the best that the Internet has to offer.”15 Most of this work is performed by professional librarians and subject experts, and the resources are cataloged using Library of Congress Subject Headings and Classification, plus Dublin Core elements. They are archived at The Scout Report Signpost (http://www.signpost.org/signpost/index.html) for quick access.

If hierarchical subject indexes such as Yahoo! are like bookstores, these value-added resource guides are close relatives of a library collection. The emphasis is still on the Web site as the basic unit of information packaging, but because the evaluator has the opportunity to examine the site first, the quality and accuracy of these resources have the potential of being uniformly higher than the quality and accuracy in a library collection. Most library materials are selected sight unseen, through blanket orders or based on reviews and recommendations.

The compilation of favorite or related links seems to have become a part of general Web site design. Because creating links is so easy, there is a certain lack of discrimination in linking to a site. Even on library Web sites, “there is a strong tendency to want to include links to as many remote resources in one’s Web collection as possible.”16 Some of these collections of links are of high quality; others are compiled haphazardly.
Although a nice, simple compilation of links on a specific topic (much like a subject bibliography) can be an extremely valuable resource, it is important to remember that “it only takes one casual link from a reputable site to lead into a morass of meaningless sites.”17

The curators of The Internet Scout Project understand that “fruitful information retrieval requires repositories of organized collections of information with indexing systems that ensure efficient retrieval.”18 Although these types of indexes come closer to this goal than keyword search engines and hierarchical subject indexes, the nature of Web sites themselves presents the greatest barrier to access.

Thumbing through Web Sites

Before the development of the graphical user interface (GUI), Internet resources were text files. This format worked well as long as graphics and sounds were not needed to illustrate the text. Because humans communicate and create in an assortment of visual and aural mediums, many of the ASCII-based Internet resources were limited and had to be supplemented by physical materials. The early Internet was never a threat to the book as a comparable, or even superior, form of presenting information. The Web may rival physical formats someday, but not until visitors have equal access to the optimal technology to handle the endless creative Web design and layout. Even then, the proper technology cannot overcome all the access problems because searching the Internet is still “a complicated job of sifting through massive amounts of information to find the few valuable nuggets.”19 Theoretically, it does not sound much different from thumbing through a range of library books. If a hit list is comparable to a range of books, the ratio between relevant and nonrelevant sites depends on how the Internet is searched. Books on library shelves are placed according to a logical, usually subject-based classification scheme. Keyword searches, even in hierarchical or special indexes, cannot guarantee any logic or even relevancy in on-the-spot compilations of links.

Problems with access are not gone after a promising Internet resource is found. With specific information goals in mind, researchers often want to take only a quick peek at a page. Books allow readers the luxury of flipping between a page of text and a color photographic plate without waiting for various elements to load and reload into a browser. Browsing aids in books, such as indexes and detailed tables of contents, lead to the pages containing the words or concepts the researcher is seeking, saving the effort of skimming every page.

Web sites, on the other hand, are generally not designed for quick glances or thumbing through for an overview. The unique “new qualities of the Web—graphics, sound, animation, and digitization—make it an exciting, revolutionary medium, but it is also ever-changing, chaotic and unorganized.”20 Most Web sites do not have detailed indexes that point to specific text, images, or sounds. This means that “the researcher is always looking at information through blinders, one Web page at a time. It’s fairly easy to get confused and off track if webmasters have not constructed their pages well.”21 Visitors may have to wait while images, long text passages, tables, and various plug-ins and scripts load into the browser just to take a quick peek at a page.
Often more time is spent watching pages download than actually finding the needed information. Because of the lack of logical layout in site design, the Web “was not designed to support the organized publication and retrieval of information, as libraries are.”22 Many Web site designers see the site only in terms of itself or the purpose for which it was created and forget that keyword search engines dissect sites and present bits and pieces of them in search results. Although a “common practice” is evolving for Web sites, there are still many older sites, and sites created by newcomers to the Web who inevitably make the same design errors. Until technology allows for quick access to all Web pages regardless of content and design, an alternative to the current holistic approach to Internet resources is the only way to isolate information so visitors can access it quickly.

Just as it is possible to compile satisfactory subject indexes to the Internet, an analytical index can be achieved. If The Scout Report Signpost is a close relative of a library collection, the analytical index resembles an encyclopedia. Because “users suffer less from lack of information than they do from information overload,” an encyclopedic approach may be of greater use for basic information needs.23

Cataloging’s Stepchild

   The best cataloging is a compromise somewhere between just putting books on shelves and reproducing the complete text in an indexed form as some hope to do in an online environment.24

Generally speaking, to access information from a book, the whole book must be in hand. This is not true for a Web site. Thus, “the bibliographic structure that guides researchers to the location of information in the print world simply has no analogy in the digital realm.”25 To utilize the unique characteristics of a Web site, the efforts made in subject and special indexes must be taken to the next level: locating and organizing the contents of these sites.

The analytic has always been the stepchild in the world of bibliographic control. In the ideal, traditional analytic catalog, “a record is made for every item, also for every document in multi-document items, and for every work.”26 Analytics also can include elements such as book chapters, audio tracks, and art portfolio leaves. Unfortunately, the bibliographic record representing a single title, series, or serial is the most common unit in library information access, and providing access to multiple works within these packages is a luxury that libraries selectively indulge in.

Analytics in card and online catalogs have always been problematic. According to Herbert H. Hoffman, “leaving the periodical literature aside, a library of 100,000 items might contain half a million works that are not easily accessible to patrons searching the online catalog.”27 The ability to access these hidden resources has never been fully tapped because bibliographic control has a long history of being considered a costly necessary evil for libraries. Anything that goes beyond the essentials of just getting the materials on the shelf is not considered worth the time and resources. Efforts to add access to inaccessible parts of a collection are often determined by factors beyond the cataloger’s control.
Janet Swan Hill asserted that “if the sheer bulk of cards to be filed … led to the development of arbitrary limitations to the number of names and subjects that would be traced for any one title, imagine how easy it was to reject any other practice—such as … cataloging the internal contents of works.”28 To fill the void, indexes and other types of reference works are compiled to provide access to some types of materials, such as monographic sets, series, and works in microformats, but these resources are useful only if patrons know they exist and how to use them.

Accessibility did not greatly improve with early online systems, which were little more than automated card catalogs. In a May 1994 message posted to the AUTOCAT listserv, Anaclare Evans stated: “The elegance of the linking fields and analytic records which MARC provides is contrasted by the inelegance of some local systems in handling these records.”29 When keyword searching was enabled in online catalogs, enhanced tables of contents became a popular selling point among bibliographic record vendors and helped provide access to works within works. Even with this increased accessibility, patrons still have to access the whole package to get to the parts because of the rigid characteristics of physical formats.

It is interesting that Hoffman’s 1998 study of the different types of analytics in a library online catalog concluded that “a search for a specific work will: 1. Retrieve all manifestations of that work, 2. Retrieve only that work, without false drops, 3. Not require a second pass, and 4. Clearly collocate all retrieved work titles.”30 These characteristics are what most people expect when searching the Internet. Unfortunately, the unfathomable size of the Internet, the uneven content, and the lack of optimal means to access discrete information make this impossible. In fact, it is reasonable to believe that “while there may have once been a common body of rules to govern the creation of catalogs, there is not, and is never likely to be, such a body of rules to govern the organization of the wide range of information available in electronic form.”31 The methods of access must be as on-the-fly as digital materials themselves and flexible enough to stay usable even as the digital environment evolves.

Rethinking Bibliographic Control

   We must ask ourselves what we want to see in the future of cataloging and what the acceptable norm of organization for the electronic world is if not cataloging as usual.32

Because Web sites are not like books, it is reasonable to believe that traditional approaches to bibliographic control need to be reevaluated for Internet resources. This is especially true when attempting a greater level of accessibility. Traditional cataloging is about creating a representation of physical objects in a standardized, concise manner that can be easily indexed in different ways. Catalog cards are distillations of physical objects, plus a location device, usually in the form of a call or accession number. Because cards cannot hop out of the drawer and lead patrons to the physical object, patrons must understand how to use the location device. Although there is much discussion of the unstable nature of the Web, there is probably as much chance of a Web site being at the other end of a link as of finding a book on the shelf where it’s supposed to be.
If a Web site is treated as a complete entity like a book, consistently structured bibliographic records are a workable form of access, with the added advantage that the location device points directly to the resource. Unlike with physical materials, the indexer or cataloger is not restricted to pointing to the whole Web site in order to provide access to a specific part of it. If the analytics are carefully chosen and the necessary context is described as needed, accessing the whole Web site is less important than taking the researcher directly to the informational goal. If the same kind of analytic indexing is applied to a physical collection, the context remains important only because the researcher must access the larger package before the sought-after information can be obtained.

In-depth summaries and descriptions are unnecessary when pointing to a digital object or a single piece of information. If the analytic points to a photograph of gray whales, the encyclopedic approach is to devise as many subject keywords as will lead patrons to the resource. It is just as quick for researchers to click on the link and go straight to the photograph as it is to read a description or summary of the entry first. This is the main difference between finding physical objects and digital objects. Because physical objects involve the step of having to locate the materials, a clear description of the resource is necessary to ensure that it is worth further investigation. In the digital environment, the resource is just a link away, and researchers have the luxury of flipping through potentially relevant Web pages. Elements from formats such as the Dublin Core Metadata Element Set should be provided with the entry, especially descriptions of the overall Web sites and warnings about elements that may cause download or access problems. In general, the subject keywords usually provide enough information for visitors to decide whether the resource is relevant to their information needs.

Another major difference between physical and digital materials is that catalogers are restricted to creating a single, logical classification to place physical materials in a specific physical location. Web sites are already “shelved” in cyberspace, and because a link to a URL can be labeled with text or graphics, there are no limitations to the number of different index entries pointing to a single URL.

The bibliographic information for digital analytics must be flexible enough for variable Web site designs and content presentations. Creating full Dublin Core metadata statements is appealing to catalogers used to working with cataloging standards, but the effort made for a book that will survive for hundreds of years is not worth it for a Web site that may not last a year and is running on technology that will quickly fade into obsolescence. Unlike books, “reading and understanding information in digital form requires equipment and software, both of which are changing constantly and may disappear completely from the market within a few years’ time.”33
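The contrast with the single call number can be made concrete in a short Python sketch. Everything here is invented for illustration (the URL, the photograph, the headings), and a real entry would also carry the Dublin Core-style elements described above; the point is simply that any number of index entries can be filed for one deep URL.

```python
# Illustrative only: an analytic index in which many subject keywords
# all resolve to the same deep URL -- here, one photograph within a site.
from collections import defaultdict

analytic_index = defaultdict(set)

def add_analytic(url, *subject_keywords):
    """File one index entry per keyword; all of them point to the same URL."""
    for kw in subject_keywords:
        analytic_index[kw.lower()].add(url)

add_analytic("http://example.org/whales/photos/gray03.jpg",
             "gray whales", "whales -- photographs",
             "marine mammals", "migration -- pacific coast")

# A searcher under any heading lands directly on the digital object,
# without entering the site through its home page first.
print(sorted(analytic_index["marine mammals"]))
```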
In rethinking bibliographic control for digital materials, the main unique factor to remember is that the bibliographic information runs on the same technology and in the same formats as the resources it accesses. Not only is the bibliographic information seamlessly linked to the rest of the Web, it is prone to the same instability as all the Internet resources. On the other hand, if an effort is made to capture the resources for an intranet or permanent storage, the means of access also can be captured. Through “emulation of technical environment—the use of systems which run in a new operating environment, but emulate (simulate) a previous, now obsolete environment,” the resources can be accessed and studied without having to re-catalog them.34 This idea of portability is very much in the spirit of an encyclopedia.

An Analytical Index

One of the ironies of an analytical index to the Internet is that the best format for the index is a Web site. The design of this Web site can have an impact on the usefulness and usability of the index. Factors such as ease of use, clear navigation, and straightforward instructions are important not only for patrons, but also for whoever maintains the site. Because it takes so much time and effort to find appropriate Internet resources and prepare the appropriate bibliographic access, a simple mechanism for quickly adding, changing, and deleting entries should be the centerpiece of the index. Various levels of interaction with visitors also can enhance the site. These features can range from simple e-mail forms for reporting problems and suggesting resources to allowing visitors to fully customize and personalize pages for compiling favorite links or saving search sets. The most effective index will remain dynamic and change as Internet resources and the supporting technology change and grow.

Probably the most difficult part of compiling an index of this type is selecting the resources. Theoretically, nearly all Web sites contain something that a visitor to an analytical index may be searching for. Besides the impossibility of sifting through one Web site at a time, gleaning every discrete bit of information and digital object, an attempt at complete comprehensiveness undermines the immediate need for such a resource. Like an encyclopedia, an analytical index should be a combination of a place to find quick, concise answers or digital objects and a jumping-off point for continued research.

The difference between the way printed encyclopedias are compiled and the way an analytical index to Internet resources is approached can be an impediment to finding the desired depth of information coverage. The information in encyclopedia entries comes from subject experts and printed sources in the fields covered. It is consolidated into concise, comprehensive presentations, often with accompanying bibliographies. Indexers of the Internet are collecting entries, not compiling them. This leaves researchers at the mercy of how the author of a Web site approaches a subject, unless the analytical index includes ways of linking related resources together and even enhancing each resource with added information. Instead of being daunted by the effort to add these extra values, it is important to keep in mind that “packaging and repackaging information for the usefulness of our many and varied publics has long been the forte of librarians and a specialty of librarianship.”35

Selecting resources within the context of the Web is impossible because the digital environment is too vast to get even an overview of its resources.
Although experts in different fields can be used to locate and evaluate potential index entries, this lack of overview makes it impossible to put the found information into perspective. Even the effort to keep up with new Internet resources is difficult to maintain, and “many of the librarian-created projects contain redundant resources, just as efforts by many libraries to establish links from their Web sites contain a similar core set of sites.”36

Imagine being let loose in the world’s most comprehensive bookstore and receiving instructions to collect and organize all the materials on all the disciplines covered in its collection development policy. The amount of material is overwhelming, as is the realization that it is impossible to gather, catalog, process, and house everything. Even a decision to choose only the “best” or most comprehensive resources does not make the task easier because it is impossible to look through all the resources before engaging in the selection process. Each choice is made based on the quality of the material itself and its relation to all the materials evaluated before it. It is a slow, inefficient process because many items will be reevaluated, chosen, and removed from the collection, possibly many times, before all the materials are sorted.

If the task of finding the highest-quality resource when all the choices are physically present is arduous, trying to grope blindly through invisible-until-retrieved Internet resources is impossible. The best an indexer can do is to evaluate each resource as it is found and not worry that six other, equally good resources in the index contain the same image or piece of information while other aspects of a subject are not covered at all. The question is whether to spend time agonizing over whether to include a resource or simply to add it to the index and spend that time looking for more resources. In an environment as unstable as the Internet, the redundant resources may be pulled from the Web within a week, and a half dozen resources on other aspects of the subject may just as suddenly pop up. The question of context is just as much a problem for a Web site in the digital universe as for a Web page in a Web site, except that there is no way to put the Web site completely into context.

Some aspects of the indexing can be automated. For instance, URLs can be automatically added to existing subject keyword lists, providing instant access. This can even be done at the same time the resource is found, by simply cutting and pasting the URL into an online form available to the indexer. Moreover, other elements can be included simultaneously in the description of the resource. Unlike with physical materials, there is no time lapse between cataloging and shelving. As soon as the resource is submitted to the analytic index database, it is available to the visitor. Conversely, if the content of the resource changes or the resource is removed from the Internet, the information can be deleted from the index as soon as any problems or changes are discovered.
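A hedged sketch of this no-lapse workflow, again in Python with invented names and URLs: the indexer’s form submission becomes searchable the moment it is filed, and a periodic sweep withdraws entries whose URLs no longer answer. A production index would need to be far more careful, distinguishing, for example, a moved page from a dead one.

```python
# Illustrative only: a toy analytic index where a form submission is
# "shelved" the instant it arrives, and a link sweep withdraws entries
# whose URLs have gone dead. All names and URLs are invented.
import urllib.request
import urllib.error

analytic_index = {}  # subject keyword -> set of URLs

def submit_entry(url, subject_keywords):
    """Handler for the indexer's online form: no lapse between
    cataloging and shelving -- the entry is searchable at once."""
    for kw in subject_keywords:
        analytic_index.setdefault(kw.lower(), set()).add(url)

def is_alive(url, timeout=10):
    """Coarse liveness test using an HTTP HEAD request."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout):
            return True
    except (urllib.error.URLError, ValueError):
        return False

def prune_dead_links():
    """Delete entries as soon as their resources disappear from the Web."""
    for keyword, urls in analytic_index.items():
        analytic_index[keyword] = {u for u in urls if is_alive(u)}

submit_entry("http://example.org/whales/photos/gray03.jpg",
             ["gray whales", "marine mammals"])
prune_dead_links()  # the invented URL will not answer, so it is withdrawn
```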
It is unlikely that an analytical index can be fully automated because “documents on the Web are not structured so that programs can reliably extract the routine information that a human indexer might find through a cursory inspection. Furthermore, the professional indexer can describe the components of individual pages of all sorts (from text to video) and can clarify how those parts fit together into a database of information.”37 The best solution is to find the perfect balance between humans and technology.

Pocket Utopias

The Internet, and the technology it runs on, is not going to remain suspended in time, waiting for librarians to devise a sensible means of organizing and accessing the resources available on it. Libraries and librarians have to be as dynamic and flexible as the Internet in order to utilize it in a way that is truly beneficial to patrons. Current methods of access to Internet resources have come a long way from “slow modems, pokey CPUs, arcane UNIX commands, perverse communications protocols, obscure ftp sites, a text-based interface and similar obstacles,”38 but they have not brought access to the consistent level we expect from basic online library catalogs. Approaching the resources from an analytical viewpoint has the potential of increasing the quality of access for certain types of information and digital objects because “as users are confronted with greater volumes of content, the maxim ‘less is more’ is increasingly applicable.”39

Notes

1. Robert B. Downs and Frances B. Jenkins, eds., Bibliography: Current State and Future Trends (Urbana, Ill.: Univ. of Illinois Pr., 1967).
2. Nick Gall, “Information Everywhere and Not a Drop to Drink,” Network Computing 9 (Apr. 1, 1998): 99.
3. Steven R. Harris, “Webliography: The Process of Building Internet Subject Access,” in Acquisitions and Collection Development in the Humanities, ed. Irene Owens (New York: Haworth Pr., 1997), 30.
4. James R. Veatch, “Insourcing the Web,” American Libraries 30 (Jan. 1999): 64–67.
5. Peter Suber, “The Database Paradox: Unlimited Information and the False Blessing of ‘Objectivity,’” Library Hi Tech 10, no. 4 (1992): 51–57.
6. Samuel Demas, Peter McDonald, and Gregory Lawrence, “The Internet and Collection Development: Mainstreaming Selection of Internet Resources,” Library Resources & Technical Services 39 (July 1995): 275–90.
7. “The Total Librarian,” Economist 340 (Sept. 14, 1996): 12.
8. Elaine Hoffman, “Untangling the Web of Legislative Histories: A Web Page for Bibliographic Instruction on the Legislative Process,” in The Challenge of Internet Literacy: The Instruction–Web Convergence, ed. Lyn Elizabeth M. Martin (New York: Haworth Pr., 1997), 120.
9. Nicholas G. Tomaiuolo and Joan G. Packer, “An Analysis of Internet Search Engines: Assessment of over 200 Search Queries,” Computers in Libraries 16 (June 1996): 58–62.
10. Verlene J. Harrington, “Way Beyond BI: A Look to the Future,” Journal of Academic Librarianship 24 (Sept. 1998): 381–86.
11. Mary Y. Chau, “Finding Order in a Chaotic World: A Model for Organized Research Using the World Wide Web,” in The Challenge of Internet Literacy: The Instruction–Web Convergence, ed. Lyn Elizabeth M. Martin (New York: Haworth Pr., 1997), 38.
12. David G. Dodd, “Grass-Roots Cataloging and Classification: Food for Thought from World Wide Web Subject-Oriented Hierarchical Lists,” Library Resources & Technical Services 40 (July 1996): 275–86.
13. Harris, “Webliography.”
14. Susan K. Martin and Don L. Bosseau, “Organizing Collections with the Internet: A Vision for Access,” Journal of Academic Librarianship 22 (July 1996): 291–92.
15. Susan Calcari, “The Internet Scout Project,” Library Hi Tech 15, no. 3–4 (1997): 11–18.
16. Stephen E. Toub, “Adding Value to Internet Collections,” Library Hi Tech 15 (1997): 148–54.
17. Walt Crawford, “The Card Catalog and Other Digital Controversies,” American Libraries 30 (Jan. 1999): 53–58.
18. David P. Goding, “The More Things Change, the More They Stay the Same: The Future of the Library and the Library Profession,” in Collection Development: Access in the Virtual Library, ed. Maureen Pastine (New York: Haworth Pr., 1997), 19.
19. Deanna B. Marcum, “Digital Libraries: For Whom? For What?” Journal of Academic Librarianship 23 (Mar. 1997): 81–84.
20. Judith M. Arnold and Elaine Anderson Jayne, “Dangling by a Slender Thread: The Lessons and Implications of Teaching the World Wide Web to Freshmen,” Journal of Academic Librarianship 24 (Jan. 1998): 43–52.
21. Goding, “The More Things Change, the More They Stay the Same.”
22. Clifford Lynch, “Searching the Internet,” Scientific American 276 (Mar. 1997): 52–56.
23. Marcum, “Digital Libraries.”
24. Heather S. Miller, “The Little Locksmith: A Cautionary Tale for the Electronic Age,” Journal of Academic Librarianship 23 (Mar. 1997): 100–107.
25. Eileen Hitchingham, “Collection Management in Light of Electronic Publishing,” Information Technology and Libraries 15 (Mar. 1996): 38–41.
26. Herbert H. Hoffman and Jeruel L. Magner, “Future Outlook: Better Retrieval through Analytic Catalogs,” Journal of Academic Librarianship 11 (July 1985): 151–53.
27. Herbert H. Hoffman, “Evaluation of Three Record Types for Component Works in Analytic Online Catalogs,” Library Resources & Technical Services 42 (Oct. 1998): 292–303.
28. Janet Swan Hill, “The Elephant in the Catalog: Cataloging Animals You Can’t See or Touch,” Cataloging & Classification Quarterly 23, no. 1 (1996): 5–25.
29. Anaclare Evans, “Re: Table of Contents,” AUTOCAT [online], May 23, 1994; available e-mail: listserv@listserv.acsu.buffalo.edu/Getpost autocat 16384 [Mar. 8, 1999].
30. Hoffman, “Evaluation of Three Record Types for Component Works in Analytic Online Catalogs.”
31. Norman D. Stevens, “Looking Back at Looking Ahead, or ‘The Catalog of the Future Revisited’ with Additional Speculation,” Information Technology and Libraries 17 (Dec. 1998): 188–90.
32. Ling Hewy Jeng, “A Converging Vision of Cataloging in the Electronic World,” Information Technology and Libraries 15 (Dec. 1995): 222–30.
33. Deanna B. Marcum, “The Preservation of Digital Information,” Journal of Academic Librarianship 22 (Nov. 1996): 451–54.
34. Ibid.
35. Paige G. Andrew and Linda R. Musser, “Collaborative Design of World Wide Web Pages: A Case Study,” Information Technology and Libraries 16 (Mar. 1997): 34–38.
36. Norman Oder, “Cataloging the Net: Can We Do It?” Library Journal 123 (Oct. 1, 1998): 47–51.
37. Lynch, “Searching the Internet.”
38. Goding, “The More Things Change, the More They Stay the Same.”
39. Toub, “Adding Value to Internet Collections.”