Issues in Science and Technology Librarianship
Spring 1998
DOI:10.5062/F44J0C3R

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

Selection Criteria for Web-Based Resources in a Science and Technology Library Collection

Robert B. McGeachin
Agriculture Reference Librarian
West Campus Library
Texas A&M University
College Station, TX 77843-5001
r-mcgeachin@tamu.edu

Abstract

This article discusses reasons to include Internet-based resources in a science and technology library collection. In addition to the normal collection development criteria for print materials, many additional factors unique to electronic resources are examined. These topics include: additional hardware and software requirements, user access regulation, funding the cost of Internet materials, user education and assistance, stability of Internet location, long term access, product licensing, cataloging of Internet materials, and delays in access due to heavy Internet traffic.

Introduction

The growing popularity of the Internet as a source for information resources leads to the need to establish and apply rational collection development criteria to the acquisition of Internet materials. There are several reasons to develop collections of Internet resources for libraries. Internet resources can be very convenient for users since materials are delivered quickly. They are delivered directly to a computer station without a patron having to collect materials from around the library building. When such materials can be delivered to a patron's home or office computer, convenience is increased even more, and is likely to result in greater user satisfaction.
If a library can provide access from the user's desktop, in effect patrons have year-round, 24-hour access to materials. Electronic media can often provide materials much faster than print publications. For example, journals and periodicals can release editions electronically that are available instantaneously, compared to the days it takes print to arrive by mail. Also some periodicals release their electronic versions before their printed versions, making them even more timely. For time-critical information this is of great benefit to the users. As more materials are provided on the Internet, users' expectations increase rapidly, often in excess of what the market or libraries can provide. When a library, university system, or consortium, acquires a web-based product, it essentially provides multiple copies since the product can be widely distributed to a very large customer base. If the system components are spread over a large geographical area, this can provide much easier remote access to the materials. Increasing interest in distance education seems to be a trend at many educational institutions, and as a result, the need to supply library materials to distant locations is growing. Providing Internet access to full text materials for remote users is one solution to this service issue. As many science and technology (S&T) libraries are either small branch or special libraries with limited shelf space available, the use of electronically stored materials that are usually remote to the site is a means of saving space. These space savings can be in the form of print shelf space or user space if patrons are remotely accessing the library's Internet resources. Economic forces influence some publishers to switch from print to Internet-based publishing. The perception that it is less costly to produce and distribute electronic materials instead of printed versions is making many federal and state agencies move from print to Internet-based publications. 
For example, many state Agricultural Experiment Station and Extension Service publications are now published only electronically and are accessible only on the Internet. The Government Printing Office is gradually moving to Internet versions of most of its publications. To retain access to some information sources, libraries have no choice but to provide Internet-based materials. But these cost savings to publishers do not usually result in comparable savings to libraries, as many sources are becoming subscription-based at rates comparable to or higher than print, and libraries must absorb the added cost of providing and maintaining computer hardware to access the Internet resources.

Selection Criteria

Librarians should use the same collection development criteria with Internet information as they traditionally have with print resources. The American Library Association Reference and User Services Association Collection Development and Evaluation Section's (ALA RUSA CODES) Collection Policy Committee (Fedunok 1997) has published a checklist of criteria for electronic information. That the scope of Internet materials fits the information needs of the library's clients is paramount. The material must be of the proper intellectual level, be accurate, and be presented in an easily accessible manner. It should have good intellectual coverage of the subject. The materials should be current and updated when necessary. Ideally, the electronic location should be stable to allow for easy continued access. The materials should be available for the amount of time that they are useful to the clientele. Internet-based materials that fit the needs of users of science and technology libraries include electronic journals and magazines, books and reference books, statistical sources, and databases. There are hundreds of S&T journals available electronically, most of them by way of the Internet. A few, published by scholarly societies, academic institutions, or government agencies, are free of charge.
But most are commercially published by for-profit or society publishers and are subscription-based. Some are free with a print subscription, but most cost an additional 7.5% to 25% above the print subscription price when the library also subscribes to the print edition. Choosing to subscribe to an Internet version alone usually saves only 20% or less (Anonymous 1997). As with any materials, libraries must examine the cost/benefit ratio, in this case factoring in the accessibility and convenience of Internet periodicals being delivered directly to the users. In addition to individual Internet journal subscriptions, a number of vendors and publishers make Internet subscription packages of journal titles available, such as EBSCOhost's Academic Search FullTEXT, Academic Press' IDEAL, and Elsevier's ScienceDirect. These commercial vendors may also act to aggregate journal collections, offering an interface to collections of other publishers. Potential advantages of such arrangements include discounted prices and greater ease of use, since the titles share a common interface. In addition to periodicals, a rapidly growing body of books and reference materials is becoming available over the Internet. Many standard reference titles that first appeared electronically as CD-ROMs have now been replaced by Internet versions. In the category of commercial reference titles adapted from print editions, most are subscription based. Publishers of subscription-based materials usually offer a free trial for a limited time or make available limited examples on the Internet for customers to examine. Some that are in the public domain are freely available. Another common means of commercial support is paid advertising within a web site, which supports free access for users. All the usual types of reference materials are available: dictionaries, thesauri, encyclopedias, directories, guides and handbooks.
The following is a list of examples:

Dictionary of Cell Biology {http://www.mblab.gla.ac.uk/~julian/Dict.html}
Roget's Thesaurus {http://www.roget.org/}
Encyclopaedia Britannica http://www.eb.com/
BigBook http://www.bigbook.com/
Handbook of Space Astronomy & Astrophysics {http://ads.harvard.edu/books/hsaa/}
Nutrient Requirements of Beef Cattle: Seventh Revised Edition {http://www.nap.edu/catalog.php?record_id=9791#toc}

Internet electronic materials have the advantage of supporting new information formats and new types of interaction with users. For example, a phone directory may include the ability to see a map of the location found and give driving instructions on how to get there from almost any place in the country. Another example is the ability to look at real-time weather data with maps showing the distribution and movement of clouds, rain, wind, and temperature. Multimedia encyclopedias and handbooks can include images, audio, and video to enhance and accompany the text. Many Internet chemical materials now include the option to view and interactively rotate and examine chemical compound images. Statistical materials are available on the Internet. Some examples include free, time-delayed stock tickers supported by advertising; agricultural production, trade and consumption figures from government or international agencies; and historical climate and weather data from government agencies. Access to databases of information may also be a valuable addition to the S&T library collection. One of the earliest and largest sources of freely available information over the Internet is the genetic sequence database. Genetic sequence databases for humans and many other organisms are used extensively by scientists in biotechnology and molecular biology research.
Long sequences of letters (T, C, A, and G) representing genetic codes, which make no obvious sense to human readers but which require accuracy in duplication, are managed best by electronic transmission of data files. Pages and pages of code could not be typeset or transcribed by humans without introducing copying errors. Thus electronic data file transmission was established as the standard mechanism for their storage and transmission. Originally set up as FTP sites, these genetic data banks have now migrated to web sites that let browser software handle the file transfers in a more user-friendly manner. Genetic map image files are also now a common feature at these sites. Many other types of scientific data such as chemical compound and safety information, consumer information, and engineering and agricultural extension data are freely available. Subscriptions to scientific bibliographic databases delivered via the Internet are the most popular commercial web application in S&T libraries.

Implementation Criteria

Unlike with print collections, librarians must consider whether their users are technologically equipped to use an Internet resource. Use of the Internet requires the additional costs of computer equipment and software as well as the ability to secure, maintain, and upgrade the equipment. Library staff training and user instruction also require additional resources. As most libraries have moved to online catalogs, some amount of computer equipment is already being used, but the Internet and its resources require personal computers (PCs) with client capabilities that may exceed the capacity of equipment that is already in use. The highly graphical and rapidly growing multimedia capabilities of scientific web sites may force S&T libraries to maintain their computing stations near the high end of the technology curve.
The need to limit access to subscription Internet materials to authorized, affiliated users requires some mechanism for valid user verification. One common security measure is to supply the commercial service with a list of acceptable Internet Protocol (IP) addresses that represent the domain of an institution. Some vendors also require logon with a user ID and password. Two possible methods to distribute logon identification to the library's users are to set up a gateway web page that lists the ID and password next to a link to the resource, or to set up a gateway web page with a link to the resource that contains an embedded CGI script that seamlessly transmits the ID and password for the users. In both of these cases an IP address check is still required to prevent unauthorized access. IP address checking can create a problem for users trying to access Internet materials from home if they use a commercial Internet Service Provider (ISP) that would give them a non-approved IP address. One method to solve this is for the library to set up a gateway proxy server that requires a separate valid user logon ID and password and then gives the user a valid temporary IP address.

Because of the uncertainties of how well current copyright laws will protect electronic products from unauthorized distribution that would cut into potential revenues, most publishers and vendors rely on contracts and product licenses to restrict and control access to their electronic products. In addition to distribution and access concerns, the need to delineate the pricing scheme, which may be very complex compared to a simple one-time purchase or typical subscription for print, to arrange for archival access, and to satisfy other new concerns unique to electronic resources encourages the use of product licensing. No standard for such licenses yet exists, and each product requires examination of its license and possible contract negotiation before acquiring a resource.
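The address-based verification described earlier in this section can be illustrated with a short sketch. It is not any vendor's actual mechanism: the function names `is_authorized` and `gateway_decision`, the network ranges (reserved documentation ranges), and the three outcome labels are all assumptions made for illustration. The sketch only shows the decision logic of an IP address check combined with a logon fallback through a library proxy.

```python
import ipaddress

# Hypothetical address ranges an institution might register with a vendor.
# These are reserved documentation ranges, not real campus networks.
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_authorized(client_ip, networks=CAMPUS_NETWORKS):
    """Return True if client_ip falls inside one of the approved networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected.
        return False
    return any(address in network for network in networks)


def gateway_decision(client_ip, credentials_valid=False):
    """Decide how a request reaches the licensed resource.

    'direct' : the address is already on the approved list
    'proxy'  : off-site user with a valid library logon, routed through
               a library proxy that has an approved address
    'deny'   : neither check succeeded
    """
    if is_authorized(client_ip):
        return "direct"
    if credentials_valid:
        return "proxy"
    return "deny"
```

Under these assumptions, a home user on a commercial ISP address is denied by the address check alone but is routed through the proxy once a valid library logon has been supplied.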
The librarian must determine that the terms of a license are acceptable with respect to limiting access for users or to permitted uses of the material such as printing or downloading. For example, if a resource is being "site licensed" for an institution, what is the vendor's definition of the institution's site? The license may define it as a certain geographic region in a particular city and a specified number of miles around that location. This may be acceptable for the majority of the users but would exclude most distance education users, users at remote institutional experimental facilities, or corporate users in another city. This instance might require negotiation of a new, broader site definition to include legitimate remote users. Contract negotiation is a new area of expertise for most librarians to develop. But in most institutions or companies, the librarians cannot make the final contract negotiations or sign contracts, as this is a process usually reserved for the institution's legal counsel. The librarians should participate in establishing workable terms by maintaining contact with the legal staff and apprising them of critical aspects on which to hold firm in the negotiations. This negotiation and licensing process can significantly delay acquisition of a resource.

Acquiring web browser client software is generally not a hurdle, as Netscape's Navigator or Communicator software has always been freely available to educational institutions such as academic and public libraries, and Microsoft's Internet Explorer software or NCSA's Mosaic software are free for anyone to download and use. As a result, most users either have basic client software or they can readily obtain it from a download site. Some Internet applications, especially those with interactive multimedia features such as chemical molecule viewing, require the use of "helper application" software which must be downloaded and installed.
This could require extra setup time for library computers and instruction for remote users on downloading and helper software installation. The advent of Java-capable browsers and application loading and setup on the fly by Java-based applications may alleviate this potential consideration. Internet materials can be mounted on either a local UNIX-based web server or on a remote server. The advantage of locally mounted materials is generally faster access times for local users, but this is offset by the expense to the library of the server hardware, software and their maintenance and administration. Problems may arise with Internet materials on remote servers due to access delays resulting from the amount of traffic on the remote server or on the intervening Internet trunk lines, but the library does not have local server costs. The cost of Internet based materials can be handled in a number of ways. To some degree the cost of print resources can be traded for the cost of replacement Internet materials. Especially in smaller corporate special and academic branch S&T libraries, the savings from the reduced need for additional shelf space can be significant. But, for most libraries, the overall cost of Internet resources results in a net increase in equipment and materials expenditures so that new sources of funds must be realized. In the academic environment, the implementation of library use fees, or an increase in existing fees, can be used to fund new Internet resources. If the student users can be educated about the advantages of Internet delivery of information resources such as availability of full text journal articles and convenience of remote access, then they may be less resistant to increases in fees. Another option is a fund raising campaign to set up an endowment for the increased electronic resources. 
This idea can be sold to alumni on the prestige of using new technology and to corporate donors on better preparing the graduates to effectively and efficiently contribute in the corporate world.

The challenge of user education and communication about Internet resources needs careful attention. Creating web-based help pages or frequently asked question (FAQ) pages may involve costs for the library. Most commercial subscription resources have some form of help pages, and their clarity and effectiveness should be among selection criteria. In some cases a library can adapt or link to help pages produced by another library that are superior to those of a vendor, saving on local development costs. Publishing the phone number of the library's reference desk to answer Internet questions is another means of user assistance. Putting an e-mail link on the library's web page to receive e-mail questions is also an option.

Long Term Access Criteria

The first access criterion for any Internet resource is the stability of its URL, including the ability to continue to find it over time. As web sites' host servers and the resulting address of materials tend to change over time on the Internet, the degree of address stability is an important consideration in material selection. The current volatile economic situation for commercial Internet resources means many sites either move, cease to exist, or languish unchanged, no longer current after their initial creation. The reputation of a commercial publisher or governmental agency and the associated stability of site addresses should be a factor in Internet material selection. Sites by individuals may be less reliable and frequently do not have much longevity unless they become a commercial enterprise for the individual. There are exceptions to this among web sites created by librarians and academic faculty.
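Address stability of the kind described above can be monitored with a short script. The following is a minimal sketch, not a full link checker: the function names `classify` and `check_url` and the three result labels are assumptions chosen for illustration, and it uses only Python's standard `urllib` modules. It reports whether an address still answers, has been redirected elsewhere, or has stopped responding.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def _normalize(url):
    """Reduce a URL to the parts compared when looking for a move."""
    parts = urlsplit(url)
    return (
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
    )


def classify(original_url, status, final_url):
    """Classify one check result as 'ok', 'moved', or 'broken'.

    status is the HTTP status code, or None if no response was received.
    final_url is the address reached after any redirects were followed.
    """
    if status is None or status >= 400:
        return "broken"
    if _normalize(final_url) != _normalize(original_url):
        return "moved"
    return "ok"


def check_url(url, timeout=10.0):
    """Request url and return (classification, final_url)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so geturl() is the final address.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            final_url = response.geturl()
            return classify(url, response.status, final_url), final_url
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return classify(url, err.code, url), url
    except (urllib.error.URLError, OSError):
        # No usable answer: DNS failure, refused connection, or timeout.
        return classify(url, None, url), url
```

A library could run `check_url` on a schedule over the addresses it maintains and have staff review anything reported as moved or broken. Some servers do not answer `HEAD` requests properly, so a result of "broken" is a prompt for a manual check rather than proof that a site is gone.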
Quality sites that move will leave a place-holding page at an old address with a notice of the change and a link to the current address for a period of time to allow users and libraries a chance to change their menus, bookmarks, and electronic catalog records. Collection of Internet materials requires periodic monitoring of site addresses due to their tendency to change. From the author's experience, checking addresses at least once every two weeks is a good rule of thumb. This requires library staff time, although link-checking software can partially automate the process.

The replacement of print materials by Internet access creates a longevity of access criterion. If materials are subscription based, what is the time period for access by the library and its users? With a print subscription, long term access is assumed, but for Internet access the time must be determined. For example, with a journal subscription, is access only for the period of the subscription or are perpetual access rights granted for any publishing time period when a subscription is in force? If "perpetual" access is guaranteed, how capable is the publisher of upholding this promise? One must examine just what long term access is being guaranteed and in what format. Given the rapidly changing nature of the Internet electronic environment, does the provider specify that its materials will all migrate to future forms of electronic accessibility? What economic incentive does a provider have to uphold such promises?

World Wide Wait

Unfortunately, at the present time, the Internet frequently becomes the "World Wide Wait" due to slow response time on busy servers or high volume traffic jams on network transmission lines or at hub intersections. At peak afternoon and evening usage hours some Internet resources become virtually unusable due to slow response time waiting to connect to overburdened servers or just waiting for the information packets to traverse jammed network lines.
One possible solution is to put a copy of frequently used materials on a local server. For large universities or consortia with many users, some vendors are quite happy to make arrangements for local data loading since it reduces the demand on their servers. The increased carrying capacity of the "Internet II" should aid in reducing delivery delays to the users of member institutions. Another course of action is to notify vendors when their servers are responding slowly due to high numbers of users and encourage them to add more server capacity. If vendors are unresponsive to such complaints and service remains poor, then libraries should take their business to another source. Some vendors are making more use of Java applications to decrease the amount of data they must transmit to users by supplying information and applications only when needed instead of having pages that are very load-intensive initially.

Cataloging Internet Materials

The addition of the 856 field to MARC records has provided a means to integrate Internet materials into the online catalog. Depending on the online catalog software in use, actual functional hyperlinks can be included in the catalog display, or, if the catalog is not Internet compatible, the URL can be listed in the catalog display. The use of Internet materials raises some questions about local cataloging practices. Will the library include remote sites of value but in no way connected to its institution, and, if so, who will monitor the addresses for continued accuracy? If a large collection of many titles is purchased from a single vendor with one interface and web address, will each title have a separate catalog entry in addition to the main entry for the compiling service? If both a print and Internet version of the same title are available, will this be reflected in a single catalog entry or will separate records be created for each format?
If periodicals are part of the library's Internet collection and the online catalog displays volume holdings for print periodicals, then should the catalog also display volume holdings for the Internet version?

Conclusions

Science and technology libraries should justify the inclusion of Internet resources in their collection with criteria as rigorous as for print materials. The same standards of accuracy, authority, coverage, currency and objectivity need to be applied to Internet resources as to any other medium. Before subscribing to an Internet information provider, thoroughly investigate their products, consult with peer libraries for evidence of satisfaction, and test the products under local conditions to determine potential problems. Carefully plan the implementation of new products, making sure institutional resources are sufficient to support them. Make sure that adequate user assistance is in place at the time of public implementation. If local web pages are being created as gateways to resources, test them fully from several locations on different types of computers and with different browsers to ensure that they behave and display as intended. After subscribing to Internet vendors, maintain a dialog with them, giving them both positive and negative feedback on service issues. When replacing print materials with Internet access, make plans for long term access as well.

Acknowledgements

The author wishes to thank Leila M. Payne and Jane A. Dodd for critical review of the manuscript.

References

Anonymous. 1997. Libraries Join Forces on Journal Prices. Science 278(5343):1558.
Fedunok, S. 1997. Hammurabi and the Electronic Age: Documenting Electronic Collection Decisions. RQ 36(1):86-90.