Evaluating Reference Service in a Large Academic Library

Cheryl Elzy, Alan Nourie, F. W. Lancaster, and Kurt M. Joseph

An unobtrusive study of the ability of professional librarians to deal with factual questions was conducted at the Milner Library, Illinois State University. Students were recruited to pose questions, for which answers were known, to 19 librarians in five departments. In all, 190 test "incidents" (10 questions for each of the 19 librarians) were used. Librarians were evaluated on the accuracy of the responses given and on their responsiveness and helpfulness, as judged by the student proxies. The methods used in the study are described, including the accuracy and attitude scales developed, the major results are presented, and suggestions are made on the follow-up action that seems appropriate after a study of this kind has been performed.

Cheryl Elzy is Head of the Education and Psychology Division, and Alan Nourie is Associate University Librarian for Public Services and Collection Development at Milner Library, Illinois State University. F. W. Lancaster is Professor at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801. Kurt M. Joseph is a Graduate Assistant in the Department of Psychology at Illinois State University, Normal, Illinois 61761. The study described was supported in part by the Council on Library Resources through its Cooperative Research Grant Program.

Several investigations have suggested that reference librarians may provide complete and correct answers to factual questions only about half the time.1 Concern over this disappointing performance, and questions about its applicability to reference service at Illinois State University, prompted this study.

Illinois State University (ISU), a multipurpose university of more than 22,000 students, offers 191 degree programs in 33 academic departments organized into five colleges. Master's degrees are offered in most areas and doctorates in nine. Milner Library, the central library facility, is organized into five subject divisions on six floors: Education/Psychology, General Reference and Information, Social Sciences/Business, Science/Government Publications, and Humanities/Special Collections. The five divisions are staffed by 20 members of the library faculty, 19 classified employees, and student assistants.

OBJECTIVES

The objectives of the study were (1) to estimate the probability that a user, walking into the library with a factual question, would receive or be led to a complete and correct answer, (2) to determine to what extent student users of the library judge staff members to be responsive and helpful, (3) to identify conditions under which members of the reference staff perform well and conditions under which they perform poorly, and thus (4) to identify ways in which the service might be improved. In other words, the focus of attention would be on the accuracy with which factual questions are answered by or with the aid of library staff.
While we would agree with such writers as Jo Bell Whitlatch that success in answering this type of question is not the only criterion by which an academic reference service should be judged, we have little sympathy with others who state or imply that accuracy is of little concern to library users, who are more concerned with such things as convenience, timeliness, and the librarian's attitude.2,3

The study was not performed as an academic exercise. No hypotheses were formulated. Our main concerns were to get a better idea of the quality of reference service at ISU and, in particular, to identify possible problem areas.

METHODS

The following decisions were made at the outset:

1. The evaluation should be performed unobtrusively. Questions of the type that might reasonably be put to the various departments of the Milner Library would be collected, and students recruited to pose them to members of the library faculty as though these questions represented their actual information needs.

2. Faculty members would be evaluated on attitudinal characteristics and on whether or not they were able to supply complete and correct answers.

3. The study would be conducted between April 17 and April 24, 1989, to allow sufficient time to recruit and hire students. It would occur during a peak time of activity in the academic calendar, midway between spring break and the end of the semester. Group and individual training sessions for student proxies would be held one week prior to the study. There would be one week of class and one week of finals during which proxies could pose their test questions if they somehow failed to do so during the test period.

Unobtrusive studies of reference service have been performed several times in the last 20 years, and they have been reviewed and evaluated elsewhere.4 The present study differs from most earlier ones in several ways:

1. In most unobtrusive studies the questions have been posed by telephone rather than by personal visit to the library.

2. Most studies have been performed in order to compare libraries, and perhaps to identify broad categories of factors that might influence reference performance, rather than to derive detailed data on a single library.

3. This study, involving 190 reference transactions, appears to be the largest unobtrusive study of reference service yet to be attempted within a single library.

The director of the library requested that a memorandum be sent to all public service librarians stating that the study would be performed some time during the coming year and that the results would not affect their annual performance evaluations. This was done in January 1989. A few individuals expressed reservations, but after they learned more about the study and its background and intent, their concerns were overcome.

An evaluation form was created for students to complete after each test question had been posed. It recorded the test questions and the answers given by librarians; it also provided space for observations on the attitude and demeanor of the librarians. Each item was followed by a request for open comments from the students.

It was decided that at least 10 questions should be posed to each librarian. More would be desirable, but probably would be too unwieldy. Fewer might not provide a true picture of that librarian's attitude and skills. Therefore, 190 incidents would be recorded by the students: 10 questions for each of 19 librarians.
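For readers who want to tabulate the data from a study of this kind, the information captured on the evaluation form and later coded for analysis can be pictured as a single record per test incident. The Python sketch below is illustrative only; the field names are our own and are not those of the actual Milner form.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class TestIncident:
    """One completed evaluation form: a single test question posed to a librarian.
    Field names are illustrative, not taken from the actual Milner form."""
    student_id: str                 # code assigned to the student proxy
    librarian_id: int               # 1 through 19, identified by desk nameplate
    floor: str                      # division/floor code, e.g. "A" through "E"
    question_id: int                # 1 through 58
    asked_at: datetime              # date and time the question was posed
    minutes_spent: Optional[float]  # minutes in contact with the librarian
    answer_given: str               # answer or directions recorded by the student
    source_cited: str               # title, edition, page number, and so on
    accuracy_code: Optional[int]    # 0-15 scale assigned later by the investigators
    attitude_ratings: dict = field(default_factory=dict)  # 24 items, each rated 0-10
    comments: str = ""              # open comments from the student

    def attitude_score(self) -> Optional[float]:
        """Mean of the 24 attitude ratings, the basis of librarian and floor scores."""
        if not self.attitude_ratings:
            return None
        return sum(self.attitude_ratings.values()) / len(self.attitude_ratings)
```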
Recruiting students for an unobtrusive study is difficult because researchers cannot simply advertise in the student newspaper or on the bulletin board and still keep the project confidential. Some contacts were made through professors known to the researchers at ISU and also at Illinois Wesleyan University across town. They were provided with a job announcement describing the project and the responsibilities required of students. Some students did apply from both institutions through these personal contacts. In the end, however, the best source for recruitment proved to be Milner Library's own records of students who had applied for positions but could not be hired for administrative or financial reasons. The person at Milner who is in charge of student employees provided files on 18 ISU students who might be suitable candidates, and these were contacted. Together with two students from Wesleyan, they formed the group of proxies. They were to be paid $75 each for approximately 10 to 15 hours of work, including attending a training session, being interviewed individually by the investigators, asking the questions, filling out the evaluation forms, and attending a follow-up, or debriefing, session.

All 20 students were undergraduates: 7 males and 13 females; 4 freshmen, 4 sophomores, 8 juniors, and 4 seniors. Ages ranged from 17 to 23. Although the students came from a wide variety of disciplines, no attempt was made to match reference questions to student majors.

The most difficult and time-consuming task was creating the questions to be used. The researchers put together an initial pool of hundreds of questions gathered from reference texts, other reference studies, and the investigators' own backgrounds and experiences.5 Most were rejected on the first reading. Approximately 150 different questions were selected for research in the Milner Library collection and evaluated for possible use. All questions that could not be answered from Milner's collection were eliminated. Those of a type or subject not usually asked in this library were also discarded. From those that remained, questions were matched by subject and specialty with appropriate floors and librarians, and 58 questions were finally selected. Most were asked more than once. For example, many questions were asked of each librarian on a particular floor because librarians are responsible not only for reference work in their own subject areas, but for all disciplines housed on their floor (e.g., the music librarian has to answer questions in music, fine arts, literature, and languages). The same question was asked on different floors where it was appropriate to do so. Through repetition of questions, 190 test incidents were created.

The set of questions used differed from that of most earlier studies in that it contained a blend of factual questions (e.g., Who was the secretary of state when Sumner Welles was his assistant?) and of questions that were more research oriented (e.g., I need some articles discussing the short story entitled "The Lottery"). Because all questions could be answered from the resources of the Milner Library, the study was really an evaluation of the librarians' ability to exploit the library's resources.
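The expansion of 58 questions into 190 test incidents amounts to crossing each question with the librarians for whom it is appropriate, capped at roughly 10 questions per librarian. The sketch below illustrates the idea with hypothetical data; it is not the assignment procedure actually used, which also had to balance proxies, days, and desk schedules (discussed next).

```python
# Hypothetical example data: each question is matched, by subject, with the
# librarians (numbered 1-19) for whom it is appropriate.
eligible_librarians = {
    1: [1, 2, 3],   # a question suitable for every librarian on one floor
    2: [4, 5],      # a question appropriate on two different floors
    # ... entries for the remaining questions ...
}

def build_incidents(eligible, per_librarian=10):
    """Expand question/librarian matches into individual test incidents,
    capping each librarian at roughly `per_librarian` questions."""
    counts = {}
    incidents = []
    for question_id, librarians in eligible.items():
        for librarian_id in librarians:
            if counts.get(librarian_id, 0) >= per_librarian:
                continue  # this librarian already has a full set
            counts[librarian_id] = counts.get(librarian_id, 0) + 1
            incidents.append((question_id, librarian_id))
    return incidents

incidents = build_incidents(eligible_librarians)
print(f"{len(incidents)} test incidents generated")
```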
The scheduling of when a student was to pose a question was critical because each student was to seek out a particular librarian, identified by desk nameplate. This proved difficult with so many people and questions involved. The researchers wanted to avoid a question being asked twice of the same librarian. Another goal was to spread the questions out evenly over mornings, afternoons, and evenings, and through the several days of the study. Librarians work the reference desks only at certain times, and students had to try to match their schedules with those of the librarians. The researchers also wanted to spread the student proxies out among the five floors and over the evaluation period so that they would not become too familiar to anyone on a given floor or in any time frame. The questions were laid out against the floors and the librarians' schedules, and 20 question sets of 8 to 10 questions each were created.

The week before the actual study, group training sessions were organized for the student proxies. The background for the project was discussed, as was how the study would be conducted and how the results would be used. Packets were given to each student with a schedule of when and to whom each question should be asked, along with a list of questions and an evaluation form for each. The forms were explained in great detail. Students were encouraged to complete the forms immediately after the encounter so that information and impressions would be fresh in their minds, and comments were strongly encouraged. Finally, their pay, the time frame, and other requirements were discussed. An individual training session was set up with each proxy over the next three days to assist each one in understanding the project, questions, and forms, and to help each construct a "cover story" for the questions in the event of an extended reference interview.

A number of unexpected problems occurred during the study itself, such as librarians being on vacation or out sick, or nameplates missing from desks (making identification difficult). But only about 14 questions had to be carried into a third week; all 190 forms were returned, and the students performed diligently. A follow-up session was held with the students a few days after the evaluation period. They were eager to share their thoughts and experiences. Comments ranged from how the time of day affected how librarians handled questions to the perception that the better the students dressed, the better the service received.

The work of verifying and grading answers, and of summarizing the mounds of data collected, continued through summer and fall. The investigators from Milner rechecked each answer on each floor and scored it. The attitudinal scores from each form were compiled, and comments on each librarian recorded. Each individual student, librarian, floor, and question was given a unique code, and the SPSS program was used to manipulate the data. Demographics on each student were entered, as well as information about each of the 190 incidents (e.g., time of day and date the question was asked, minutes spent with the librarian, and so on).

RESULTS

Each student was asked to record the details of the answer and search results provided by the librarian for each question posed and also to supply information on the source of the answer: title, edition, page number, and so on.
Also requested were the date and time of day the question was posed and how many minutes were spent in contact with the librarian. The researchers checked each answer in the 190 cases for completeness and correctness and noted sources used by the librarians. Each incident was assigned a code reflecting the relative success of the librarian in answering the question (see table 1).

Arriving at a viable scoring procedure was difficult. It is relatively easy to score the results of a question posed to a library by telephone. The library can be considered as simply a "black box" and the librarian scored on a binary scale: giving the correct answer or not. For a walk-in to an academic library, scoring is more complicated. It was decided to score from the viewpoint of the student user. Because we were evaluating the role of the library in answering questions rather than in bibliographic instruction, we decided that the best possible result was one in which the user was given the complete and correct answer. Anything less should receive a lower score. Being led to appropriate sources by the librarian was judged less satisfactory than receiving a correct answer, but being led to sources was judged better than being pointed to them. It was also thought that being led or pointed to several sources, one of which included the complete and correct answer, was less satisfactory than being led or pointed to only one source that contained the complete and correct answer. Finally, the worst result was one in which the user finished with an incorrect answer. These principles are reflected in the scoring method used.

TABLE 1
SCORING METHOD USED
                                                                            Points
Student provided with complete and correct answer                              15
Student led to a single source, which provided complete and correct answer     14
Student led to several sources, at least one of which provided complete
  and correct answer                                                           13
Student directed to a single source, which provided complete and correct
  answer                                                                       12
Student directed to several sources, at least one of which provided
  complete and correct answer                                                  11
Student given an appropriate referral to a specific person or source,
  which provided complete and correct answer                                   10
Student provided with partial answer                                            9
Student given an appropriate referral to the card catalog or another floor      8
Librarian did not find an answer or suggest an alternative source               5
Student given an inappropriate referral to catalog, floor, or source, or
  librarian unlikely to provide complete and correct answer                     3
Student given inappropriate sources                                             2
Student given incorrect answer                                                  0

Some findings from earlier investigations support the scoring method used in this study. Wyma J. Hood and Monte J. Gittings found that librarians accompanied users about 54% of the time when answering reference questions in one academic library.6 About 46% of the time, they showed them how to find the answer, and about 34% of the time, they found the answer for the user.
Thomas Childers, in an unobtrusive study of reference service in public libraries, found that when proxies were directed to library tools, such as indexes or catalogs, they were not accompanied by the librarian in more than half the cases.7 When directed to browse through shelves, books, chapters, or articles to find answers to their questions, they were not accompanied by the librarian in about 80% (129 out of 159) of the cases. Charles A. Bunge found that library users received correct answers only 47% of the time when merely directed to appropriate sources by a busy librarian. This improved to 59% when a busy librarian helped the users search and to 65% when a librarian not otherwise busy helped the patrons search.8

The scoring method we adopted served our purposes, but it can hardly be considered definitive, and other researchers might well disagree with it. Its major limitation is that, to avoid complications in data processing, a zero was assigned to the situation in which a student received an incorrect answer when, in fact, it would make much more sense to give this situation a minus score. If a zero is associated with an incorrect answer, it is necessary to give certain other unsatisfactory outcomes (e.g., an inappropriate referral) a positive value because they are considered somewhat less heinous, but this hardly seems logical. In fact, a more logical scoring procedure would probably assign a zero to the situation in which no answer was provided (which scored 5 in table 1) and would give minus values to inappropriate referrals and to incorrect answers.

The attitude scale, in contrast to the accuracy scale, was a very simple one. The students rated the librarians on their helpfulness and approachability on a continuous scale of 0 to 10 in answer to 24 questions (e.g., Looks approachable? Acknowledges user's approach to desk? Friendly attitude?). The attitude value for a librarian or a floor is merely the mean of all values recorded by the students for that librarian or floor for all 24 questions.

The scores for the 190 reference incidents are summarized in table 2. In 58 cases, the librarian received a score of 15, meaning that he or she provided a complete and correct answer. However, the philosophy of service in academic libraries often is to provide the appropriate sources or point the student in the right direction. If that level of service is accepted as adequate, then scores of 10 or above would be considered acceptable. This was true of 111 cases (58% of all incidents). If an appropriate referral to the card catalog or another floor is considered acceptable, the library's score increases to 121 cases (about 64% of all incidents). It goes to 128 cases (about 67% of the total) if partial answers are considered acceptable. In 18 of 190 cases (9.5%), the librarian could not or did not find an answer. In 36 instances (19%), inappropriate referrals and sources or outright wrong answers were given.

TABLE 2
ACCURACY OF ANSWERS PROVIDED
Answer Code    Frequency    Percent
15                 58         30.5
14                 24         12.6
13                 13          6.8
12                  5          2.6
11                  8          4.2
10                  3          1.6
 9                  7          3.7
 8                 10          5.3
 5                 18          9.5
 3                 10          5.3
 2                 16          8.4
 0                 10          5.3
Missing*            8          4.2
Total             190        100.0
* Some students failed to provide enough information upon which to base judgments, or asked the question in such a way as to change the expected response, thus invalidating the question.

The time of day the question was asked was coded to determine whether it seemed to affect the accuracy or the behavior of librarians. Accuracy decreased during evening hours, but the decrement was not statistically significant. Time of day also had no significant effect on how well the students perceived they had been treated (attitudinal scores).
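The cumulative figures quoted above follow directly from the frequency distribution in table 2. As a minimal illustration (the frequencies are those published; the threshold groupings and labels are ours), they can be recomputed as follows:

```python
# Frequencies of accuracy codes from table 2 (code: number of incidents).
# The 8 incidents with missing data are excluded from the distribution.
frequencies = {15: 58, 14: 24, 13: 13, 12: 5, 11: 8, 10: 3,
               9: 7, 8: 10, 5: 18, 3: 10, 2: 16, 0: 10}
TOTAL = 190  # all incidents, including the 8 scored as missing

def count_and_share(codes):
    """Number and percentage of the 190 incidents falling in the given codes."""
    count = sum(frequencies[c] for c in codes)
    return count, 100 * count / TOTAL

acceptable = [15, 14, 13, 12, 11, 10]               # answered, led, directed, or referred
with_catalog_referral = acceptable + [8]            # plus referral to catalog or another floor
with_partial_answers = with_catalog_referral + [9]  # plus partial answers

for label, codes in [("complete and correct answer", [15]),
                     ("score of 10 or above", acceptable),
                     ("plus catalog/floor referrals", with_catalog_referral),
                     ("plus partial answers", with_partial_answers)]:
    n, pct = count_and_share(codes)
    print(f"{label}: {n} cases ({pct:.0f}% of incidents)")
# Prints 58 (31%), 111 (58%), 121 (64%), and 128 (67%), matching the text.
```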
Table 3 shows how often each of the 58 questions was posed, the accuracy scores achieved (the mean score for questions posed more than once), and the mean time spent per question. Because of minor mix-ups, questions 7 and 17 were not asked. In a few other cases, students failed to record needed data, particularly minutes spent with the librarian, or sufficient information on answers to allow the investigators to verify them. The number of these cases is noted in parentheses in table 3.

TABLE 3
QUESTION BY QUESTION RESULTS
Question   Times Posed   Accuracy   Mean Minutes Spent on Question
1          2             12.0000    13.5
2          2             13.0000     5.0
3          2              7.5000     4.0
4          2(1)*         15.0000     3.25
5          2             14.0000     6.5
6          4(1)†          5.5000     6.0
7          not asked
8          2              8.0000     9.0
9          5             10.2000     4.2
10         2             14.0000     3.0
11         4              9.7500     4.2
12         4             13.2500     8.0
13         2(1)*         14.0000     5.0
14         2             15.0000     3.5
15         2             11.5000     3.0
16         5(1)†          5.2500     4.3
17         not asked
18         3             10.5000    15.0
19         5(1)†         10.4000     2.5
20         7              9.7143     8.63
21         4(1)†         13.0000     4.66
22         5              5.8000     6.4
23         7             10.1667     4.93
24         2              8.5000     2.0
25         1             11.0000     2.0
26         6             15.0000     6.66
27         3             11.3333     5.83
28         4              3.4000     4.0
29         4              7.2500     2.5
30         4              7.6667     9.0
31         1             15.0000    12.5
32         3              9.3333     5.6
33         2(2)*                     6.0
34         3             15.0000    11.66
35         2             15.0000     7.5
36         3             15.0000     9.3
37         3(1)*          5.0000     6.3
38         5             11.6000     6.9
39         2              5.0000    12.5
40         3              4.3333     5.3
41         4              7.5000     5.0
42         3             10.3333     7.3
43         3              2.3333     7.83
44         3              7.0000     5.66
45         3             14.3333     3.0
46         2              8.0000     5.25
47         3              6.3333     9.33
48         7             11.5714     9.86
49         8(1)*          7.1429     3.69
50         3             15.0000     4.83
51         6             12.3333     5.25
52         3             10.0000     3.1
53         3(1)*         14.0000     4.3
54         3              6.3333     6.83
55         7              9.5714     6.43
56         2             14.0000     7.5
57         3             12.0000     3.0
58                       14.0000     5.0
* Missing data in accuracy code.
† Missing data in minutes spent.

The frequency with which a question was asked ranged from 1 to 8 times. Mean accuracy ranged from a low of 2.3333 on question 43 to a perfect 15, with 8 questions scoring 15 and 7 scoring 14 on the accuracy scale. Minutes spent went from a low of 0 (the librarian did not leave his or her office, but conducted the transaction through the student assistant covering the desk) to a high of 25 minutes. Mean minutes spent went from 2 on questions 24 and 25 to 15 on question 18. There was no appreciable correlation between minutes spent with students and accuracy (r = .1008, n = 179, p > .09). The accuracy variable was then collapsed into three categories for further analysis (15 to 10 as acceptable, 9 to 6 as minimal, 5 to 0 as unacceptable), but there was still no relationship between minutes spent and accuracy (r = .0299, n = 179, p = .346). Time spent with the student did not reflect how complete and correct the answer was.
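A minimal sketch of this check, assuming per-incident records like those described under METHODS (the variable names are ours, and scipy here stands in for the SPSS routines actually used), might look like this:

```python
from scipy.stats import pearsonr

def collapse(code):
    """Collapse the 0-15 accuracy code into the three categories used in the
    study: 15-10 acceptable, 9-6 minimal, 5-0 unacceptable."""
    if code >= 10:
        return 2   # acceptable
    if code >= 6:
        return 1   # minimal
    return 0       # unacceptable

# Toy records standing in for the 179 incidents with both values recorded;
# each pair is (minutes spent with the librarian, accuracy code assigned).
incidents = [(4.0, 15), (2.5, 8), (7.0, 13), (1.0, 0), (12.0, 14), (3.0, 5)]

minutes = [m for m, _ in incidents]
raw_codes = [a for _, a in incidents]
collapsed = [collapse(a) for a in raw_codes]

r_raw, p_raw = pearsonr(minutes, raw_codes)
r_collapsed, p_collapsed = pearsonr(minutes, collapsed)
print(f"raw codes: r = {r_raw:.4f}, p = {p_raw:.3f}")
print(f"collapsed: r = {r_collapsed:.4f}, p = {p_collapsed:.3f}")
# The study itself found r = .1008 (raw) and r = .0299 (collapsed) over
# n = 179 incidents, neither of which is statistically significant.
```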
Table 4 shows the accuracy and attitude scores achieved by floors or divisions. Floor E achieved both the lowest accuracy figure and the lowest attitude scores. Floor C received the highest marks for attitude, while floor B was the most accurate. Table 5 records the attitude and accuracy scores by librarian, along with the number of questions each librarian was asked and the minutes the librarian spent with the students. Attitude scores ranged from 8.75 on a 10-point scale down to 5.74. Accuracy ranged from 13.889 out of 15 down to 7.125. While the librarian who scored highest in attitude also received the highest marks in accuracy, the same was not true of the lowest scores in each category. Both accuracy and attitudinal scores are discriminating. One individual, for example, scored a low 7.2222 on accuracy and a low 5.74 on attitude.

TABLE 4
ACCURACY AND ATTITUDE SCORES BY FLOOR
Floor    Questions    Accuracy    Attitude
A        30(3)*       10.4074     8.2100
B        30           12.7333     8.2067
C        20(2)*       11.7778     8.5200
D        71(2)*        9.6377     7.7141
E        39(1)*        8.1053     7.1256
Mean     190(8)*      10.1538     7.8342
* Missing data for accuracy scores.

TABLE 5
ACCURACY AND ATTITUDE SCORES FOR EACH LIBRARIAN
Librarian    Number of Questions Asked    Attitude    Accuracy    Mean Minutes Spent
1            10(1)*                       8.1900      10.3333     4.35
2            10                           7.0000       7.6000     5.45
3            10                           7.6300       7.5000     6.975
4             9(1)*                       7.6000       7.1250     5.65
5            10(1)*                       8.7500      13.8889     7.88
6            10                           8.2100      13.0000     4.85
7            10                           7.7200      11.8000     6.7
8            10                           8.2300      10.8000     6.3
9            10(1)*                       8.2900       9.6667     4.3
10           10                           7.8000       9.5000     7.6
11           10(1)*                       5.7400       7.2222     2.15
12           10(1)*                       7.3600      11.8889     3.95
13           10(1)*                       7.7800      11.2222     6.95
14           10                           7.8700       8.6000     8.05
15           10                           8.1800       9.7000     5.85
16           12                           7.0750       8.5833     4.75
17           10                           8.6900      13.4000     7.30
18            9                           8.2444      10.2222     8.05
19           10(1)*                       8.6600       9.6667     8.5
Mean         190(8)*                      7.8342      10.1538
* Missing data for accuracy scores.

Accuracy was found to be only minimally associated with attitudinal scores (r(182) = .2482, p < .0001). Answering a question correctly and completely was not a good predictor of how well the students in this study perceived they were being treated. Conversely, librarians who project positive images do not necessarily answer questions with the highest accuracy. Minutes spent with students apparently did affect the attitudinal scores the students assigned to the librarians. Librarians who spent 4 or more minutes with students tended to be assigned a higher attitude score than those who spent less time (F = 7.592, p < .00001).

It was thought that differences might be found between questions of a ready reference nature and those involving more extended research. Therefore, the 58 questions were divided into two groups based on the number of sources needed to find the complete and correct answers and the level of difficulty of the questions (as perceived by the investigators), to test the prediction that when ready reference questions are asked, the accuracy and attitudinal scores will be higher. However, the type of question affected neither attitudinal scores (t(186) = -.30, p = .768) nor accuracy (t(177) = 1.10, p = .271). The difficulty of the question had no significant effect on either accuracy or the students' attitude ratings.
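A comparable sketch of the ready-reference versus research-question comparison, again with hypothetical records and scipy standing in for SPSS, might run as follows:

```python
from scipy.stats import ttest_ind

# Toy incident records; the keys are our own, not those of the original data file.
incidents = [
    {"question_type": "ready", "accuracy": 15, "attitude": 8.2},
    {"question_type": "ready", "accuracy": 12, "attitude": 7.9},
    {"question_type": "ready", "accuracy": 8, "attitude": 6.5},
    {"question_type": "research", "accuracy": 13, "attitude": 8.0},
    {"question_type": "research", "accuracy": 5, "attitude": 7.1},
    {"question_type": "research", "accuracy": 10, "attitude": 7.6},
]

for measure in ("accuracy", "attitude"):
    ready = [i[measure] for i in incidents if i["question_type"] == "ready"]
    research = [i[measure] for i in incidents if i["question_type"] == "research"]
    t, p = ttest_ind(ready, research)  # independent-samples t-test on the two groups
    print(f"{measure}: t = {t:.2f}, p = {p:.3f}")

# The study found no significant difference for either measure:
# attitude t(186) = -.30, p = .768; accuracy t(177) = 1.10, p = .271.
```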
SIGNIFICANCE AND USE OF THE RESULTS

The study was intended as a practical one: to gain information and insights on how reference service at ISU might be improved. While the significance of certain relationships was examined statistically, no formal hypotheses were formulated or tested. This section of the paper, then, deals with the authors' perceptions of the value of the results to ISU and with how these results have been and are being used.

As full faculty, ISU librarians are evaluated each year for the distribution of merit dollars. Three areas of performance are scrutinized: (1) practice of librarianship (the equivalent of teaching performed by general faculty), (2) research and scholarly activity, and (3) service. Librarianship (the most heavily weighted component) may also be the most difficult to evaluate in many instances, especially for public service librarians. In evaluating reference activity, impressionistic anecdotes or testimonials from colleagues often replace more objective data. Teaching faculty have traditionally been subjected to regular student evaluations. In a similar fashion, unobtrusive evaluations, such as that reported here, furnish a comparable examination of reference performance from several perspectives, accuracy and deportment among them. Such evaluations allow the quality and character of reference service to be discussed and evaluated at a level more concrete than opinion, conjecture, or speculation.

In considering the results of this study, a consensus must first be reached on exactly what is an acceptable level of accuracy and of attitude. Is 70% accuracy acceptable? Is 50%? Is an attitude score of 7.8 on a 10-point scale what an institution should be aiming for or should tolerate? What level is unacceptable: 7, 6, 5? Is the fact that 15% of the questions were dealt with in less than two minutes significant? That 37% were dealt with in less than four minutes?

In making use of the results, the librarians involved should be made thoroughly familiar with the methodology of the project and the instrument used. Once the group recognizes that there very well could be problems in the level of service furnished, ideas on how to address them can be solicited, or presented, and discussed in an informal meeting. On one level, simply recognizing that one may be perceived in a certain way by a patron, or that two or three minutes may not be an appropriate amount of time to give all questions, or that one may have developed a tendency over the years to point students in the direction of sources rather than lead them, might be enough to solve the problem. With some librarians, the mere fact that they are reminded of possible problems or weaknesses in their performance may be enough to create a self-correcting situation. However, this will not always be the case, and other options should be explored, for example: (1) personal interviews for the librarians falling at the low end of the rating scales; (2) use of outside speakers to present a workshop on improving reference service and combating and reducing the effects of burnout; and (3) identification of the types of questions most likely to be dealt with inadequately.

From an unobtrusive study of the type described, improvement in reference service can be addressed at several levels: personal, divisional, and institutional. If warranted, personal conferences with the librarians can be conducted to discuss, for example, undesirable elements of service. This might be a tendency to use inappropriate reference sources, to conduct peripheral business at the reference desk, or to give an undesirable impression of one's approachability, friendliness, or willingness to help. At this personal level, one can simply run through the list of comments made by the surrogate users and discuss the individual questions with the librarians.
On the divisional or institutional level, the collective consciousness relating to reference service can be heightened by broad, nonconfrontational group discussion of patterns detected. Traditional assumptions and platitudes about the excellence of the service furnished can be challenged, and strengths and weaknesses pointed out. Librarians with an accuracy score below some selected level should be consulted privately. The pattern of time spent on questions may be worth discussing with some librarians (one librarian spent one minute or less on half the questions received and less than three minutes on 80% of them), as would the collection of comments made by observers (about 7 pages for each librarian).

The third level for discussion would occur at the divisional level. Here, if the assessment of performance showed real excellence, as it did in some instances, it can be commended and serve as a morale builder. If, on the other hand, undesirable trends were disclosed (e.g., reluctance to handle questions dealing with a certain collection located on the floor), they should be discussed and existing policy regarding them clarified or revised. One unfortunate aspect of providing anonymity in such a project is that, while the identities of the underachievers are protected, so too are the identities of the stars: the librarians whose performance is truly exemplary and who should be used as role models.

After conducting personal interviews, general and divisional meetings, and an in-house developmental institute, the library should implement a similar project, after an appropriate amount of time has passed, to determine what changes, if any, have occurred as a result of the evaluation process.

REFERENCES AND NOTES

1. Terence Crowley, "Half-Right Reference: Is It True?" RQ 25:59-68 (Fall 1985).
2. Jo Bell Whitlatch, "Unobtrusive Studies and the Quality of Academic Library Reference Services," College & Research Libraries 50:181-94 (Mar. 1989).
3. Duane E. Webster, "Examining the Broader Domain," Journal of Academic Librarianship 13:79-80 (May 1987); and Joan C. Durrance, "Reference Success: Does the 55 Percent Rule Tell the Whole Story?" Library Journal 114:31-36 (Apr. 15, 1989).
4. Ronald Rowe Powell, "Reference Effectiveness: A Review of Research," Library and Information Science Research 6:3-19 (Jan.-Mar. 1984); Crowley, "Half-Right Reference," p. 59-68; F. W. Lancaster, If You Want to Evaluate Your Library... (Champaign: Univ. of Illinois Graduate School of Library and Information Science, 1988); and Peter Hernon and Charles R. McClure, Unobtrusive Testing and Library Reference Services (Norwood, N.J.: Ablex, 1987).
5. Rolland E. Stevens and Donald G. Davis, Jr., Reference Books in the Social Sciences and Humanities, 4th ed. (Champaign, Ill.: Stipes, 1977); Thomas P. Slavens, Informational Interviews and Questions (Metuchen, N.J.: Scarecrow, 1978); Marcia J. Myers and Jassim M. Jirjees, The Accuracy of Telephone Reference/Information Services in Academic Libraries: Two Studies (Metuchen, N.J.: Scarecrow, 1983); Charles R. McClure and Peter Hernon, Improving the Quality of Reference Service for Government Publications (Chicago: American Library Assn., 1983); and Janine Schmidt, "Reference Performance in College Libraries," Australian Academic and Research Libraries 11:87-95 (June 1980).
6. Wyma Jane Hood and Monte James Gittings, Evaluation of Service at the General Reference Desk, University of Oregon Library (Eugene: Univ. of Oregon, 1975), ERIC Document Reproduction Service No. ED 110 038.
7. Thomas Childers, The Effectiveness of Information Service in Public Libraries: Suffolk County. Final Report (Philadelphia: Drexel Univ., School of Library and Information Science, 1978), passim.
8. Charles A. Bunge, "Factors Related to Reference Question Answering Success: The Development of a Data-Gathering Form," RQ 24:482-86 (Summer 1985).