College and Research Libraries


Research Notes 501 

16. Not found. This is the catch-all category to which an item is consigned whenever it cannot be lo-
cated and no reason is determined. Presumably some items have been lost or stolen, others are in 
transit or in use. After time has been allowed for an item to turn up, a decision is made whether to 
attempt to replace it. The gifts and exchanges unit may be able to obtain a free copy, or it may have 
to be purchased. 

Using Time-Series Regression to 
Predict Academic Library Circulations 

Terrence A. Brooks 

Four methods were used to forecast monthly 
circulation totals in 15 midwestern academic li-
braries. In a test of one-month forecasting accu-
racy, the dummy regression method, a sophis-
ticated forecasting method for cyclical data, 
exhibited the smallest average error. In a test of 
six-month forecasting accuracy, the monthly 
mean method, a naive forecasting method for 
cyclical data, exhibited the smallest average error. 
Straight-line predictive methods, both naive 
and sophisticated, had significantly greater 
error in both accuracy tests. A remaining research 
question is, Why do naive forecasting 
methods outperform more sophisticated fore-
casting methods with monthly library circula-
tion data in long-range forecasts? It is sug-
gested that high levels of randomness in 
library-output statistics inhibit the perfor-
mance of sophisticated forecasting methods. 

INTRODUCTION 

A time series is a chronological sequence 
of observations on a variable.1 An example 
from the field of librarianship of such a 
variable is circulation check-outs. Library 
circulation counts are commonly com-
piled on a daily basis and aggregated into 
monthly or semester reports. A series of 
these monthly counts is a chronological 
sequence of observations on the variable 
library circulation. Consequently, it is fair 
to conclude that the most typical type of 
statistical data libraries produce is time-series 
data. Library literature, however, 
reveals little awareness of the ways that 
time-series data can be used for forecasting 
and planning. 

Time-series regression techniques are 
regression procedures used to predict fu-
ture values of a time series. They are 
unique only in that they use past values of 
a time series to predict future values of the 
same time series. This paper reports the 
application of two types of time-series re-
gression to the problem of forecasting aca-
demic library circulation. 

FORECASTING 

"In library planning and decision-making, 
predictions are invariably required."2 
Despite Hamburg's statement, 
there has not been much theoretical work 
or practical application of forecasting 
methodologies to library statistics. This is 
in sharp contrast to the acceptance of fore-
casting in other disciplines. Forecasting, 
or trend analysis, is considered an integral 
part of scientific management and rational 
decision making. Makridakis and Wheelwright3 
describe forecasting as a tool that 
permits management to shield an organi-
zation from the vagaries of chance events 
and become more methodical in dealing 
with its environment. Like bureaucracies 
everywhere, academic libraries need tools 
that will enhance planning and rational 
decision making. 

Terrence A. Brooks is assistant professor at the School of Library and Information Science, University of Iowa, 
Iowa City, Iowa 52242. 

Filley and House4 would characterize 
most academic libraries today as third-stage 
growth organizations. Large and 
complex, these organizations have devel-
oped beyond the early rapid growth 
stages identified by Filley and House and 
now have become institutionalized with a 
corps of bureaucrats who plan, organize, 
direct, and control. Many academic librar-
ians are similarly charged with the tasks of 
planning, organizing, directing, and con-
trolling library operations. One tool to 
help accomplish these managerial tasks is 
forecasting. 

There are two forecasting studies in li-
brary literature worthy of note. The first is 
by Drake,5 who considered linear regression 
as a predictive technique. She concluded 
that straight-line trend projections 
are not the most efficient predictors in all 
library situations. The reason is that li-
brary data, especially circulation data, 
show monthly or seasonal fluctuations. 
Cyclicity may be one of the reasons that 
forecasting techniques have been slow to be 
applied to library statistics. 
Cyclicity in library-output statistics means 
that a variable such as monthly circulation 
fluctuates up and down throughout the 
academic year. Such cyclical data demand 
forecasting techniques that can model 
their seasonality. 

The most sophisticated forecasting 
study in library literature to date is by 
Kang.6 He forecasted the requests for interlibrary 
loan services received by the Illinois 
Research and Reference Centers from 
1971 through 1978 using several methods, 
including methods that can model cyclical 
data, and found regression to be the best 
predictive technique. He used a weighted 
regression formula that gave less predic-
tive value to older observations, and 
greater weight to the most recent ones. 
The generalizability of Kang's study is severely 
limited, however, because 
data from only one library were used. 

METHODOLOGY 

The purpose of this paper is to evaluate 
time-series regression forecasting methods 
with monthly academic library circulation 
totals. Time-series regression is a 
methodology that is new to library and information 
science, but has been used extensively 
in the social sciences, business, 
and economic literatures. 

November 1984 

Makridakis and Wheelwright7 give two 
versions of time-series regression approaches. 
The first time-series regression 
approach uses independent variables that 
are past values of the time series itself.8 An 
example of such an approach would be us-
ing the monthly circulation totals of sev-
eral months past as the predictor of next 
month's circulation total. This simply 
means that a library's circulation time se-
ries is regressed on itself at a certain time 
lag. There are two caveats with this tech-
nique. First, it produces a straight predic-
tion line and thus should suffer the same 
problem of poor fit that was noted by 
Drake. Second, it is a new application, 
meaning that the choice of time lag has not 
been studied sufficiently with academic li-
brary circulation data. Hence, the choice 
of any particular time lag is completely ar-
bitrary. 
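
The lagged approach described above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of each monthly total on the total observed a fixed number of months earlier; the function name and the sample figures are hypothetical and are not taken from the study.

```python
import numpy as np

def lagged_regression_forecast(series, lag):
    """Regress y[t] on y[t - lag] and forecast `lag` months past the last observation."""
    y = np.asarray(series, dtype=float)
    if len(y) < lag + 2:
        raise ValueError("need at least lag + 2 observations to fit a line")
    x_past = y[:-lag]   # predictor: the total observed `lag` months earlier
    y_now = y[lag:]     # response: the current monthly total
    slope, intercept = np.polyfit(x_past, y_now, 1)
    # The most recent observation is the predictor for the month `lag` steps ahead.
    return float(intercept + slope * y[-1])

# Hypothetical monthly circulation totals (illustrative only).
circulation = [100, 120, 140, 160, 180, 200]
print(lagged_regression_forecast(circulation, lag=1))
```

Because the fitted equation is a single straight line relating one month to an earlier month, it has no way to represent a different level for each calendar month, which is the limitation noted in the text.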

The second time-series regression 
method uses qualitative or dummy variables.9 
In the context of multiple regression, 
a dummy variable is a special independent 
variable that can take only a limited 
number of values such as 1 or 0. To 
use dummy regression for forecasting, 
some monthly totals of the time series are 
tagged by a 1, while other months of the 
year are given 0s. The result is a multiple 
regression equation that can model the 
seasonal patterns of library circulation to-
tals and should perform as a more efficient 
predictor than straight-line methods. 
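
A minimal sketch of dummy-variable regression follows. The exact specification used in the study is not stated, so this sketch assumes an intercept, a linear trend, and eleven month indicators (month 12 serving as the baseline); the function names and the sample series are hypothetical.

```python
import numpy as np

def design_matrix(t, months):
    """Columns: intercept, linear trend, and 11 month dummies (month 12 is the baseline)."""
    t = np.asarray(t, dtype=float)
    months = np.asarray(months)
    # Each dummy column holds 1 where the observation falls in that month, else 0.
    dummies = (months[:, None] == np.arange(1, 12)[None, :]).astype(float)
    return np.column_stack([np.ones_like(t), t, dummies])

def dummy_regression_forecast(series, start_month, steps_ahead):
    """Fit the seasonal regression and forecast `steps_ahead` months past the last observation."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y))
    months = (start_month - 1 + t) % 12 + 1
    coef, *_ = np.linalg.lstsq(design_matrix(t, months), y, rcond=None)
    t_new = len(y) - 1 + steps_ahead
    month_new = (start_month - 1 + t_new) % 12 + 1
    row = design_matrix([t_new], [month_new])
    return float((row @ coef)[0])

# Hypothetical series: three years of totals with a trend and a fixed seasonal pattern.
seasonal = [300, 250, 200, 150, 0, -200, -300, -250, 100, 350, 300, -100]
history = [1000 + 5 * t + seasonal[t % 12] for t in range(36)]
one_month = dummy_regression_forecast(history, start_month=1, steps_ahead=1)
six_month = dummy_regression_forecast(history, start_month=1, steps_ahead=6)
print(one_month, six_month)
```

With real circulation data the fit would not be exact, and a different choice of which months to tag would change the equation; the sketch only shows how the 1s and 0s let one regression carry a separate level for each month.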

To provide benchmarks for performance 
comparisons, two averaging methods 
were also used as forecasting methods. 
These averaging methods were used 
because they represent the most direct 
and naive approach that any academic li-
brarian could use for forecasting. For in-
stance, a future circulation total could sim-
ply be forecast from the average of all past 
values of the time series. Alternatively, a 
particular future monthly total could be 
forecast from an average of past values of 
that particular month. 
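
The two averaging benchmarks can be sketched in a few lines. The function names and the sample series are hypothetical; the series is assumed to begin in the calendar month given by start_month.

```python
def simple_mean_forecast(series):
    """Forecast any future month as the average of all past values."""
    return sum(series) / len(series)

def monthly_mean_forecast(series, start_month, target_month):
    """Forecast a given calendar month as the average of past values for that month."""
    same_month = [
        value
        for i, value in enumerate(series)
        if (start_month - 1 + i) % 12 + 1 == target_month
    ]
    return sum(same_month) / len(same_month)

# Hypothetical two-year series beginning in January (illustrative only).
totals = list(range(1, 25))
print(simple_mean_forecast(totals))                                   # average of all 24 values
print(monthly_mean_forecast(totals, start_month=1, target_month=1))   # average of the two Januarys
```

The simple mean gives the same forecast for every future month, while the monthly mean gives a separate forecast for each calendar month and so can follow a seasonal pattern.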

In all, four forecasting methodologies 
were used with Minitab,10 a statistical software 
program, and circulation data from 
several libraries: 

1. Dummy time-series regression was 
used to find an equation to predict one 
month and six months in advance for each 
library. This is a sophisticated forecasting 
method that can model cyclical data. 

2. Lagged time-series regression was 
used with each library's data lagged one 
month and lagged six months. The deci-
sion to use a one-month time lag and a six-
month time lag was arbitrary. This is a so-
phisticated forecasting method that 
makes straight-line predictions. It cannot 
model cyclical data. 

3. A simple average was made of each 
library's circulation totals to provide a 
straight-line benchmark for comparison 
purposes. This is a naive straight-line fore-
casting technique. 

4. A monthly average was computed 
for each library for one month and six 
months in advance. This provided a sea-
sonal benchmark for comparison pur-
poses. For instance, if January and June 
represent the forecasts for one and six 
months, then data from previous Januarys 
would be averaged to give a forecast for 
the month of January. Similarly, previous 
Junes would be averaged to give the June 
forecast. This is a naive forecasting 
method that can model cyclical data. 

DATA 

A random sample of fifteen academic libraries 
in the Midwest submitted monthly 
circulation data for analysis. The states of 
Illinois, Ohio, Michigan, and Missouri 
were each represented by three academic 
libraries, Iowa was represented by two academic 
libraries, and Minnesota by one 
academic library. The holdings of these fifteen 
libraries ranged from a maximum of 
four million book titles down to a minimum 
of two hundred thousand book titles. 

Ten libraries contributed time series of 
60 months' duration, three libraries contributed 
time series of 72 months' duration, 
one contributed 66 months, and one 
contributed a time series of 53 months. 
The most recent six months' data for each 
library were set aside to provide a basis for 
evaluating the performance of each of the 
four forecasting methods. Forecasts were 
made with each method for each of the fifteen 
libraries for one month and six 
months in advance. Each forecasted 
monthly total was then compared to the 
actual total reported by the library and an 
absolute percentage error (APE) was calculated. 
The average of the APE values for 
each forecasting method (the mean absolute 
percentage error) was then found. 

An accurate forecasting method would, 
relative to other methods, have a small 
mean absolute percentage error (MAPE) 
across the sample of the fifteen academic 
libraries. An analysis of variance was performed 
comparing the MAPEs to see if 
there was a statistically significant difference 
among the four forecasting methods. 

RESULTS 

Table 1 shows the results of forecasting 
one month in advance. The dummy regression 
method had the smallest MAPE 
followed by the monthly mean method. 
These methods are capable of modeling 
the seasonal patterns of academic library 
circulations. The two straight-line prediction 
methods followed with the largest 
MAPEs. 
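
The error measure behind these comparisons, the absolute percentage error averaged across libraries, can be sketched as follows. The function names and the sample figures are hypothetical and are not data from the study.

```python
def absolute_percentage_error(actual, forecast):
    """APE: the size of the forecast error as a percentage of the actual total."""
    return abs(actual - forecast) / abs(actual) * 100.0

def mean_absolute_percentage_error(actuals, forecasts):
    """MAPE: the average of the APE values, one per library."""
    apes = [absolute_percentage_error(a, f) for a, f in zip(actuals, forecasts)]
    return sum(apes) / len(apes)

# Hypothetical actual and forecast totals for two libraries (illustrative only).
actual_totals = [100.0, 200.0]
forecast_totals = [110.0, 180.0]
print(mean_absolute_percentage_error(actual_totals, forecast_totals))
```

Expressing each error as a percentage of the actual total keeps a large library from dominating the average, which matters when holdings range as widely as they do in this sample.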

TABLE 1 
ANALYSIS OF VARIANCE OF MEAN ABSOLUTE 
PERCENTAGE ERROR FOR ONE-MONTH FORECASTS 

                             MAPE      SD 
Methods               N      (%)       (%) 

Dummy regression      15     12.22     11.08 
Monthly mean          15     15.52     12.65 
Lag 1 Regression      15     26.26     15.45 
Simple mean           15     30.19     25.48 

Analysis of Variance 
Source     df     SS        MS       F 

Factor      3      3288     1096     3.74 (p = .0160) 
Error      56     16391      293 
Total      59     19679 



An analysis of variance (ANOVA) test 
on the difference among the MAPEs of the 
four methods proved to be statistically significant 
(p = 0.0160). Since the null hypothesis 
of no difference among the population 
MAPEs was rejected, a multiple 
comparison of the sample means was indicated. 
The Newman-Keuls procedure, as 
outlined by Meyer,11 was used. A significant 
difference (p < 0.05) was found between 
the MAPEs of the dummy regression 
and simple mean methods. There 
was insufficient evidence that any other 
pair of means differed significantly. 
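
The F ratio reported for the one-month forecasts is the factor mean square divided by the error mean square. The sketch below computes a one-way analysis of variance from scratch on hypothetical error values and then repeats the division using the sums of squares and degrees of freedom printed in Table 1. The function name and the sample groups are assumptions for illustration; the multiple-comparison step is not shown.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per method."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: spread of the method means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups sum of squares: spread of each value around its own method mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical percentage errors for three methods (illustrative only).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)

# The same ratio from the Table 1 entries: (SS factor / df) / (SS error / df).
f_table1 = (3288 / 3) / (16391 / 56)
print(round(f_table1, 2))
```

With four methods and fifteen libraries each, the degrees of freedom are 3 and 56, matching the table.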

Table 2 shows the results of forecasting 
six months in advance. Dummy regression 
and the monthly mean methods, the 
two techniques that can model the seasonal 
patterns of academic library circulations, 
performed better than the straight-line 
methods. But the relative positions of 
the techniques have changed: the averaging 
methods now outperformed regression 
methods in both the cyclical and 
straight-line cases. 

An ANOVA test on the difference 
among these MAPEs proved to be statistically 
significant (p = 0.0166). The 
Newman-Keuls procedure showed that 
the monthly mean method had a significantly 
(p < 0.05) lower MAPE than the 
other three methods. 

DISCUSSION 

The results of this study show the superiority 
of forecasting methods that can 
model the cyclicity of academic library statistics. 
In a test of one-month accuracy, the 
sophisticated dummy regression method 
was superior. In a test of six-month accuracy, 
the naive monthly averaging 
method was superior. 

The outstanding unanswered question 
at this point is why the monthly averaging 
method does so well in long-run forecast-
ing relative to the performance of more so-
phisticated methods. It may be due to the 
fact that sophisticated forecasting meth-
ods are sensitive to random fluctuations in 
library time-series data. Random errors in 
library time series such as monthly circulation 
totals spring from all manner of human 
and mechanical sources; they are 
akin to static interfering with a radio 
transmission. It is a popular theme in library 
literature to castigate library-output 
statistics for their lack of reliability and validity. 
Childers12 even portrays different 
types of library-output statistics as having 
different levels of random error based on 
the method of collection of the statistic. It 
would appear that high levels of randomness 
are preventing sophisticated forecasting 
techniques from modeling library 
circulation data closely and accurately. 
When sensitive methods are used to predict 
the future, their forecasts are wider of 
the mark than those of less sensitive methods. The 
phenomenon of the success of simpler 
methods has been observed in other stud-
ies13 comparing forecasting methods. The 
next step in researching library-output 
statistics would seem to be measuring the 
amount of randomness in library-output 
statistics. 

TABLE 2 
ANALYSIS OF VARIANCE OF MEAN ABSOLUTE 
PERCENTAGE ERROR FOR SIX-MONTH FORECASTS 

                             MAPE      SD 
Methods               N      (%)       (%) 

Monthly mean          15     12.38     10.71 
Dummy regression      15     15.30     13.51 
Simple mean           15     39.16     38.45 
Lag 6 Regression      15     39.63     42.08 

Analysis of Variance 
Source     df     SS        MS       F 

Factor      3      9864     3288     3.71 (p = .0166) 
Error      56     49652      887 
Total      59     59516 



REFERENCES 

1. Bruce L. Bowerman and Richard T. O'Connell, Time Series and Forecasting (North Scituate, Mass.: 
Duxbury Press, 1979). 

2. Morris Hamburg, "Statistical Methods for Library Management," in Ching-chih Chen, ed., Quantitative 
Measurement and Dynamic Library Service (Phoenix, Ariz.: Oryx, 1978), p.38. 

3. Spyros Makridakis and Steven C. Wheelwright, Forecasting Methods and Applications (New York, 
N.Y.: Wiley, 1978). 

4. Alan C. Filley and Robert J. House, Managerial Process and Organizational Behavior (Glenview, Ill.: 
Scott, Foresman, 1969), p.441. 

5. Miriam A. Drake, "Forecasting Academic Library Growth," College & Research Libraries 37:53-59 
(Jan. 1976). 

6. Jong Hoa Kang, "Approaches to Forecasting Demands for Library Network Services" (Master's 
thesis, Univ. of Illinois at Urbana-Champaign, 1979). 

7. Makridakis and Wheelwright, Forecasting Methods. 

8. Ibid., p.81-84. 

9. Ibid., p.217-20. 

10. Thomas A. Ryan, Brian L. Joiner and Barbara F. Ryan, Minitab Reference Manual (University Park, 
Pa.: Statistics Department, The Pennsylvania State Univ., 1982). 

11. Merle E. Meyer, A Statistical Analysis of Behavior (Belmont, Calif.: Wadsworth, 1976). 

12. Thomas A. Childers, "Statistics that Describe Libraries and Library Service," in Melvin J. Voigt, 
ed., Advances in Librarianship 5 (New York, N.Y.: Academic Press, 1975). 

13. S. Makridakis, A. Andersen, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. 
Parzen, and R. Winkler, "The Accuracy of Extrapolation (Time Series) Methods: Results of a Forecasting 
Competition," Journal of Forecasting 1:111-53 (Apr.-June 1982). 

