College and Research Libraries

TIM SAUER

Predicting Book Fund Expenditures: A Statistical Model

A statistical model is developed to predict, at any point in the fiscal year, the amount of money that will be spent, within a book acquisition system, on firm orders against a given fund, and thereby to predict the number of orders that must still be placed in order to spend that fund fully without overspending it.

Tim Sauer is assistant division head, processing division, McLaughlin Library, University of Guelph, Guelph, Ontario. The author is indebted to S. S. Swaminathan of the University of Guelph Computer Services for assistance with the statistical formulations.

ONE OF THE UNRESOLVED annual problems facing many library acquisition departments is how to ensure that each acquisition fund is fully spent at the end of each fiscal year.¹ The significance of this problem increases as the buying power of acquisitions budgets continues to shrink. Of the methods used to spend budgets exactly (that is, neither underspending nor overspending by more than a few percentage points), some border on the unethical, while others require vast expenditures of staff time.

In some of the more fortunate academic libraries, the problem exists only for the total acquisitions budget, since the accounting for the individual fund allocations is at the discretion of the library. Nonetheless, if the various book funds were allocated on some rational basis, each fund should be totally spent at the end of the fiscal year.

What is needed is a simple method of predicting, at any time in the course of the fiscal year, what the spent figure for each fund number will be at the end of that fiscal year, based on the spent figure to date and the orders that have been placed but not yet received. That is, for the items on order against each fund number, it should be possible to predict how many will be received and paid for by the end of the fiscal year. What is the model?
THE SYSTEM

Before a model can be developed, the system must be defined so that an empirically derived statistical table can be built from the history of orders placed, one that includes every possible outcome for an order. The system will apply only to firm orders; blanket orders and approval plans are excluded. Thus the scope of this model is effectively limited to those institutions where firm orders constitute a significant proportion of monograph expenditures. The model described may be applicable to approval plans where selection is done from agent-supplied approval forms, but since this method receives insignificant use at the University of Guelph library, it is omitted. For the present, serial and continuation orders and the problem of partial shipments are ignored.

What now remains are single-item orders for which the following may occur at any given time:

(a) the ordered item is received;
(b) some status report is received from the agent, publisher, etc.;
(c) for any of a number of reasons, the order is cancelled;
(d) an item is received on the order but is incorrect (a faulty copy, wrong edition, etc.);
(e) the item is not received.

The first possibility is the statistically trivial case but the whole object of the acquisition process. Assume that the remaining cases can all be reduced, for statistical purposes, to two possibilities: the order is cancelled, or the order is not received. That cases (b) and (d), above, reduce to cancellation or nonreceipt follows since a status report (b) can result in three actions:

1. The order is cancelled and either dropped completely or worked on further and reordered, whereupon it becomes a new order, reentering the system in the same way as any other order.
2. The status report is noted and nothing is done. If past statistics included this occurrence along with other slow items, then the present case is taken care of.
3.
The status report takes the order back to day one; that is, the status report date is used as the order date for calculations from then on.

An incorrect or faulty item, such as (d), above, can be accepted and a reorder placed for the item, whereupon the reordered item is in essence a new order and treated as such; or it can be rejected and returned. In this instance the order is handled either by using the return date as the date for calculation purposes or by ignoring the date and including this time delay in the statistical sample on which the calculations are made.

THE MODEL

Define:

A = Event: the order is received
B = Event: the order is outstanding
C = Event: the order is cancelled

For statistical purposes, these three events must be mutually exclusive. That is, the occurrence of one event must preclude the possible occurrence of either of the other two events. The only exception, in which the library cancels an order and the book is subsequently received, is due to an error of communication and is not a breach of the statistical requirement for mutual exclusion.

The system that is required is one that will provide the probability that a given order, which has been on order for a certain number of time units, will be received within a certain number of future time units (i.e., by the end of the fiscal year).

Define:

t = time units (days, weeks, months, or whatever)
$P_n(A \mid t=i)$ = Probability that event A (order received) will happen by age n time units, given that the order is already at age i time units, where n > i.
$P_n(B \mid t=i)$ = Probability that event B (order will not be received) will happen by age n time units, given that the order is already at age i time units, where n > i.
$P_n(C \mid t=i)$ = Probability that event C (order will be cancelled) will happen by age n time units, given that the order is already at age i time units, where n > i.
Obviously, $P_n(A \mid t=i) + P_n(B \mid t=i) + P_n(C \mid t=i) = 1$, since the sum of the probabilities of an order being cancelled, received, or not received for a given time period must be one.

Table 1 is a hypothetically simple history of ten orders. For convenience, call the time units "days." The probability of any one of the orders placed on the first day being received by the end of the fifth day is .8, since two orders are cancelled. The probability of an order that has not yet been received or cancelled at the end of the third day arriving on the fourth day would be .5, since there are only four orders outstanding and two arrived the next day. The probability of an order not received or cancelled at the end of the second day being received by the end of the fourth day is 5/7, since there were seven active orders at the end of the second day and three arrived on the third day while two more arrived on the fourth day.

TABLE 1
SIMPLE HISTORY OF TEN THEORETICAL ORDERS

Time units (Days)     1    2    3    4    5
A (Received)          0    2    3    2    1
B (Outstanding)      10    7    4    1    0
C (Cancelled)         0    1    0    1    0

College & Research Libraries • November 1978

In general, the equation is given by (1), below:

$$P_n(A \mid t=i) = \frac{\sum_{t=i+1}^{n} A_t}{B_i} \qquad (1)$$

where
$B_i$ = number of orders not received at day i
$A_t$ = number of orders received at day t

Table 2 is a more complicated example based on the orders placed with one library agent in a three-month period. The choice of analyzing a sample by agent will be discussed later. Since the data were collected manually, the time period used was half-monthly where biweekly would have been better, and all remaining twelve orders were assumed to be cancelled after 9.5 months. (In fact they were not, although in an automated system they would be at some point.)

On the basis of the sample illustrated in table 2, the probability that any or all orders currently active with this agent will be received can be calculated for a given date in the future.
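The three probabilities quoted for table 1 can be checked mechanically against equation (1). The following is a minimal sketch, not part of the original study; the function name `receipt_probability` and the list layout (index 0 left unused so that index t corresponds to day t) are assumptions made for illustration.

```python
from fractions import Fraction

# Table 1 history. Index 0 is unused padding so that received[t] is the
# number of orders received on day t and outstanding[t] is the number
# still outstanding at the end of day t.
received = [0, 0, 2, 3, 2, 1]
outstanding = [0, 10, 7, 4, 1, 0]


def receipt_probability(received, outstanding, i, n):
    """Equation (1): sum of A_t for t = i+1..n, divided by B_i."""
    if not i < n:
        raise ValueError("n must be greater than i")
    if outstanding[i] == 0:
        raise ValueError("no orders outstanding at age i")
    return Fraction(sum(received[i + 1 : n + 1]), outstanding[i])


print(receipt_probability(received, outstanding, 1, 5))  # 4/5
print(receipt_probability(received, outstanding, 3, 4))  # 1/2
print(receipt_probability(received, outstanding, 2, 4))  # 5/7
```

Exact fractions are used here only so that the results can be compared directly with the .8, .5, and 5/7 given in the text.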
If the end of the fiscal year were three months away (i.e., six half-months), then the probabilities that orders five, eight, and ten half-months old will be received before that date are given by calculations (2), (3), and (4), which follow table 2.

TABLE 2
BOOK AGENT'S SUPPLY HISTORY FROM A THREE-MONTH SAMPLE

t (½ month)   A (received)   B (not received)   C (cancelled)
     0              0             1735                 0
     1              8             1726                 1
     2             98             1627                 1
     3            430             1194                 3
     4            452              737                 5
     5            262              468                 7
     6            123              339                 6
     7             64              269                 6
     8             48              213                 8
     9             44              162                 7
    10             29              127                 6
    11             24               96                 7
    12             22               68                 6
    13             11               53                 4
    14              7               43                 3
    15              2               38                 3
    16              4               31                 3
    17              7               20                 4
    18              4               13                 3
    19              1                0                12

$$P_{11}(A \mid t=5) = \frac{\sum_{t=6}^{11} A_t}{B_5} = \frac{A_6+A_7+A_8+A_9+A_{10}+A_{11}}{468} = \frac{123+64+48+44+29+24}{468} = \frac{332}{468} = .709 \qquad (2)$$

$$P_{14}(A \mid t=8) = \frac{\sum_{t=9}^{14} A_t}{B_8} = \frac{44+29+24+22+11+7}{213} = \frac{137}{213} = .643 \qquad (3)$$

$$P_{16}(A \mid t=10) = \frac{\sum_{t=11}^{16} A_t}{B_{10}} = \frac{24+22+11+7+2+4}{127} = \frac{70}{127} = .551 \qquad (4)$$

USES

An especially significant calculation is the probability that an order placed today will be received by the end of the fiscal year. This is shown in calculation (5), below, where the fiscal year end is six half-months away.

$$P_6(A \mid t=0) = \frac{\sum_{t=1}^{6} A_t}{B_0} = \frac{8+98+430+452+262+123}{1735} = \frac{1373}{1735} = .791 \qquad (5)$$

By doing this calculation using equation (1) for every order that is still outstanding, multiplying each probability by the respective estimated cost for that item, and then totalling these figures by fund number and adding in the actual spent figure to date for that fund, a probable spent figure (PSF) for the end of the fiscal year is obtained. The value of this figure is immediately obvious.
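The procedure just described can be sketched in a few lines. This is an illustrative sketch only: the function names, the fund codes, the order costs, and the spent-to-date figures are all hypothetical, and the treatment of orders older than the table covers (probability zero) is an assumption that the article does not address. Only the A and B columns of table 2 are taken from the source.

```python
# Table 2 history for one agent, indexed by order age t in half-months.
RECEIVED = [0, 8, 98, 430, 452, 262, 123, 64, 48, 44,
            29, 24, 22, 11, 7, 2, 4, 7, 4, 1]
OUTSTANDING = [1735, 1726, 1627, 1194, 737, 468, 339, 269, 213, 162,
               127, 96, 68, 53, 43, 38, 31, 20, 13, 0]


def receipt_probability(age, horizon):
    """Equation (1) with i = age and n = age + horizon."""
    # Assumption (not in the article): ages beyond the table, or ages
    # with nothing outstanding, are given probability zero.
    if age >= len(OUTSTANDING) or OUTSTANDING[age] == 0:
        return 0.0
    last = min(age + horizon, len(RECEIVED) - 1)
    return sum(RECEIVED[age + 1 : last + 1]) / OUTSTANDING[age]


def projected_spend(orders, spent_to_date, horizon):
    """Return {fund: spent to date + expected spend on outstanding orders}."""
    totals = dict(spent_to_date)
    for fund, age, estimated_cost in orders:
        expected = receipt_probability(age, horizon) * estimated_cost
        totals[fund] = totals.get(fund, 0.0) + expected
    return totals


# Calculations (2) to (5): the fiscal year end is six half-months away.
for age in (5, 8, 10, 0):
    print(age, round(receipt_probability(age, 6), 3))
# prints 0.709, 0.643, 0.551, and 0.791 for ages 5, 8, 10, and 0

# Hypothetical orders: (fund, age in half-months, estimated cost).
orders = [("HIST", 5, 20.00), ("HIST", 8, 15.00), ("CHEM", 10, 30.00)]
spent_to_date = {"HIST": 1200.00, "CHEM": 800.00}
print(projected_spend(orders, spent_to_date, horizon=6))
```

A single supply history is used for every order here; in practice a separate table would presumably be kept for each agent or order grouping.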
The probable spent figure tells us the amount of money that must be committed in order to spend the budget to 100 percent by the end of the fiscal year, as shown in equation (6), below:

$$U = \text{Allocation} - (\mathrm{PSF} + \mathrm{ASF}) \qquad (6)$$

where
U = Probable amount of money not spent (or overspent) at the end of the fiscal year (probable fiscal end balance)
PSF = Probable spent figure, obtained by adding the probable spent figures for each outstanding order in that allocation
ASF = Actual spent figure for that allocation to date

The probable fiscal end balance, if negative, implies that new ordering should stop for the time being; if positive, it implies that more orders should be processed. In fact, given the results of equation (5), above, and the average cost of a book for that fund account, the approximate number of orders that must be placed immediately to spend that fund completely is given by equation (7):

$$N = \frac{U}{P_f(A \mid t=0) \times Av} \qquad (7)$$

where
N = Number of orders to be placed immediately
$P_f(A \mid t=0)$ = Probability of an order, placed immediately, being received by the end of the fiscal year, f time units away
Av = Average cost of a book for that fund number

Such figures for all fund numbers would be of great assistance to acquisition librarians, who are constantly attempting to optimize the use of limited staff throughout the year.

LIMITATIONS

The above approach ignores three types of orders, as mentioned at the beginning of the argument: partial shipments, continuation orders, and serial orders.

Partial shipments must be a part of the system by virtue of the fact that it is not known that a given order will be a partial shipment until it is received. The part of the order that arrives is taken care of, and the remaining part that is to follow can be considered either as a new order or as a continuation of the old order.
Experience seems the best criterion for judging, and provided that the same choice is always used in a given order type grouping, the statistics will reflect this occurrence.

Neither continuation orders nor serial orders can be accommodated by the present model. However, a similar model seems possible for both of these types of orders, based on the past received patterns for each order and reducing the total encumbered figure for each fund account by some statistically sound procedure. Possible models for these areas are being explored.

The use of the above method is based on the assumption that the history of receipt of orders in the past will be repeated in the future. Obviously, orders may be grouped by type in order to maximize this possibility. A relatively large number of orders, however, is needed in as short a time period as possible to minimize changes with time and to keep the group statistically significant. There are at least six possible fac- t