DOT-RSPA-DPB-50-79-31

Development of Hybrid Cost Functions from Engineering and Statistical Techniques: The Case of Rail

DECEMBER 1979
FINAL REPORT
Under Contract: DOT-OS-70061

Document is available to the U.S. public through The National Technical Information Service, Springfield, Virginia 22161

U.S. DEPARTMENT OF TRANSPORTATION
Research & Special Programs Administration
Office of University Research
Washington, D.C. 20590

NOTICE

This document is disseminated under the sponsorship of the Department of Transportation in the interest of information exchange. The United States Government assumes no liability for its contents or use thereof.

Technical Report Documentation Page

1. Report No.: DOT/RSPA/DPB-50/79/31
4. Title and Subtitle: DEVELOPMENT OF HYBRID COST FUNCTIONS FROM ENGINEERING AND STATISTICAL TECHNIQUES: THE CASE OF RAIL - FIRST YEAR FINAL REPORT
5. Report Date: JANUARY 1980
7. Author(s): Andrew F. Daughety and Mark A. Turnquist
9. Performing Organization Name and Address: Northwestern University, The Transportation Center, Evanston, IL 60201
11. Contract or Grant No.: DOT-OS-70061
12. Sponsoring Agency Name and Address: U.S. Department of Transportation, Research and Special Programs Administration, Office of University Research, Washington, D.C. 20590
13. Type of Report and Period Covered: First Year Final Report
14. Sponsoring Agency Code: DPB-50
15. Supplementary Notes: Technical Monitor: Joel Palley, RPD-32
16. Abstract

Meaningful economic analysis of public policy and resource allocation in the transportation industries requires empirical understanding of costs. Two traditional methods of acquiring such knowledge are: 1) engineering techniques based on detailed models of operations, and 2) statistical analysis based on expenditure and output records of firms.
The objective of this research is to develop a method for combining these approaches, in order to provide accurate, meaningful models of a cost function for a railroad. The resulting hybrid cost function incorporates detailed information on both operations and non-operations activities so as to provide a more complete picture of the relation of costs to outputs produced and inputs purchased and used by the rail firm. Thus hybrid models reflect costs associated with yard and linehaul activity as well as marketing, planning and other non-operations elements of the firm. This links together management, train crews, yard activity, maintenance, financial considerations, etc. as elements of cost generation in the firm.

A general methodology for integrating engineering and economic approaches to cost analysis is developed and applied to data collected from a rail firm. A multi-output, short-run variable cost function is estimated and examined. Concepts of economies of size, configuration and density are clarified and used to examine theoretical effects of regulation on firm maintenance policy.

17. Key Words: transportation cost modelling; railroad cost analysis; scale economies; railroad operations analysis
18. Distribution Statement: Document is available to the U.S. public through the National Technical Information Service, Springfield, Virginia 22161
19. Security Classif. (of this report): UNCLASSIFIED
20. Security Classif. (of this page): UNCLASSIFIED
EXECUTIVE SUMMARY

INTRODUCTION

An understanding of the nature of costs of production is important in every regulated industry, both for individual firms and their regulators. At the most basic level a firm will require cost data for corporate planning. For example, a firm may wish to know what size plant to build, whether to upgrade the quality of plant or whether, at an existing tariff, the revenues for a service cover the incremental cost of providing the service.

Regulators and other policy makers also have many reasons to seek improved information about costs. When examined correctly, cost data can be used to determine whether there are in fact economies of scale in production, and whether regulation is a necessary tool of social control in a given industry. Regulators often ask whether a service is being subsidized by other services of a multiproduct firm, is subsidizing other services, and whether the provision of service by one mode will eliminate another mode over a given route.

PROBLEM STUDIED

Previous railroad cost studies typically have examined a cross section of Class I railroads, using ICC data, and most have assumed a single product, usually total ton-miles.
Several aspects of these studies have served to limit the inferences that can be drawn. They rely on data from the ICC accounts rather than on raw data from the firm. With few exceptions, they have specified a relatively simple functional form for costs, and assert that the form is appropriate without a test of that assertion. Few adjust for quality of service, and more importantly, many do not account for the multiproduct nature of virtually every rail firm. Finally, they do not attempt to adjust for the fact that some railroads operate with a more complicated network than others.

Our own research on railroad transport costs represents a strikingly different approach to the problem for a number of reasons.

1) Our analysis begins at the level of an individual firm, and uses cost and production data obtained directly from the firm rather than from the ICC. This has a number of important advantages, including the avoidance of arbitrary cost allocations of the sort often found in the ICC accounts. We employ a time series analysis for a single firm rather than a cross-sectional analysis for a particular year.

2) The multiproduct nature of the firm is incorporated into the analysis. Output will be characterized both by the volume of freight hauled and by the average speed of a shipment through the system. Models with disaggregated volume (by commodity type) have been estimated, as well as with aggregate data. We explicitly recognize that speed of service is an important determinant of rail costs, and include this in our estimates.

3) We use information about the underlying technological production process, developed through engineering process functions, to better specify the nature of technology and to improve the efficiency of our estimates.

In several respects the last point is particularly novel.
Historically, most econometric estimates of cost functions have ignored valuable information generated from an analysis of engineering process functions to provide observations of service-related variables. We have labeled this a "hybrid" approach for that reason, and we believe that important new insights can be gained from applications of this technique to other modes, as well as in rail transport.

RESULTS ACHIEVED

A short-run variable cost function incorporating commodity flow information, service characteristics, factor prices and a measure of the plant quality was developed and estimated from data for a railroad. Engineering models of linehaul and yard activity were used to provide information on the average speed of a shipment through the system. The models of the linehaul allowed for grade differences on linehaul sections, track quality, trailing load of the train, amount of available locomotive power and delays due to congestion on single-track sections. The yard activity model predicted expected waiting time in the yard. This information, along with information on amounts of commodities moved, quality of plant and prices of factors such as cars, locomotives, crews, non-crew labor and fuel, was used to provide an estimated cost model.

Three questions were examined, based on the estimated model:

1) Does engineering model information contribute significantly to model performance and quality?
2) How do the various terms in the cost functions influence predicted short-run costs?
3) Does the data support the use of some of the simpler production models often resorted to in earlier analyses?

First, a test of the value of the engineering information was constructed and performed. The result was that the introduction of engineering information significantly improved the model. This acts as a test of the value of a hybrid approach to cost analysis and confirms its superiority to traditional economic or engineering models.
Second, the impact on cost of changes in factor prices, commodity flow level, speed and plant quality was examined. The elasticities of cost with respect to factor prices reveal the somewhat surprising result that the firm cannot easily substitute away from non-crew labor into capital as the non-crew labor price rises. This reflects the need for further computerization of the firm. Increases in speed would reduce short-run variable costs. These reductions would largely come about from improvements in linehaul condition, as was also seen in the negative elasticity of cost with respect to plant quality. Marginal commodity flow costs were positive, as expected.

Third, the model structure admitted testing of various sub-cases such as Cobb-Douglas production technology, which were rejected. Thus, previous studies that have started out assuming such models are likely to be misspecified and produce biased results. This finding is consistent with very recent literature in the area of cost and production theory and estimation.

The project also provided two theoretical results. First, a procedure for properly integrating economic and engineering models was developed. The procedure provides the general form of the cost function to be estimated and classifies the model variables as to whether or not they are to be provided by engineering process models. This is a general procedure which can be used in many other analyses.

The second theoretical result concerns a clarification of economies of scale. Economies (and diseconomies) of scale are discussed in terms of economies (and diseconomies) of size, configuration and density. These concepts are defined and used to relate firm-level maintenance decisions to regulatory constraints on service abandonment.

UTILIZATION OF RESULTS

Potential users of the research results include both government agencies and private railroad firms.
In fact, officials of railroad "X", which cooperated with us on the empirical work in this project, have already expressed a desire to use our results with respect to marginal cost computations as a basis for submitting a proposed rate. Certainly this is evidence of the immediate applicability of the work to problems faced by rail firms. It is not, however, the only way in which the results could be used by railroads. Significant insights have also been gained with respect to cost elasticities for various input factors. These elasticities have important implications for corporate planning.

From a government perspective, this work provides an important step toward operationalizing the concept of "incremental" costs as a basis for policy-making and regulatory proceedings. This concept plays a central role in the regulatory reform legislation currently in Congress. By making the concept operational (specifying data requirements and analysis procedures) regulatory proceedings and policy-making can better incorporate economic principles.

CONCLUSION

A procedure for integrating economic and engineering approaches to cost analysis was developed and employed to use data from a medium-size railroad to estimate a model relating short-run variable costs to commodity flows, speed, factor prices and plant quality. Engineering models of yard and linehaul activity were used in a general cost model incorporating financial and engineering data. The data requirements for the study were not unreasonable; almost everything that was needed was already maintained by the firm.

The results of the research are useful to both railroad firms and policy makers. The cooperating railroad in this study is presently evaluating the estimation results for possible inclusion in a rate proceeding. Moreover, this study is a step toward providing an operational definition of incremental costs.
Acknowledgments

Major contributions to the inception and development phases of this project were made by Ronald R. Braeutigam (at this writing, with Northwestern University, on leave to California Institute of Technology). Special thanks to the executives and staff of Railroad X who provided access to data and help in understanding what we had stumbled upon. A number of them put in no small amount of time digging out and assembling information, without which this project would not have been possible. Thanks also to William Delaney who, while with the Federal Railroad Administration, was our contract monitor, and was without par.

We also thank Robert Baesemann, formerly of Northwestern University, who was instrumental in starting the project and Greg Duncan, who spent a short, but productive period beating on data requirements issues with us. Joseph Swanson of Northwestern has been invaluable as a sounding board for ideas and as a source of information on calculating prices of capital goods. The Transportation Center contributed significantly with financial support to augment the original contract from USDOT. Without this help we would still be transcribing data.

Thanks to Ken Hurdle of the AAR for providing some very needed data, and especial thanks to Melissa Bartlett, John Gray, Victor Liu and Hossain Poorzahedy, who slaved over much of the data set helping piece it together. Finally, thanks to all those who contributed to the typing of the various papers and parts that have gone into this project, especially Theresa Bonk who plowed through pages upon pages of manuscript, providing her usual excellent typing and editing job.

This work was sponsored by the U.S. DOT, Research and Special Programs Administration, Office of University Research under contract DOT-OS-70061.

TABLE OF CONTENTS

1. INTRODUCTION
   1.1 Other Railroad Cost Estimates
   1.2 A Time Series Estimate of a Railroad Cost Function
2.
HYBRID ANALYSIS
   2.1 Problems to be Solved
   2.2 Technological Economies: Size, Configuration and Density Economies of Scale
   2.3 Hybrid Cost Models and the Use of Engineering Information
   2.4 Engineering Process Models of Linehaul Train Movement and Yard Operations
3. DESCRIPTION OF THE DATA FOR RAILROAD X
   3.1 General Overview of Data Needs
   3.2 Flow
   3.3 Prices of Variable Factors
   3.4 Fixed Factor Levels
   3.5 Data Used in the Engineering Analysis
   3.6 Cost
4. THE MODEL TO BE ESTIMATED AND THE ESTIMATION RESULTS
   4.1 The Model
   4.2 Estimation Results
5. RESULTS AND IMPLICATIONS
   5.1 Introduction
   5.2 On the Statistical Evaluation of Results
   5.3 A Test of the Hybrid Approach
   5.4 Implications of the Model
6. SUMMARY AND DIRECTIONS FOR FURTHER RESEARCH
   6.1 Summary of First Year's Results
   6.2 Plans for the Second Year
BIBLIOGRAPHY
APPENDIX A - A MORE DISAGGREGATE COST FUNCTION
APPENDIX B - PRODUCTION AND COST: THEORY AND EXAMPLES

LIST OF FIGURES

Figure 1 - ECONOMIES OF SCALE
Figure 2 - CLASSICAL LONG-RUN AVERAGE COST
Figure 3 - SCALLOPED LONG-RUN AVERAGE COST
Figure 4 - CONSTRUCTING A HYBRID PRODUCTION SURFACE
Figure 5 - THE RESULTING HYBRID PRODUCTION SURFACE
Figure 6 - LINEHAUL MODELS
Figure 7 - CLASSIFICATION YARD MODELS
Figure 8 - TRACTIVE FORCE AS A FUNCTION OF VELOCITY
Figure 9 - EXAMPLE TRACK PROFILE
Figure 10 - TWO YARDS CONNECTED BY MAINLINE
Figure 11 - EXAMPLES OF ERLANG DISTRIBUTIONS FOR DIFFERENT VALUES OF k
Figure 12 - CONFIGURATION AND MAINTENANCE POLICY
Figure 13 - DENSITY ECONOMIES AND MAINTENANCE POLICY

CHAPTER 1
INTRODUCTION

An understanding of the nature of costs of production is important in every regulated industry, both for individual firms and their regulators. At the most basic level a firm will require cost data for corporate planning.
For example, a firm may wish to know what size plant to build, whether to upgrade the quality of plant or whether, at an existing tariff, the revenues for a service cover the incremental cost of providing the service. Cost data may be used to argue for a change in tariffs. A firm may want to know how a change in the level of output of one service affects the costs of providing another service, and it may rely in part on cost data to determine whether it would be profitable to discontinue a service, introduce a new service, or attempt to merge with another firm.

Regulators and other policy makers also have many reasons to seek improved information about costs. When examined correctly, cost data can be used to determine whether there are in fact economies of scale in production, and whether regulation is a necessary tool of social control in a given industry. Regulators often ask whether a service is being subsidized by other services of a multiproduct firm, is subsidizing other services, and whether the provision of service by one mode will eliminate another mode over a given route. If regulators are interested in setting tariffs that allocate economic resources efficiently, they will require information about costs. Generally speaking, then, regulators need cost information to determine how their policies will affect market structure and economic performance. These comments apply without exception to the railroad industry.

1.1. Other Railroad Cost Estimates

A number of studies have examined costs in the railroad industry. The early work in this area attempted to characterize the output of railroads as a single product, usually ton-miles. These studies typically have examined a cross section of Class I railroads, using ICC data, to test whether there are economies of scale in rail transport. The results have generally been mixed. For example, Klein [50] used 1936 data to find economies of scale that were statistically significant, though modest.
On the other hand, estimates by Borts [8] and Griliches [38] have suggested that, while there may be economies of scale for smaller railroads, scale economies are not prevalent for larger Class I railroads.

Several aspects of these studies have served to limit the inferences that can be drawn. They rely on data from the ICC accounts rather than on raw data from the firms. They typically specify a relatively simple functional form for costs, and assert that the form is appropriate without a test of that assertion. They do not adjust for quality of service, and more importantly, they do not account for the multiproduct nature of virtually every rail firm. And, they do not attempt to adjust for the fact that some railroads operate with a more complicated network than others.

Keeler [49] and Hasenkamp [42] used approaches grounded in production theory to examine multi-product aspects of railroad activities, distinguishing between freight and passenger activities. Using more sophisticated analysis, Brown, Caves and Christensen [9] and Friedlaender and Spady [32] develop models that allow multiple outputs and do not enforce separability of inputs and outputs. Caves, Christensen and Swanson [13] have also used such techniques to examine productivity growth in U.S. railroads. In all these cases cross-section data drawn from ICC reports or based on Klein's work [50] has been used. Thus railroads with rates of return varying between -10% and +40%, facing different geography, having different mixes of equipment, customers and managerial perspectives were mixed together in the estimation process. No real use could be made of service variables such as speed, since such data is firm-specific and not usually published. While the above studies have represented important advances in understanding of costs, more work is needed, especially at the level of the individual firm.

1.2.
A Time Series Estimate of a Hybrid Cost Function

Our own research on railroad transport costs represents a strikingly different approach to the problem for a number of reasons.

1) Our analysis begins at the level of an individual firm, and uses cost and production data obtained directly from the firm rather than from the ICC. This has a number of important advantages, including the avoidance of arbitrary cost allocations of the sort often found in the ICC accounts. (For a discussion of the kinds of problems arising from the use of ICC data, see, for example, Friedlaender [30], Appendix A.) We employ a time series analysis for a single firm rather than a cross-sectional analysis for a particular year.

2) The multiproduct nature of the firm is incorporated into the analysis. Output will be characterized both by the volume of freight hauled and by the average speed of a shipment through the system. Models with disaggregated volume (by commodity type) have been estimated, as well as with aggregate data. We explicitly recognize that speed of service is an important determinant of rail costs, and include this in our estimates.

3) We use information about the underlying technological production process, developed through engineering process functions, to better specify the nature of technology and to improve the efficiency of our estimates.

In several respects the last point is particularly novel. Historically, most econometric estimates of cost functions have ignored valuable information generated from an analysis of engineering process functions to provide observations of service-related variables. We have labeled this a "hybrid" approach for that reason, and we believe that important new insights can be gained from applications of this technique to other modes, as well as in rail transport.

The plan of this report is as follows.
Chapter two represents the extension of the theory of production and cost (reviewed in Appendix B) to the problem of integrating (or hybridizing) economic and engineering approaches. Here we examine concepts of economies of size, configuration and density, a new view of scale economies that comes about from economic/engineering insights gained in the project. These insights also provide the general method for constructing a hybrid cost function developed in the chapter. Finally, the chapter provides a complete analysis of the engineering models to be used. The third chapter provides an overview of the data used while the fourth chapter presents the estimation results (other estimation results on a disaggregate volume model are presented in Appendix A). Chapter five analyzes the results and draws out implications for rail cost analysis. Plans for second year activity on the project are discussed in Chapter 6.

CHAPTER 2
HYBRID ANALYSIS

2.1 Problems to be Solved

In Appendix B we review the general elements of production and cost theory. The analysis contained therein is for a general firm. When, however, we become more specific in the type of firm that we wish to analyze, we can then refine and expand those notions. In our case we will consider a rail firm, and though most (if not all) of what we develop will be immediately transferable to other modes, we will couch most of our discussion in terms of a railroad.

The purpose of this project has been to examine the feasibility and value of linking economic and engineering approaches to cost estimation and apply it in a case study of a railroad; chapters three and four consider the case study. Before proceeding to the case study two fundamental issues must be addressed.

1. Can one be more specific about the relationships between long and short-run cost functions? What are the natural factors to consider as constant in the short-run and how can one discuss economies of scale?
2.
How should engineering information be merged with economic information: is there a way to structure the integration so that a general procedure is developed?

These questions are interrelated: the first question raises the issue of what is the short-run cost function and how does it relate to the long-run cost function; we shall see that insights from the engineering perspective provide the clue. The second question concerns the integration or hybridization of engineering information into the economic model of cost: we shall see that insights from the economic perspective provide the procedure.

2.2. Technological Economies: Size, Configuration and Density Economies of Scale

Economies of scale have been of interest to economists and policy makers for centuries. Adam Smith spent three chapters of his Wealth of Nations discussing specialization and the division of labor in production. A number of categorizations and definitions of economies of scale have been developed. Generally, the view was that economies of scale existed if average costs fell as output was expanded. This raises, however, the disturbing problem of what is an average cost in a multi-product firm if we cannot refer to any one of the outputs as the output (or if no aggregation function on output exists). The result is that economies-of-scale definitions for multi-product firms now deal with the technological description and not the cost function. Note that this ignores pecuniary economies of scale [72] that are associated with the ability of a large firm to affect the prices at which it purchases inputs. This type of economy is ignored based on the usual presumption of given factor prices (see section B.2), i.e. the firm faces perfectly competitive factor markets.

Panzar and Willig [64] provide the following definition of economies of scale.
    A technology $T$ with associated transformation function $T(z,x)$ has economies of scale at $(x,z)$ if there exist $r > 1$ and $\delta > 1$ such that

        $T(\lambda^r z, \lambda x) \le 0$ for $1 \le \lambda \le \delta$.

This is a local definition, i.e. for the point $(x,z)$. The exponent $r$ may be a function of $x$ and $z$. All that is required is that the point $(\lambda x, \lambda^r z)$ be in the technology for a small neighborhood of $\lambda$. Figure 1 shows a technology $T$ and two points $(x^1, z^1)$ and $(x^2, z^2)$. There are no economies of scale at $(x^1, z^1)$, while there are economies at $(x^2, z^2)$. It is this definition that provides the measure of scale economies $S$ in section B.2.2.3.

Figure 1 - ECONOMIES OF SCALE

When we become specific, however, about the application of the theory it is important to differentiate between different mechanisms that give rise to technological economies (as opposed to pecuniary economies) of scale. In this section we will examine, and attempt to relate, concepts of scale economies due to size, to configuration and to density.

Scale economies due to size come about from the distant and varied geographical points that a transport system (such as a railroad or a modern motor common carrier firm) connects. Measures such as average length of haul tend to be used to reflect this type of size. Size, in this sense, is important since it opens many markets to the firm and shippers who must send goods long distances usually prefer to work with as few transport firms as possible so as to expedite claims on loss and damage. Large size may, or may not, be accompanied by intensive utilization (or high traffic density) of the system.

In between the issues of size and density are the firm's policies on configuration and system maintenance. For any system of a given size, there may be many ways to configure the actual system itself. The same set of demand points can be serviced by a minimally connected network (e.g.
a tree network wherein each demand point is connected to at most two other points, and for n demand points there are (n-1) "arcs" connecting the points), a hub-and-spoke network (e.g. a yard connected directly to each demand point), or a completely interconnected system (each demand point connected to each other demand point). There are, of course, many other possible configurations of a system. Changes in configuration can occasion changes in operating policies (such as blocking policies) and changes in capital utilization (such as the use of cars). Some of the changes will result in economies of scale, some in diseconomies. It is important to note that in general such changes are discrete in nature: there may be severe lumpiness in such changes which may not be smoothable by other input changes.

To see this we consider Figures 2 and 3. Here we have assumed one output. Figure 2 shows the classical treatment of the long-run average cost curve (LRAC) as the envelope of the short-run average cost curves. The short-run curves are labelled $x^f_i$, where these represent different levels of the fixed inputs. Because the fixed inputs are assumed to vary smoothly, LRAC is also smooth. Equation (B.11) of section B.2.1 shows the use of the envelope theorem to derive the long-run total cost curve from the short-run curves.

Figure 2 - CLASSICAL LONG-RUN AVERAGE COST (axes: average cost against output z)

Figure 3 - SCALLOPED LONG-RUN AVERAGE COST (axes: average cost against output z)

If the fixed factors are lumpy then we still have an LRAC curve, but it may be scalloped, as shown in Figure 3. Here there is no intermediate value of $x^f$ between adjacent levels (or below the lowest level). The dotted curves represent those portions of the short-run average cost curves that are not part of LRAC.
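The contrast between the smooth envelope of Figure 2 and the scalloped envelope of Figure 3 can be illustrated numerically. The following is a minimal sketch, not the report's model: the short-run average cost form `r*k/z + a*z/k`, the parameter values, and the three fixed-input levels are all hypothetical choices made only because they give a closed-form smooth envelope.

```python
import math

def srac(z, k, r=1.0, a=1.0):
    """Hypothetical short-run average cost at output z with fixed-input level k.

    Assumed form (illustrative only): fixed cost r*k spread over z,
    plus a variable component a*z/k that rises as the plant is crowded.
    """
    return r * k / z + a * z / k

def lrac_discrete(z, levels, r=1.0, a=1.0):
    """Scalloped LRAC: lower envelope over a finite set of fixed-input levels."""
    return min(srac(z, k, r, a) for k in levels)

def lrac_smooth(z, r=1.0, a=1.0):
    """Smooth LRAC when k may vary continuously.

    For the assumed form the minimising level is k = z*sqrt(a/r), which gives
    a constant average cost of 2*sqrt(a*r) at every output z.
    """
    return 2.0 * math.sqrt(a * r)

LEVELS = (1.0, 2.0, 4.0)  # hypothetical lumpy plant sizes

if __name__ == "__main__":
    for z in (1.0, 1.5, 2.0, 3.0, 4.0):
        print(z, round(lrac_discrete(z, LEVELS), 4), round(lrac_smooth(z), 4))
```

Under these assumptions the two envelopes coincide only at outputs where a discrete plant size happens to be the cost-minimising one (the tangency points); everywhere else the scalloped curve lies above the smooth one.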
Notice also that if we had incorrectly assumed $x^f$ to be continuously variable then our estimated LRAC curve would almost always underestimate long-run average costs, since only the tangencies between the short- and long-run curves would not be underestimated. Clearly, as the possible values of $x^f$ become denser (i.e. as $x^f$ can be varied in smaller jumps) the underestimation becomes less pronounced. Since, however, configuration changes can imply significant land acquisitions or disposals (and other types of lumpy inputs) we expect that such changes are quite discrete in nature and that Figure 3 properly represents the long-run average cost curve.

Finally, for a given size and configuration there may be economies of density. Stigler [77] observes that there may be times when inputs would be more fully utilized, but are not, producing excess capacity:

    "There may be some unavoidable 'excess capacity' of some inputs. A railroad has a tunnel which is essential for given traffic, but can handle twice as much traffic. The emphasis here is on 'unavoidable.' If the railroad has unused locomotives, in the long run they can be sold or worn out, and hence do not give rise to increasing returns." (p. 153).

In the case of transport systems, especially rail, the density of traffic on the line-haul portion may be low relative to the line-haul capacity. Keeler [49] and Harris [41] have found significant economies of density for U.S. railroads due to excess track capacity (in both these articles, economies of scale are broken down into size and density only). Thus, for example, the elimination of double tracking on some line-haul segment (a change in system configuration) might result in increased traffic density (especially if the original traffic density is light). Density economies can be realized by increases in traffic density for a fixed configuration.
Changes in configuration that result in increasing the density of traffic on a particular piece of track (without increasing the flow through the end-points) would appear to be appropriately labelled configuration economies rather than density economies. On the other hand, changes in operating policies that result in high traffic density for the same configuration should be associated with economies of density.

In summary, then, economies of size can come from changes in size holding the nature of the output fixed (i.e. we ignore diversification of the firm into markets other than that for transport services). Economies of configuration can come from changes in system configuration (number of yards, location, interconnectedness, double vs. single line-haul tracking), while economies of density arise from varying traffic levels within a configuration.

2.3. Hybrid Cost Models and the Use of Engineering Information

In Appendix B we explain how a cost function C(z,p) is derived from the cost minimization problem:

(CMP)  $\min\; p'x$  s.t.  $T(z,x) \le 0$

where x is an n-vector of inputs, z an m-vector of outputs, p an n-vector of given prices and T(z,x) a transformation function. It is useful to characterize the output of transportation firms in general, and railroads in particular, in terms of both the physical units of flow over the system (e.g. car-miles of various commodities) and measures of the quality of service provided (e.g. average speed, reliability, loss and damage, etc.). In mathematical terms, let $z = (y,\pi)$ where y is an $m_1$-vector of flows and $\pi$ is an $m_2$-vector of service characteristics with $m_1 + m_2 = m$. Let $T(z,x) = T(y,\pi,x)$ be of an arbitrary general form, presently unspecified.

Most previous cost analyses have fallen into one of two categories.
On one hand, economists have developed cost models of firms or industries that have generally ignored models of the physical process associated with the firm's activities (an exception is [58]). The strength of the economists' models lay in the recognition that non-operations activities (such as planning, sales, etc.) contributed to output and cost. These things were, to some degree, captured by the economist's model. On the other hand, engineering models of the operations aspect of a firm provide excellent information, but an incomplete picture of the firm. These process models (see [20], [21] for examples in the transportation area) specify certain relationships between inputs (x), flows (y) and characteristics ($\pi$). For example, one of the process models to be presented in section 2.4 relates trailing load (a y-variable) to locomotive horsepower (an x-variable) and speed (a $\pi$-variable).

Thus, another way to view the overall production process of the firm is to "layer" physical relationships as constraints on some very general transformation function. The transformation function is the "glue" that holds the system together, including inputs and outputs that are not definable in process functions. Notationally, we then have the following description of technology:

$T(y,\pi,x) \le 0$    (1)

$g^i(y,\pi,x) \le 0, \quad i = 1,\dots,I$    (2)

where some parts of the y, $\pi$ and x vectors may not appear in the $g^i$ functions at all; i.e., each $g^i$ function is a process function and as such may not use all the variables. It may also occur that some of the inequalities are really equalities.

Suppose we dropped $T(y,\pi,x)$ and partitioned the x-vector into subsets, $x^i$, so that $x = (x^1, x^2, \dots, x^I)$, and each $g^i(y,\pi,x)$ was a function of only one element of the output vector and one subset of the input vector:

$g^i(z_i, x^i) = 0, \quad i = 1,\dots,I$    (3)

We would then have the case of non-joint production (see [40]). We will not assume this extreme case.
Instead, possibly overlapping subvectors of y, $\pi$ and x appear in the $g^i$ functions and the entire vectors appear in T. So, for convenience, let $H(y,\pi,x)$ be defined as the set of y, $\pi$ and x values that satisfy the joint conditions. We will then write the system as

$H(y,\pi,x) \le 0$    (4)

As a simple example, let us consider each vector to have only one element and assume a transformation function:

$T(Y,\Pi,X) = Y + \Pi + X - 1$    (5)

and one process function:

$g(Y,\Pi,X) = \Pi - .5$ .    (6)

This system is illustrated in Figure 4. The result is the trapezoidal solid in Figure 5 which we will call $H(Y,\Pi,X)$. Notice two points: 1) $H(Y,\Pi,X)$ is more refined than $T(Y,\Pi,X)$ since we have added the information contained in $g(Y,\Pi,X)$; 2) $H(Y,\Pi,X)$ is more comprehensive than $g(Y,\Pi,X)$ since we have not neglected relationships between variables not addressed by $g(Y,\Pi,X)$.

[Figure 4 - CONSTRUCTING A HYBRID PRODUCTION SURFACE]

Therefore, engineering relationships help us refine a very general structure into a more specific structure, not by making functional assumptions (Cobb-Douglas, CES, etc.) but by the use of physical principles to restrict the possible relationships among the variables. As more and more $g^i$ functions are added, the technology becomes better defined.

What does this mean for the analysis at hand? Consider the cost function dual to $\min \{p'x \mid H(y,\pi,x) \le 0\}$, namely $C(y,\pi,p)$. We can estimate this cost function by assuming, for example, a flexible functional form that approximates an arbitrary production function. The variables in the function would be flows of goods, prices of inputs and characteristics of the service provided. It is the last category especially to which we now turn.

The output of a rail firm is stochastic, like that of many other firms. This is especially true of the characteristics of service, such as speed.
While we would not expect to see significant variation from month to month in total aggregate demand (flow), we very well could expect significant variation in such outputs as speed of a shipment through the system. The effect of this on our model of costs is very significant. Let $f(x,\omega)$ be a stochastic production function with random variable $\omega$. Thus the output Z:

$Z = f(x,\omega)$    (7)

is now a random variable. Let the distribution of $\omega$ be $G(\omega)$ with density function $g(\omega)$. In [16] it is shown that the profit-maximizing firm will, implicitly, solve the following cost minimization problem (see section B.2):

(CMP)  $\min\; p'x$  s.t.  $E(f(x,\omega)) = u$

where p is given, u is expected output and the expectation is taken with respect to G on its domain. Thus, the cost function is $C(E(Z),p)$ where the expectation here is over the distribution on Z (which can be derived from $f(x,\omega)$ and G). If, on the other hand, we had simply formed the cost function on outputs and prices we would have

$E(C(Z,p))$.

It can be shown [16] that these two cost functions are consistent (for unknown G) if and only if $f(x,\omega)$ is homogeneous of degree one in x (see Appendix B, section B.1.2.1). Since we do not wish to make this an implicit assumption in the analysis, we do not want to use Z as a variable in our cost function. Rather, we should use a model that predicts the expected value of Z for the observed values of the non-random variables. This is what an engineering process model can give us. Hence, for those stochastic variables in the output vector $(y,\pi)$ we will use a process model to provide the expected value of the variable. Thus engineering models, in fact, provide more useful information than the raw observations themselves.

In general, then, we see that the role of engineering models is two-fold:

1. They provide useful information on physical relationships between the model variables, thereby properly restricting the model of production.

2.
They provide the proper variables for inclusion in cost models, especially when some of the output variables are stochastic.

By using engineering process models to define variables in the cost model we implicitly refine the technology to which the cost model is dual. Therefore the result is a more efficient estimate of a better specified cost model.

This study has concentrated on speed of a shipment through the system as the engineering input; thus the $\pi$ vector has one variable. Speed is a stochastic variable since time of day, of month, season in the year, etc. all contribute to significant variation in the time it takes a shipment to pass through the yards and over the appropriate segments of the line-haul. Therefore the engineering work will concentrate on process models that relate speed to some of the inputs and other outputs.

What might we expect to see for a fixed configuration? At low density there is little relationship between speed and short-run variable costs (due to union work rules, only radical changes in speed over reasonably long distances would affect costs). At high density, congestion effects act to reduce speed and increase costs. Therefore, one would expect to see a negative relationship between speed and short-run variable costs. Clearly, changes in configuration change the density level at which this occurs. These engineering relationships illustrate the importance of speed. With this in mind, we turn next to formulating the process models for the hybrid cost function.

2.4. Engineering Process Models of Linehaul Train Movement and Yard Operations

2.4.1. Introduction

A basic premise of this research project is that models representing the basic engineering processes involved in railroad operations can contribute significantly to the estimation of rail cost functions. The integration of these process models with econometric estimation of cost functions results in what may be termed a "hybrid" model.
This section describes in some detail the set of process models used in the project. The models cover three important areas of railroad operations:

1) line-haul movement of trains;
2) line-haul delays due to interactions among trains;
3) classification yard operations.

The models presented here draw heavily on the work of previous researchers. The major contribution of the current project is synthesis of these component models into a workable system for use in cost function estimation. The types of models involved, and their interactions, are illustrated schematically in Figures 6 and 7. The chapter is organized into three major parts, describing the three models separately.

2.4.2. Line-Haul Train Movement

The line-haul train movement model developed for this project is based on six fundamental characteristics to ensure flexibility in application. These characteristics are as follows:

1. The basic form of the model is derived from physical relationships, with corrections for technical conditions separated from the primary function. This allows adjustment for further technical change without modification of the basic form.
2. The model is applicable over a wide range of train types and physical line configurations.
3. It uses a minimum number of variables.
4. The model is designed to use variables defined so as to be commensurate with expected data available (e.g. trailing load, rather than individual car weight statistics).
5. The variables can be readily related to specific cost data.
6. It provides a base which is adaptable to experimentation with differing types of operational and investment policies.

[Figure 6. LINE-HAUL MODELS]

[Figure 7. CLASSIFICATION YARD MODELS]

The first requirement of the model is to establish a relationship among distance, time, weight of the train, and horsepower provided.
Obviously time and distance can be further reduced in part to the velocity over each track segment. The analysis is based on establishing the physical relationship at a constant velocity. For simplicity, acceleration and deceleration will be ignored. Thus

$F = m \frac{dv}{dt} = 0$    (8)

where:
F = force exerted by the train system
m = mass
dv/dt = acceleration

There are two basic forces at work on the train: the tractive force exerted by the locomotive and the resistance of the train mass acting in the opposite direction. Thus the basic balance equation is:

$F = F_t - F_R = 0$    (9)

where:
$F_t$ = tractive force of the locomotive
$F_R$ = resistance force of the train.

Tractive force of a locomotive is a function of its horsepower and the velocity at which it is operated. Locomotive manufacturers publish graphs of velocity versus tractive effort for all of their locomotive types. (See, for example, [26].) These curves are essentially rectangular hyperbolas within the velocity range of interest (10 mph - 65 mph). Exceptions to this rule have the form shown in Figure 8 and fit the normal form to the right of point a.

[Figure 8 - TRACTIVE EFFORT AS A FUNCTION OF VELOCITY]

The value of a is usually no higher than 15 mph. For all locomotives $F_t$ can be defined as follows:

$F_t = 375\, HP\, (e/v)$    (10)

where:
HP = locomotive horsepower
e = machine efficiency (.825 for most North American applications)
v = velocity (miles per hour)
$F_t$ = tractive force (pounds)

This equation describes the relationship between $F_t$ and v indicated by manufacturer data to within about 2% for both types of curves. This indicates that curves of the type shown in Figure 8 reflect distortions in the lower velocity ranges rather than basic functional changes. Thus, the form of $F_t$ shown will be used for all locomotives operating in line-haul service.

The resistive force, $F_R$, may be broken into two components: $F_R^{\ell}$, the resistance to motion by the locomotive, and $F_R^{c}$, the resistance to motion by the cars.
The equation used to represent both of these quantities is the time-honored Davis Formula. A complete discussion of its component parts may be found in references [17], [18], and [43]. For the present purposes, it is sufficient to note that it is based on considerable empirical evidence gathered over the last half century, and it has been shown to provide consistently useful results. It is based on a resultant resistance determined by summing the components of journal, flange, air, track, and grade resistance with appropriate constants. The form used in this work is that presented in Hennes and Ekse [45], which is taken from Davis's earlier work [18]:

$R = \frac{9.4}{\sqrt{w}} + \frac{12.5}{w} + JV + \frac{KAV^2}{wn} + 20s$    (11)

where:
R = resistance to movement, in pounds per ton of train weight
w = average weight per axle, in tons
V = speed, in miles per hour
n = number of axles per item of equipment
J, K = constants (for locomotives: J = 0.03, K = 0.0024; for cars: J = 0.045, K = 0.0005)
A = cross-sectional area, in ft$^2$ (for locomotives: A = 120; for cars: A = 87.5)
s = grade encountered, in %.

In the case of locomotives, a value of 32 can be used for w. A check of mainline locomotive weights since the introduction of the General Motors FT diesel in 1939 shows that axle loadings have remained remarkably consistent since that time [11]. While some roads own heavy duty locomotives with axle loads as high as 35 tons and a few older units are as light as 30 tons, this variation is of little consequence (less than 1% in most cases) when viewed against the total tractive effort exerted by the locomotive. The 32 ton weight is also highly representative of the major builders' standard models as supplied to railroads not requiring extensive extra features.

The number of axles, n, is the total for the locomotive, not the number on each unit. Likewise, in the horsepower-tractive effort equation the horsepower is the total for a locomotive, not that of individual units.
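Equations (10) and (11) can be evaluated directly. The following is a minimal sketch, assuming the Davis formula takes the form R = 9.4/sqrt(w) + 12.5/w + JV + KAV^2/(wn) + 20s, which is the form consistent with the locomotive and car constants quoted in this section. The function names are illustrative and are not part of the report.

```python
import math

def tractive_force(horsepower, velocity_mph, efficiency=0.825):
    """Equation (10): tractive force in pounds for total locomotive horsepower."""
    return 375.0 * horsepower * efficiency / velocity_mph

def davis_resistance_per_ton(w, n, velocity_mph, grade_pct, J, K, A):
    """Resistance in pounds per ton, in the form assumed here for equation (11).

    w: average weight per axle (tons); n: number of axles;
    grade_pct: grade in percent; J, K, A: equipment constants.
    """
    v = velocity_mph
    return (9.4 / math.sqrt(w) + 12.5 / w + J * v
            + K * A * v**2 / (w * n) + 20.0 * grade_pct)

def locomotive_resistance(n_axles, velocity_mph, grade_pct, w=32.0):
    """Total locomotive resistance in pounds: per-ton resistance times weight w*n."""
    per_ton = davis_resistance_per_ton(w, n_axles, velocity_mph, grade_pct,
                                       J=0.03, K=0.0024, A=120.0)
    return per_ton * w * n_axles

# Example: 8200 hp on four axles at 20 mph on a 1.0% grade.
print(tractive_force(8200.0, 20.0))
print(locomotive_resistance(4, 20.0, 1.0))
```

The difference between the two printed values is the net tractive effort available to move the cars.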
Thus, after substitution of constants and performing the implied multiplications, the total resistance for locomotives is as given in equation (12):

$F_R^{\ell} = \left[ 2.05 + .03V + \frac{.009V^2}{n} + 20s \right] 32n = 65.60n + .96Vn + .29V^2 + 640sn$    (12)

The car resistance function is a bit more complex. The additional problems are caused by the inability to find a constant value for w. However, it is possible to designate two constant w's (one for loads, one for empties) for a particular railroad's circumstances. This can be shown in the following way.

Examination of the resistance function indicates that three of the five terms are inversely related to the weight of the equipment. Thus an empty car will have a higher resistance per ton than will one which is loaded. This reveals the physical reason behind a phenomenon which is well known in railroading: a given locomotive can pull a heavier train if it is composed primarily of loaded rather than empty cars. This is reflected in the policy which a railroad adopts in assigning tonnage ratings to its locomotives. If it rates its locomotives for the 80 to 100 ton cars which are becoming prevalent, or for a higher percentage of loaded rather than empty cars, then it will assign relatively high tonnages. On the other hand, if it bases its ratings on 50 to 70 ton cars, or a low percentage of loads, then lower capabilities will be assumed. In many cases the lighter cars are assumed, as the more stringent conditions represent a more conservative view and also allow for other factors, such as adverse weather conditions, recovery of lost time when delays occur on the road and the fact that many cars are not loaded to capacity [26]. In view of this, locomotive tonnage tables published by General Motors for its locomotives [26] are in terms of 50-ton cars with an assumed light weight of 20 tons and 50% empty cars.

Thus the general form of car resistance is (assuming n = 4 in all cases):

$F_R^{ec} = \left[ \frac{9.4}{w_e^{1/2}} + \frac{12.5}{w_e} + .045V + \frac{.0109V^2}{w_e} + 20s \right] 4w_e = 37.6w_e^{1/2} + 50 + .18Vw_e + .0436V^2 + 80w_e s$    (13)

and

$F_R^{fc} = 37.6w_f^{1/2} + 50 + .18Vw_f + .0436V^2 + 80w_f s$    (14)

for empty and loaded cars respectively, where $w_e$ and $w_f$ are the appropriate axle weights. If the conservative assumptions outlined above are used, these two equations become:

$F_R^{ec} = 149.22 + 1.26V + .0436V^2 + 560s$    (15)

$F_R^{fc} = 216.04 + 3.51V + .0436V^2 + 1560s$    (16)

As noted earlier, the effects of loaded cars should be considered. On most railroads, tonnage ratings are based on the assumption that 50% of the cars in a train will be empty. This even split makes it possible to consider resistance in terms of the sum of the resistance of one loaded and one empty car taken together. This quantity ($F_R^{ec} + F_R^{fc}$) will be denoted a "car unit." Thus the trailing load for a particular locomotive can be expressed as the net tractive effort (total tractive effort minus locomotive resistance) divided by the resistance of a car unit, with the resultant multiplied by the weight of a car unit. Obviously, the proportion of loads to empties may be changed by appropriately weighting the two resistance and weight quantities of the car unit. Thus, trailing load is:

$TL = \left[ \frac{F_t - F_R^{\ell}}{F_R^{ec} + F_R^{fc}} \right] 4(w_e + w_f)$ .    (17)

When expanded this becomes:

$TL = \left[ \frac{309HP/V - (65.6n + .96Vn + .29V^2 + 640sn)}{37.6(w_e^{1/2} + w_f^{1/2}) + .18V(w_e + w_f) + .087V^2 + 80s(w_e + w_f) + 100} \right] 4(w_e + w_f)$    (18)

If standard values of $w_e = 7$ and $w_f = 19.5$ are assumed, as outlined previously, equation (18) becomes:

$TL = 106 \left[ \frac{309HP/V - (65.6n + .96Vn + .29V^2 + 640sn)}{364.81 + 4.77V + .087V^2 + 2120s} \right]$    (19)

Other values of $w_e$ and $w_f$ may be substituted depending on the policy on which tonnage ratings are based. This function is stated in terms of only five quantities: velocity, locomotive horsepower, number of locomotive axles, tonnage trailing the locomotive, and gradient. All of these items represent readily available data for any particular railroad.
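The trailing-load relationship can be written as a small function. This is a sketch under stated assumptions: it follows the general form of equation (18) with the V^2 coefficient in the denominator taken as .087, the value that appears in equation (19), and with the default axle weights set to the standard values 7 and 19.5. The function and parameter names are illustrative only.

```python
import math

def trailing_load(horsepower, velocity, axles, grade_pct,
                  w_empty=7.0, w_loaded=19.5):
    """Trailing load in tons for a 50/50 mix of empty and loaded four-axle cars.

    horsepower: total locomotive horsepower
    velocity:   speed in miles per hour
    axles:      total number of locomotive axles
    grade_pct:  grade in percent
    """
    v, s = velocity, grade_pct
    # Net tractive effort: tractive force minus locomotive resistance (pounds).
    net_tractive = (309.0 * horsepower / v
                    - (65.6 * axles + 0.96 * v * axles + 0.29 * v**2
                       + 640.0 * s * axles))
    w_sum = w_empty + w_loaded
    # Resistance of one "car unit": one empty plus one loaded car (pounds).
    car_unit_resistance = (37.6 * (math.sqrt(w_empty) + math.sqrt(w_loaded))
                           + 0.18 * v * w_sum + 0.087 * v**2
                           + 80.0 * s * w_sum + 100.0)
    # Weight of one car unit is 4 axles times the two axle weights (tons).
    return net_tractive / car_unit_resistance * 4.0 * w_sum

# Example: 8200 hp, four axles, 20 mph on a 1.0% grade.
print(round(trailing_load(8200.0, 20.0, 4, 1.0)))
```

Passing other values for `w_empty` and `w_loaded` corresponds to basing the tonnage rating on a different car-weight policy.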
2.4.3. Application of the Model to Determine Running Time

To determine the running time for a given train between two yards, equation (19) would be applied in two stages. First, given the trailing load and the most restrictive conditions the train will encounter (usually termed the ruling grade), this relationship can be used to determine the horsepower which must be assigned to the train. Then, given the horsepower and trailing load, the resulting velocity for any other track segment can be determined. Once this has been done for all segments, the total running time for the entire route can be computed. Details of this procedure are illustrated by the following example.

Consider a line whose profile is as shown in Figure 9. The total line is 50 miles long, and can be divided into four segments. The first segment is 20 miles long with no grade; the second segment is the ruling grade, with a 1.0% grade over a 5 mile section; the third segment is 10 miles long, with a 0.5% grade; and the last section is 15 miles of level track.

Assume that the trailing load is 5000 tons, and that we wish to maintain a velocity of 20 miles per hour on the ruling grade. Let us also assume that the locomotives available are of the four-axle variety.

The first step is to find the required horsepower for the train. Equation (19) can be rewritten as follows to solve for the value of HP:

$HP = [(.011 + .065s)V + .00015V^2 + (2.66 \times 10^{-6})V^3]\,TL + [(.21 + 2.07s)V + .003V^2]\,n + .0009V^3$ .    (20)

Inserting values of s = 1.0, V = 20, TL = 5000 and n = 4, we obtain the required horsepower to be

HP = 8200

In any given situation, we may not be able to assign exactly this amount of horsepower to the train. The actual assignment would reflect the numbers and sizes of locomotives available. For example, if the available locomotives are all 3000 horsepower units, the assignment would be three of these units, with a total of 9000 horsepower.
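The two-stage procedure outlined at the start of this section can be sketched in a few lines. The sketch assumes the horsepower relation of equation (20) with its first coefficient taken as 0.011 (the value that follows from dividing equation (19) through by 106 x 309), and it solves for velocity by bisection rather than by a closed-form cubic formula, which is a substitution made here for simplicity. Function names and the profile list are illustrative. The velocities it returns for the non-ruling segments depend on these coefficients and are not guaranteed to match the rounded values quoted in the worked example.

```python
def required_horsepower(trailing_load, velocity, grade_pct, axles):
    """Equation (20): horsepower needed to hold `velocity` (mph) on a grade (%)."""
    v, s = velocity, grade_pct
    car_terms = ((0.011 + 0.065 * s) * v + 0.00015 * v**2
                 + 2.66e-6 * v**3) * trailing_load
    loco_terms = ((0.21 + 2.07 * s) * v + 0.003 * v**2) * axles + 0.0009 * v**3
    return car_terms + loco_terms

def attainable_velocity(trailing_load, horsepower, grade_pct, axles,
                        v_upper=200.0):
    """Velocity at which required horsepower equals the assigned horsepower.

    Uses bisection; valid for non-negative grades, where required
    horsepower increases with velocity.
    """
    lo, hi = 0.0, v_upper
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if required_horsepower(trailing_load, mid, grade_pct, axles) < horsepower:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def running_time(segments, trailing_load, horsepower, axles, speed_limit):
    """Total hours over (length_miles, grade_pct) segments, capped by a speed limit."""
    total = 0.0
    for length_mi, grade_pct in segments:
        v_tl = attainable_velocity(trailing_load, horsepower, grade_pct, axles)
        total += length_mi / min(v_tl, speed_limit)
    return total

# Stage 1: size the power for 5000 tons at 20 mph on the 1.0% ruling grade.
hp = required_horsepower(5000.0, 20.0, 1.0, 4)
# Stage 2: velocities and running time over the four-segment profile.
profile = [(20.0, 0.0), (5.0, 1.0), (10.0, 0.5), (15.0, 0.0)]
hours = running_time(profile, 5000.0, hp, 4, speed_limit=60.0)
print(round(hp), round(hours, 2))
```

Replacing `hp` with the horsepower of the units actually available (for example 9000) shows how a lumpy power assignment changes the predicted running time.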
For the purposes of this example application of the model, we will assume that exactly 8200 horsepower are assigned. Thus, we know that the velocity on segment BC will be 20 miles per hour.

The second stage of the analysis involves finding the velocities on the remaining track sections. For this purpose equation (19) can be written as a cubic equation in V, as follows:

$[(2.66 \times 10^{-6})\,TL + .0009]V^3 + (.003n + .00015\,TL)V^2 + [(.011 + .065s)\,TL + (.21 + 2.07s)n]V - HP = 0$    (21)

The roots of this equation can be found using standard formulae. For the values TL = 5000, n = 4, s = 0, HP = 8200, this equation has one real root and two complex conjugate roots. The real root is approximately V = 74. Thus, on the level track segments, the predicted attainable velocity is 74 miles per hour. In practice, however, due to gearing, the locomotives may not actually be able to run this fast. In addition, there may be speed restrictions on the line, so that the expected velocity of the moving train would be given by:

$V_a = \min[V_t, V_g, V_{TL}]$    (22)

where:
$V_a$ = expected velocity
$V_t$ = speed restriction on line imposed by timetable
$V_g$ = maximum achievable locomotive speed
$V_{TL}$ = speed attainable with given trailing load.

For the purposes of this example, let us assume a speed limit of 60 miles per hour along the line, so $V_a = 60$ for the level segments, AB and DE.

For the segment CD, we insert values s = 0.5, TL = 5000, n = 4, HP = 8200 into equation (21). Once again, there is one real root and two complex roots. The real root is approximately V = 50.

Table 1 summarizes the results of the computations, and indicates the overall predicted running time.

Table 1. Summary results for example.

Segment   Length (mi.)   V_TL (mph)   V_a (mph)   Running Time (hrs.)
AB        20             74           60          0.33
BC        5              20           20          0.25
CD        10             50           50          0.20
DE        15             74           60          0.25
TOTAL     50                                      1.03

2.4.4. Modeling Delays Enroute

Total origin-destination time for a train is not generally composed of running time alone.
In fact, there is always some pre-departure time at the origin yard and some post-arrival time at the destination which must be recognized. However, in addition, there are often delays enroute due to switching and/or interactions (meets or passes) between trains.

Trains are often delayed enroute due to passing or being passed by other trains going in the same direction, or, on single track mainline, meeting trains going in the opposite direction. Detailed simulation models are often used by railroads to evaluate train congestion. However, for the purposes of this project, it is desirable to have a simpler, analytic model which can be incorporated more readily into the specification of a production function for cost estimation. The model proposed here draws heavily on work done by Petersen [66]. It is also similar to methods of analysis for low density highway traffic with passing delays (see, for example, Haight [39]).

The situation to be considered is illustrated in Figure 10, showing two yards, A and B, connected by a main line track segment which may be either single or double track.

[Figure 10. TWO YARDS CONNECTED BY MAINLINE.]

Classes of trains in each direction will be defined by different average running times (or speeds). In practice, of course, each train will have a somewhat different running time, as determined by the model in the previous section. Some aggregation will normally take place, since we are interested in identifying classes of trains. For example, we may have local freights at an average of 20 miles per hour, regular through freights averaging 40 miles per hour, and special high priority trains averaging 60 miles per hour. If we consider ourselves to be located at A, we will assume that there are K different inbound train (speed) classes, and L different outbound classes. We will also adopt the convention that outbound speeds are negative, for algebraic convenience.

Define $M_{ij}$
as the expected number of encounters (meets or overtakes) between a single train of class i and all trains of class j, on its trip between yards. If we assume that each encounter results in an average delay, $D_{ij}$, to train i, we can write the expression for average transit time, including delays, as follows:

$W_i = R_i + S_i + \sum_j D_{ij} M_{ij}$    (23)

where:
$W_i$ = average total transit time for train class i
$R_i$ = average running time for train class i
$S_i$ = average delays enroute from all other occurrences.

The values for $R_i$ are determined from the line-haul train movement model described in Section 2.4.2.

In order to utilize the model in equation (23), the expected number of encounters between trains, $M_{ij}$, must be expressed in terms of quantities available. These include traffic density of trains of different classes, their speeds, and dispatching policies through time. As an example of the derivation, let us consider the expected number of meets between trains going in opposite directions.

If an outbound train of speed i leaves at time t = 0, it will encounter inbound trains already on the line at t = 0, and those dispatched before the outbound train arrives at the other end, at time $t = W_i$. For inbound trains of speed j, this will include all trains dispatched between $t = -W_j$ and $t = W_i$.

If we assume, as Petersen did, that train departures from either end of the line are independent and uniformly distributed through time, the expected number of meets is

$M_{ij} = N_j (W_i + W_j)$    (24)

where $N_j$ = rate of dispatching of train class j (trains/unit time).

In a similar manner, we can derive the result that train i will overtake trains of a slower class, j, that depart between $t = -(W_j - W_i)$ and t = 0. Furthermore, train i will be overtaken by faster trains, j, which depart between t = 0 and $t = W_i - W_j$.
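Before overtakes are added, the meets-only case already shows why the transit times must be solved jointly: substituting (24) into (23) makes each class's transit time depend on the other's. The sketch below treats a single-track line with one outbound and one inbound class and solves the resulting pair of linear equations by Cramer's rule. All numerical inputs are hypothetical, and the function and parameter names are illustrative.

```python
def meet_only_transit_times(r_out, r_in, n_out, n_in, d_out, d_in,
                            s_out=0.0, s_in=0.0):
    """Solve equations (23) and (24) for one outbound and one inbound class.

    r_*: running times; n_*: dispatch rates (trains per unit time);
    d_*: average delay per meet to that class; s_*: other enroute delays.
    Returns (W_out, W_in).
    """
    a = d_out * n_in   # delay per unit of exposure time, outbound train
    b = d_in * n_out   # delay per unit of exposure time, inbound train
    # System:  (1 - a) W_out -       a W_in = r_out + s_out
    #             -b   W_out + (1 - b) W_in = r_in  + s_in
    det = (1.0 - a) * (1.0 - b) - a * b
    if det <= 0.0:
        raise ValueError("delays grow without bound at this traffic density")
    c_out = r_out + s_out
    c_in = r_in + s_in
    w_out = ((1.0 - b) * c_out + a * c_in) / det
    w_in = (b * c_out + (1.0 - a) * c_in) / det
    return w_out, w_in

# Hypothetical inputs: 2-hour runs, one train every 4 hours each way,
# 0.2 hour lost per meet.
w_out, w_in = meet_only_transit_times(2.0, 2.0, 0.25, 0.25, 0.2, 0.2)
print(round(w_out, 3), round(w_in, 3))
```

With more classes the same structure gives a larger linear system, which could be handed to any standard linear solver.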
If we then let I be the set of inbound train classes, $O_f$ the set of outbound train classes of higher speed than train i and $O_s$ the set of outbound train classes of lower speed than i, we can write equation (23) as

$W_i = R_i + S_i + \sum_{j \in I} D_{ij} N_j (W_i + W_j) + \sum_{j \in O_f} D_{ij} N_j (W_i - W_j) + \sum_{j \in O_s} D_{ij} N_j (W_j - W_i)$    (25)

Since we have an equation of this type for each speed class, both inbound and outbound, this defines a set of K+L simultaneous linear equations that can be solved for the K+L unknowns, $W_i$. Of course, if the line under study is double track the term for delay due to meets vanishes.

The model described above is essentially that developed by Petersen [66]. Several extensions of this basic model are possible. English [27] has made modifications so that it reflects operations on high density lines more accurately. These modifications account for multiple meets and delays induced by signal systems in very high density operations. For lower density operations typical of most lines, an extension is possible to account for the fact that trains are often not dispatched at random times with a constant mean interdispatch time throughout the day. Such a modification is described in the following section.

2.4.4.1 Extension to Other Dispatching Strategies

The model described above assumes that trains are dispatched independently, at random times with a constant rate throughout the day. The implication of this assumption is that times between successive trains will be exponentially distributed. Thus the probability density function is given by f(t):

$f(t) = \lambda e^{-\lambda t}, \quad t > 0$    (26)

where:
t = interdispatch time
$\lambda$ = average rate of dispatching (= 1/mean time between trains).

In many situations, however, train dispatches can be scheduled somewhat more regularly, and line-haul delays due to meets can be reduced. In such cases, the times between trains must be characterized by a more general probability distribution. A useful generalization is to characterize these times by the Erlang-k distribution.
By varying the value of k, a wide range of distributions can be represented. When k = 1, this distribution is equivalent to the exponential model. As k increases, the variance of the distribution decreases, reflecting more regular dispatches. In the limit, as $k \to \infty$, the distribution becomes a spike at a given value, reflecting constant times between dispatches of trains. A few members of this family are illustrated in Figure 11.

The general form of the probability density function for an Erlang-k random variable is:

$f(t) = \frac{\lambda (\lambda t)^{k-1} e^{-\lambda t}}{(k-1)!}$ .    (27)

If interdispatch times of train class j are distributed Erlang-k, we can derive the expected number of encounters of a train of class i with all trains of class j as follows. If we denote the number of dispatches in a given period as a random variable $Y_j$, the probability of observing X dispatches in a period of length W is:

$P(Y_j = X) = \sum_{i=0}^{k-1} \frac{e^{-\lambda W} (\lambda W)^{Xk+i}}{(Xk+i)!}$ .    (28)

This result arises from considering an Erlang-k random variable to be a sum of k exponential random variables with common parameter, $\lambda$. Thus, the probability of observing exactly X occurrences of the event described by the Erlang distribution is the probability of observing between kX and k(X+1)-1 fundamental exponentially distributed events. This is the probability represented by the sum in equation (28).

Equation (28) defines the probability mass function for the number of encounters of a single train i with all trains of class j. The expected number of these encounters is then

$E(Y_j) = \sum_{X=0}^{\infty} X\, P(Y_j = X)$ .    (29)

If we define a function p(z,t) as

$p(z,t) = \sum_{i=z}^{\infty} \frac{e^{-t} t^i}{i!}$    (30)

equation (28) can be rewritten as follows:

$P(Y_j = X) = p(Xk, \lambda W) - p[(X+1)k, \lambda W]$ .    (31)

Equation (29) then becomes

$E(Y_j) = \sum_{X=0}^{\infty} X\, p(Xk, \lambda W) - \sum_{X=0}^{\infty} X\, p[(X+1)k, \lambda W] = \sum_{X=1}^{\infty} p(Xk, \lambda W)$ .    (32)

Values of the function p(z,t) are tabled (see, for example, Molinas [61]). The value for $E(Y_j)$
can be substituted into equation (23) to replace the expression for $M_{ij}$ given in equation (24). The basic delay model thus becomes

$W_i = R_i + S_i + \sum_j D_{ij} \sum_{X=1}^{\infty} p[Xk, \lambda_j (W_i + W_j)]$ .    (33)

Equation (33) defines a set of K+L non-linear equations in the K+L unknowns, $W_i$. These must be solved using iterative solution techniques, but they can be used to provide a more general solution for line-haul delays.

2.4.5. Models of Yard Activity

According to data gathered by Reebie Associates [69], the average rail car spends only 16% of its time actually moving in trains. An additional 56% is spent in classification yards. This underscores the importance of representing classification yard activities if we are to reflect railroad operations with any reasonable degree of accuracy.

While in a railyard, a car undergoes four basic operations:

1) inbound inspection
2) classification
3) assembly into outbound train
4) outbound inspection.

It is quite natural to think of these as a series of queues through which the rail car passes. This perspective is adopted here.

Inbound and outbound inspections consume a relatively small amount of time for each car, and the amounts of time required are not highly variable. For these reasons, they are not analyzed in detail here. However, explicit queuing models have been constructed for the remaining elements: classification and assembly. Average time in the yard is predicted as the sum of delays due to classification and assembly, as shown in equation (34):

$T_y = T_c + T_a$    (34)

where:
$T_y$ = average time in yard
$T_c$ = average delay for classification
$T_a$ = average connection delay before assembly into outbound train.

2.4.5.1 Classification Delay

There are a number of different queuing models which could be suggested for the classification operation. The major previous work along these lines has been done by Petersen [67].
He suggests several possible models, including:

M/G/1: Poisson arrivals of cars on trains, a general service time distribution, and one server;
M/M/s: Poisson arrivals, exponential service times, and s servers;
M/D/s: Poisson arrivals, deterministic (constant) service times, and s servers.

It should be noted that Petersen considers the basic units of arrival to the system to be trains, not individual cars, and thus he derives parameters for service time to classify an entire inbound train. While this simplifies representation of some elements of the system, it leads to some confusion about the relationship of the output process of one queue to the input of another. For this reason, the models developed here are based on individual railcars. As a result, we should recognize the fact that individual railcars arrive in batches on trains. This fact dictates use of a more general batch arrival model, denoted as:

$M^{(X)}/G/1$: Poisson arrivals in batches of size X; arbitrary service times; and 1 server.

In this case, X is a random variable corresponding to train length. A solution for such a model, yielding average delay time, has been developed by Gaver [34]. A concise summary of the results is available in Saaty [71]. Average delay (time in queue plus service) is given as follows:

T c 2(1-p) + 1 / (35)

where
$\delta_1$ = average train length (cars)
$\delta_2$ = second moment of train length
$\rho = \lambda\delta_1/\mu$ = traffic intensity of system
$\lambda$ = arrival rate of trains (trains/hr.)
$\mu$ = average service rate (cars/hr.)
$\sigma^2$ = variance of service time distribution.

The distribution of service times for classifying cars depends greatly on the physical layout and operating plan of a particular yard. Probably the most important distinction is between hump yards and flat yards. In hump yards, the classification service is quite straightforward. A switch engine pushes the string of cars over the hump at essentially a constant speed (generally 1.5-2.0 miles per hour).
As each car reaches the hump crest, it is decoupled and rolls down into the classification bowl. The only variations in time per car are due to variations in length of individual cars. Such variations are relatively minor, and a deterministic service time distribution is an appropriate model. In this case, $\sigma^2 = 0$ in equation (35), and the model can be denoted $M^{(X)}/D/1$.

Models for flat yards are somewhat more complex, since the switching operations for classifying trains are not as simple as hump operation. An inbound train comprises a set of cuts (groups of cars with common origin and destination that move together through the yard) that are to be sorted into outbound "blocks" on classification tracks. Each cut will be switched as a unit, and if successive cuts are to be placed in the same block, one switching operation can handle multiple cuts. Thus, rigorous derivation of a service time distribution for individual cars would require incorporation of the distribution of number of cars per cut and the likelihood of successive cuts having common block designations, as well as the distribution of time to complete a particular classification switch.

Since good empirical data were not available on all of these characteristics, our approach has been somewhat less detailed. A sample of flat switching operations was observed in one yard, and the total time required and number of cars switched were recorded. For each of these observations an equivalent "minutes/car" value was then computed. Finally, a gamma distribution was fit to these values. The probability density function of a gamma distribution with parameters α and β is as follows:

$$f(x) = \frac{\beta^{\alpha} x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)} , \qquad 0 \le x < \infty . \tag{36}$$

The mean value of the gamma random variable, x, is α/β and the variance is α/β². Maximum likelihood estimates of α and β were computed as 1.3 and .28, respectively, using the observed data. This corresponds to a mean switching time of 4.6 minutes/car, and a variance of 16.6 (minutes/car)².
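The quoted mean and variance follow directly from the two parameter estimates. The short check below is only an illustrative sketch: the function name `gamma_moments` is hypothetical, and it assumes the rate parameterization stated in the text (mean α/β, variance α/β²).

```python
def gamma_moments(alpha: float, beta: float) -> tuple[float, float]:
    """Return (mean, variance) of a gamma variable with shape alpha and rate beta."""
    mean = alpha / beta            # mean = alpha / beta
    variance = alpha / beta ** 2   # variance = alpha / beta^2
    return mean, variance


# Maximum likelihood estimates reported in the text (time unit: minutes per car).
mean_minutes, variance = gamma_moments(alpha=1.3, beta=0.28)
print(f"mean = {mean_minutes:.1f} min/car, variance = {variance:.1f}")
```

Rounded to one decimal place, the two values agree with the 4.6 and 16.6 reported above.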
Previously reported estimates for average switch-engine-minutes/car have varied widely, depending upon the number of cars per cut and the degree of congestion present in the yard facility. For example, Wright [84] reported estimates of 3.2 minutes/car for single car cuts, but average values below 2 minutes/car for multiple car cuts. Martland and Rennicke [59] report average values from 3 to 10 minutes/car for different levels of workload in two yards on the Boston and Maine. Thus, it appears that our estimate is well within the range of plausible values.

Values for variance of the switching time have not been widely reported, so it is difficult to verify our estimates based on previous results. However, the wide range in previously reported average values tends to support the contention that the process is highly variable. Thus, our finding that the service time process is nearly exponential is not surprising (note: an exponential service distribution would have α = 1.0). In general, it appears that our estimates of parameters of the service time distribution for flat switching are quite consistent with earlier reported results.

As an example of the values produced by the model, data from one yard studied show an average arrival rate of 5.33 trains/day, with an average length of 45.2 cars and a second moment of length equal to 2876. Combined with the estimated service time parameters, these data result in an estimated utilization rate of .77. Substituting values into equation (35), we obtain a mean delay for classification of 8.2 hours.

Available data for the yard under study did not include detail on time spent waiting for classification, connection delay, etc. As a result, it is difficult to verify this model directly. However, the predicted delay of 8.2 hours is well within the range of observed data presented by Folk [28], Beckmann, et al [3], and Gentzel [35] for various terminal facilities.
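The utilization figure in the example can be retraced from the traffic-intensity definition, arrival rate times average train length divided by service rate. The sketch below is an illustration only: the function name `traffic_intensity` is hypothetical, and it assumes the mean service time is the gamma mean 1.3/.28 minutes per car estimated for flat switching.

```python
def traffic_intensity(trains_per_day: float,
                      mean_train_length: float,
                      mean_service_minutes: float) -> float:
    """Traffic intensity: (arrival rate) * (mean batch size) / (service rate)."""
    arrival_rate = trains_per_day / 24.0        # trains per hour
    service_rate = 60.0 / mean_service_minutes  # cars per hour
    return arrival_rate * mean_train_length / service_rate


# Yard example from the text: 5.33 trains/day, 45.2 cars/train,
# mean service time 1.3 / 0.28 minutes per car (assumed gamma mean).
rho = traffic_intensity(5.33, 45.2, 1.3 / 0.28)
print(f"rho = {rho:.3f}")
```

Under these assumptions the result is a little under 0.78, in line with the utilization of about .77 quoted in the example.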
Values for mean classification delay between 4.6 and 22.4 hours have been reported by these authors for different yards at different times.

2.4.5.2 Assembly into Outbound Trains

Once cars have been classified, they must wait for assembly and dispatch of the appropriate outbound train before they leave the yard. Operationally, we can think of this process as being one in which cars arrive on the classification tracks, either singly or in small groups (cuts), and wait for the designated outbound train to be "called." At this point, all the cars for this train are assembled, and when made up, the train departs.

In terms of a queuing model, we may think of this as a batch-service system in which the "server" is the outbound train. Service for a batch of cars begins when the appropriate outbound train is called for assembly, and the service time is the time between successive outbound trains on which a given cut of cars may be dispatched. The delay time for connection with the outbound train is then the waiting time in queue derived from such a queuing model.

It should be noted that this perspective on modeling the system places principal emphasis on the outbound train schedule as the source of delay for cars following classification. Delays in assembly due to insufficient numbers of switch engines and crews are not considered directly. This effect is only represented indirectly, in terms of late departures of outbound trains, for example. The emphasis on schedule is in keeping with the findings of several previous researchers, and has been recognized by a rail industry task force on reliability studies [29].

The average delay for a simple batch-service queue of this type can be derived easily. Let us assume that individual cars arrive randomly in time (i.e. as a Poisson process) from the classification operation, and that the outbound train takes all cars available at the time it is assembled.
This second assumption means that train length constraints on the outbound trains are ignored, for the time being. We will return to this issue following the basic derivation.

Define a probability density function, g(t), 0 ≤ t < ∞, which describes the distribution of time intervals between successive outbound trains for a given block of cars. If cars arrive randomly in time on the classification tracks, and the interval between two trains is a particular value, $t_0$, the average delay time will be

$$T_a(t_0) = \frac{t_0}{2} . \tag{37}$$

The expected number of cars arriving during any interval is proportional to the length of that interval. That is, the expected number of cars arriving in an interval of length $t_0$ is

$$n(t_0) = k\,t_0 \tag{38}$$

where k is the arrival rate. The total expected delay time for all cars in an interval of length $t_0$ will then be

$$S(t_0) = n(t_0) \cdot T_a(t_0) . \tag{39}$$

The unconditional total expected wait time may be obtained by integrating over the density function, g(t):

$$s = \int_0^{\infty} S(t)\,g(t)\,dt . \tag{40}$$

In like fashion, the unconditional expected number of cars in an interval is

$$n = \int_0^{\infty} n(t)\,g(t)\,dt . \tag{41}$$

Finally, the unconditional expected delay for cars is simply

$$T_a = \frac{s}{n} = \frac{\frac{k}{2}\int_0^{\infty} t^2 g(t)\,dt}{k\int_0^{\infty} t\,g(t)\,dt} = \frac{E(t^2)}{2E(t)} . \tag{42}$$

If desired, equation (42) may be rewritten as

$$T_a = \frac{E(t)}{2} + \frac{\sigma_t^2}{2E(t)} \tag{43}$$

where $\sigma_t^2$ is the variance in the time interval between successive departures. Note that if departures are completely regular ($\sigma_t^2 = 0$), the second term vanishes, and the expected delay is one-half the interval between trains (e.g., 12 hours for trains dispatched once per day). On the other hand, if dispatches occur very irregularly, the second term indicates that expected delay to cars will increase.

Equation (42) is analogous to a result widely used in studies of urban mass transit systems, expressing the mean waiting time of passengers at a bus stop.
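The two forms of the connection-delay result, equations (42) and (43), can be compared numerically on a sample of departure intervals. The sketch below is illustrative only: the function names and the interval values are hypothetical, and the sample is treated as the full population of intervals (so the population variance is used).

```python
from statistics import mean, pvariance


def delay_from_moments(intervals):
    """Equation (42): expected delay = E(t^2) / (2 E(t))."""
    second_moment = mean([t * t for t in intervals])
    return second_moment / (2.0 * mean(intervals))


def delay_from_mean_and_variance(intervals):
    """Equation (43): expected delay = E(t)/2 + variance / (2 E(t))."""
    m = mean(intervals)
    return m / 2.0 + pvariance(intervals) / (2.0 * m)


# Hypothetical intervals between outbound trains, in hours.
regular = [24.0, 24.0, 24.0]   # one train per day, perfectly regular
irregular = [12.0, 36.0]       # same mean interval, but irregular

print(delay_from_moments(regular), delay_from_mean_and_variance(regular))
print(delay_from_moments(irregular), delay_from_mean_and_variance(irregular))
```

With the regular schedule both forms give one-half of the 24-hour interval; with the irregular schedule the same mean interval produces a longer expected delay, which is the effect of the variance term.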
Derivations of the result in the mass transit context can be found in Welding [82], Osuna and Newell [63] or Kulash [51].

The derivation of equation (42) assumed that outbound train length is unlimited, or in queuing terms, that batch size is infinite. In practical terms, this assumption is not really true, since there are limits to the length of train which can be dispatched. Such limits can be the result of mainline track configuration, power availability, etc. More sophisticated batch-service queuing models can be constructed to reflect these constraints, but for batch sizes in excess of 25-30, the numerical results are essentially the same as for infinite batch size. (See Petersen [65].) Since train length constraints would typically be well in excess of these values, use of a simpler, infinite-batch-size model is appropriate.

Using the queuing models for classification delay and connection/assembly delay described in the previous two sections, we can predict total delay in the yard by simply summing the results from equations (35) and (42), as indicated in equation (34).

2.4.6. Estimation of Average Shipment Velocity

Together, the line-haul train movement and classification yard models provide the means for estimating the average speed of a shipment through the system. This value will be used as a single index of service quality in the cost model to be estimated.

From the running time and delay models described in sections 2.4.2 and 2.4.4, average transit time for trains over each mainline track segment can be computed. Since we know the length of each segment, this can be converted to an average velocity. By aggregating over track segments, an overall average velocity of trains is determined. Let us denote this velocity $V_a$.

The classification yard model predicts total delay (in hours) to cars passing through a yard. We have denoted this delay by $T_y$, as shown in equation (34).
Since there is effectively no distance involved in this segment of the trip, however, this time value is not directly expressible as a velocity.

To obtain an overall velocity figure, we require one additional piece of information, the average length of haul. We can then obtain average velocity (miles per hour) by dividing average length of haul (miles) by average total time in system (hours moving in trains and waiting in yard). If we denote average length of haul by L, overall average velocity, V, is computed as shown in equation (44):

$$V = \frac{L}{L/V_a + T_y} . \tag{44}$$

Equation (44) reflects two major simplifying assumptions which are justified by the uncomplicated nature of operations on the railroad under study. The first of these is that each shipment passes through one classification yard. This is essentially accurate for the system examined in the case study, but would require modification for more complicated rail operations. Secondly, in equation (44) average time spent in trains is computed as $L/V_a$, rather than by observing which line segments would be crossed by a given shipment, summing those transit times, and then computing a weighted average based on relative frequency of various origin-destination pairs. Again, the simpler computation used in equation (44) is a reflection of the simplicity of the rail network under study and the relatively limited set of origin-destination pairs. For this case, the simpler computation is quite adequate, but it would have to be modified in a more complex setting.

CHAPTER 3
DESCRIPTION OF THE DATA FOR RAILROAD X

3.1. General Overview of Data Needs

The analysis of the previous chapter provides the following list of data items needed by our model:

1) Flows of various commodity types - $(y_1, \ldots, y_m)$;
2) Prices of input factors - $(p^v_1, p^v_2, \ldots)$;
3) Levels of fixed factors - $(x^f_1, \ldots, x^f_n)$;
4) Levels of variable factors - $(x^v_1, x^v_2, \ldots)$;
5) Engineering Data;
6) Cost - C.
Each of these areas will be addressed in turn. To provide a specific cost function for the discussion, chapter four will present a translog model of the cost function $C(Y, S, P_1, P_2, P_3, P_4, P_5; QK)$ where

1) Y is total flow (loaded car-miles);
2) S is speed;
3) $P_i$ is the price of cars, fuel, crews, locomotives, and non-crew labor respectively;
4) QK is a quality of plant representing the fixed factors.

Appendix A presents the estimation results for the translog model of $C(Y_1, Y_2, Y_3, S, P_1, P_2, P_3, \ldots; QK)$ where the $Y_i$ are subaggregates with respect to commodity type. Observations for both models are monthly observations from 1969 through 1977 (108 observations).

3.2. Flow

Historically, ton-miles has been used as the measure of output of a transport firm. This measure suffers in many ways:

1. Shippers do not often buy tons of a commodity moved; they usually buy in car-loads, which can vary in weight by the type of commodity.
2. A ton-mile can be misleading: is 100 ton-miles the movement of 100 tons over 1 mile or the movement of 1 ton over 100 miles or something in between? These outputs are clearly not the same.
3. The cost in providing service includes the movement of empty cars to be repositioned so as to be available to make a revenue-generating move. Thus empty car movements are an intermediate product and not a final output of the firm.

Flow data from the firm was of two sorts:

1. Monthly listings by type-of-move (see below) by seven-digit STCC (Standard Transportation Commodity Code) of total loaded cars moved and total tons, from 1969 to 1977.
2. Two years (1972, 1977) of records by type of move (see below) of every move made on the system: origination, destination, commodity. We will refer to these as distance profiles.

Because shippers basically buy loaded cars rather than tons, we decided to use this measure. Originally we planned to have as disaggregate a model as possible, allowing for flows by line-haul segment and direction.
This became impossible to compute and counter-productive to the basic goal of producing an integrated model. Since, in a translog model, log(ab) = log a + log b, then if we included miles with a move we would implicitly have: log(loaded cars) + log(miles) = log(loaded car-miles). Therefore we proceeded to use loaded car-miles as the output measure.

The construction of the loaded car-miles was based on using the distance profile information to get an average distance traveled by commodity and by type-of-move. There are four types of move: local (L), forwarded (F), received (R), and intermediate (I). They are defined as follows:

L: origin on-line, destination on-line
F: origin on-line, destination off-line
R: origin off-line, destination on-line
I: origin off-line, destination off-line.

The model presented in the next chapter uses total loaded car-miles to represent flow; a model using one possible disaggregation of flow is presented in Appendix A. Many such representations of flow are possible allowing not only for disaggregate commodity types but also for distinctions such as unit train, etc. Issues of this type will be investigated in the second phase of the work (see section 6.2).

3.3. Prices of Variable Factors

Prices were constructed for the following factors:

1) Cars;
2) Rail;
3) Ties;
4) Fuel;
5) Locomotives;
6) Train Crews; and
7) Non-Crew Labor.

These will be discussed in turn below. It should be remembered that prices are the marginal cost of another unit of the factor in question. As such they should be constructed from national or regional market data. Since most such data for capital items (cars, rail, ties, locomotives) is cost of purchase, we used the interest rate for the firm, which was provided by the firm's main bank. The rate, while evidencing some fluctuation, appears to have been reasonably stable during the period of estimation (1969-1977) given the nature of the economy.
Thus, for capital items the following price construction was used to provide monthly prices:

$$P_{it} = (r_t + \delta_i)\,\text{Unit Cost}_{i,t} \,/\, 12$$

where $r_t$ is the nominal interest rate for year t, $\delta_i$ is a depreciation rate, and Unit Cost$_{i,t}$ is the unit cost of factor i in year t (when data was found indicating changes in costs during the year, the change was used to split the year and associate unit costs with months). Of course, this leads to some uniformity of some of the capital prices during a year.

3.3.1. Price of Cars

Lease information on cars from car-leasing concerns was unavailable for most of the period studied. The firm's car leases were used to construct yearly profiles of numbers and types of cars. Average costs [83] of new freight-train cars installed in the years 1959-1978 by type of car were used in conjunction with the yearly car profiles to construct a unit-cost for a representative car, i.e. a car representing the mix of existing stock at each point in time. A depreciation rate of six percent [78] was used.

3.3.2. Price of Rail and Ties

Data from the Association of American Railroads on per-ton rail costs and per-tie tie costs was used. Turnover rates on the railroad under study established depreciation rates of .02 and .03 respectively. It was found that the prices were almost perfectly correlated (.987) and thus the price of ties was dropped. Later analysis proved that the price of rail was correlated with the prices of cars and locomotives (both > .9). Dropping the price of rail removed some serious multicollinearity in the model in that almost all other variables had much lower correlations.

3.3.3. Price of Fuel

Firm records provide monthly purchases of fuel. Price was taken as amount paid over quantity purchased.

3.3.4. Locomotives

Firm records were again used to provide month-by-month profiles of types of locomotives and numbers in use. The firm has only recently started renting locomotives.
Data from ICC Transport Statistics in the U.S., Table 37 (23 in 1974) was used to provide average costs, by type of locomotive. Again, a composite locomotive unit cost was constructed based on the locomotive mix at each point in time. Missing data (the ICC has apparently stopped issuing the table) was filled by using the few leases the firm did have, which happened to be in the missing data years. The depreciation rate was again .06 [78].

3.3.5. Labor: Crew and Non-Crew Costs

Firm records on monthly payments to various categories of labor were used. Crews were taken as a unit and all other labor (executives on down) were taken as the unit "non-crew". Prices were computed by taking total payments (for all hours paid) and dividing by total hours actually worked. This is important since credit is often given for time not actually worked. Per hour supplemental wage payments were added into the wage rates to provide final wage rates (prices).

3.3.6. Deflation of Nominal Prices

The prices given above are nominal prices. They were deflated to 1969 by use of the AAR Charge-Out Indices [85]. Fuel was deflated by the fuel index, cars and locomotives by the materials index and labor by the labor index.

3.4. Fixed Factor Levels

The fixed factor is the system configuration. In the case of railroad X, a system configuration change occurred in 1976. The change was a simple one involving the addition of a stretch of track which had previously been used by another firm to ship goods onto railroad X's system. A number of measures of the system configuration are possible. Since railroad X consists mainly of line-haul, we elected to measure the system quality and configuration via a vector of four variables representing the number of miles in each of the Federal Railroad Administration's Track Classification categories. Clearly, for a given configuration, the elements of such a vector are correlated.
Thus the vector was converted into a scalar measure:

$$\text{Track Quality} = \frac{\text{miles in category four}}{\text{total miles}}$$

where category four represents the best quality. Because the FRA classifications have associated speed limits, and these speed limits affect the speed of a shipment over the system, this index of system quality is a very effective measure of the fixed factor (though clearly not the only one possible).

3.5. Data Used in the Engineering Analysis

The data used in estimation of the short-run cost functions for the railroad under study include information on the physical characteristics of the railroad's lines and yards and operating records of train and car movements.

3.5.1. Physical Characteristics of the Rail Plant

Detailed information on the nature of the rail plant is used in three specific places in the model. First, the track profile (grades) and speed limits have been used in computing running times for trains over various segments of mainline track, as described in section 2.4.2. Second, the number and locations of passing sidings are used in the calculation of delays on single-track lines, as described in section 2.4.4. All of this information on the physical nature of the plant was obtained from track charts and annotations supplied by the Vice President - Operations of the railroad studied.

3.5.2. Operating Records of Train Movement

Records of train movements were sampled for each month of the study period (January, 1969 - December, 1977) in order to estimate numbers of trains operated over each major line segment, and characteristics of those trains including number of cars, locomotive horsepower and total trailing load (in tons). The number of trains operated is an important input to the calculation of line-haul delays, as described in section 2.4.4. The horsepower and total trailing load are important determinants of line-haul train velocity and are also used in line-haul delay calculations.
The number of trains and their lengths (in cars) are important input for the classification yard delay models.

All of this information was obtained from dispatchers' records of train movements. These are generally large sheets in which all movements of trains on a given day over a particular section of track are noted. These sheets include a good deal of information in addition to the data we required, and constitute the most detailed records retained regarding train movements. Because the records comprise many individual entries made manually by dispatchers through the day, and because there is one sheet for each day's operation, extraction of the relevant information was a laborious, time-consuming process.

A sampling procedure was devised to allow extraction of a reasonable statistical sample of this data for each month of the study period. This sampling procedure involved taking detailed information on 8-12 trains per month. This detailed data included origin and destination of run, total horsepower, numbers of loaded and empty cars, trailing tons and delays encountered. Sample trains were selected so as to cover all directions of movement and various days of the week, in order to avoid obvious biases.

The sample data for each month was then aggregated to obtain characteristics of a "typical" train for that month over each line-haul segment. The horsepower and trailing load for this typical train were then used to compute average line-haul velocity for a shipment during that month.

In addition to this sampling of detailed train movement information, exhaustive samples of numbers and lengths of inbound trains to yard facilities for classification were recorded for 15 days per month. This provided a sufficient sample to estimate the arrival rate of trains and the first two moments of the train length distribution for each month. These three values were then used in the formula (35) to estimate classification delay in yard operations.
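The three monthly inputs named above (train arrival rate and the first two moments of train length) can be computed from a day-by-day record of inbound train lengths. The sketch below is an illustration under stated assumptions: the function name `arrival_statistics` and the sample data are hypothetical, and each inner list is taken to hold the lengths (in cars) of all inbound trains recorded on one sampled day.

```python
def arrival_statistics(lengths_by_day):
    """Return (trains per day, mean train length, second moment of train length)."""
    lengths = [n for day in lengths_by_day for n in day]
    trains_per_day = len(lengths) / len(lengths_by_day)   # arrival rate
    first_moment = sum(lengths) / len(lengths)            # average length (cars)
    second_moment = sum(n * n for n in lengths) / len(lengths)
    return trains_per_day, first_moment, second_moment


# Hypothetical record for three sampled days (not data from the study).
sample = [[40, 50], [45], [30, 60, 45]]
rate, mean_length, second_moment = arrival_statistics(sample)
print(rate, mean_length, round(second_moment, 1))
```

Note that the second moment is the average of squared lengths, not the variance; the variance would be the second moment minus the square of the mean length.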
The number and departure times of trains sampled from the train movement records were also used to construct estimates of the mean and variance of times between successive outbound trains. This is information needed to compute connection/assembly delay times in classification yards.

In summary, the bulk of the basic operating data for the engineering process models has come from two major sources. The first is track charts providing the physical characteristics of the mainline track segments. The second is dispatchers' records of train movement. These records include detail on train characteristics and movement from which input values for the line-haul and yard models can be derived.

3.6 Cost

Monthly records from 1969 through 1977 on operating costs were provided by the firm. These records are used as the basis for ICC submissions. In general, however, such costs do not include implicit capital costs on cars and locomotives, i.e. the costs did not reflect the economic costs of the two major short-run variable capital factors. An estimate of the missing costs was made by using the car and locomotive prices and levels. These costs were then added to the operating cost to provide short-run variable cost.

Data on credits, committed funds for leases, etc. were purposely excluded. In general, one would expect that the main part of property-related taxes would be assessed on the firm's plant (rather than equipment) and since this is fixed in our model, we do not include such taxes in the short-run variable costs. Income taxes are on profits, and thus are also excluded from short-run variable costs. This is because the cost function is homogeneous of degree one in prices (see section B.2.1). Thus, the short-run variable costs cover such items as maintenance, fuel, crews, cars, locomotives, staff, supplies, etc. Costs were deflated by using the AAR charge-out index (aggregate).
Autocorrelation analyses indicated that deflating the costs removed most of the autocorrelation present in the cost observations. A weak yearly autocorrelation persisted, but was small enough that we felt it could be ignored in the econometric estimation.

CHAPTER 4
THE MODEL TO BE ESTIMATED AND THE ESTIMATION RESULTS

4.1 The Model

The following model was estimated:

COST:

$$
\begin{aligned}
C ={}& \alpha_0 + \alpha_{10}\,\mathrm{PCAR} + \alpha_{20}\,\mathrm{PFUEL} + \alpha_{30}\,\mathrm{PCREW} + \alpha_{40}\,\mathrm{PLOCO} + \alpha_{50}\,\mathrm{PMNGT} \\
&+ \beta_{10}\,Y + \gamma_{10}\,S + \delta_{10}\,QK \\
&+ \tfrac{1}{2}\alpha_{11}(\mathrm{PCAR})^2 + \alpha_{12}\,\mathrm{PCAR}\cdot\mathrm{PFUEL} + \alpha_{13}\,\mathrm{PCAR}\cdot\mathrm{PCREW} + \alpha_{14}\,\mathrm{PCAR}\cdot\mathrm{PLOCO} + \alpha_{15}\,\mathrm{PCAR}\cdot\mathrm{PMNGT} \\
&+ \tfrac{1}{2}\alpha_{22}(\mathrm{PFUEL})^2 + \alpha_{23}\,\mathrm{PFUEL}\cdot\mathrm{PCREW} + \alpha_{24}\,\mathrm{PFUEL}\cdot\mathrm{PLOCO} + \alpha_{25}\,\mathrm{PFUEL}\cdot\mathrm{PMNGT} \\
&+ \tfrac{1}{2}\alpha_{33}(\mathrm{PCREW})^2 + \alpha_{34}\,\mathrm{PCREW}\cdot\mathrm{PLOCO} + \alpha_{35}\,\mathrm{PCREW}\cdot\mathrm{PMNGT} \\
&+ \tfrac{1}{2}\alpha_{44}(\mathrm{PLOCO})^2 + \alpha_{45}\,\mathrm{PLOCO}\cdot\mathrm{PMNGT} + \tfrac{1}{2}\alpha_{55}(\mathrm{PMNGT})^2 \\
&+ \tfrac{1}{2}\beta_{11}(Y)^2 + \tfrac{1}{2}\gamma_{11}(S)^2 + \mu_{11}\,Y\cdot S \\
&+ \theta_{11}\,\mathrm{PCAR}\cdot Y + \theta_{21}\,\mathrm{PFUEL}\cdot Y + \theta_{31}\,\mathrm{PCREW}\cdot Y + \theta_{41}\,\mathrm{PLOCO}\cdot Y + \theta_{51}\,\mathrm{PMNGT}\cdot Y \\
&+ \sigma_{11}\,\mathrm{PCAR}\cdot S + \sigma_{21}\,\mathrm{PFUEL}\cdot S + \sigma_{31}\,\mathrm{PCREW}\cdot S + \sigma_{41}\,\mathrm{PLOCO}\cdot S + \sigma_{51}\,\mathrm{PMNGT}\cdot S \\
&+ \delta_{11}(QK)^2 + \rho_{11}\,QK\cdot Y + \varepsilon_{11}\,QK\cdot S \\
&+ \eta_{11}\,\mathrm{PCAR}\cdot QK + \eta_{21}\,\mathrm{PFUEL}\cdot QK + \eta_{31}\,\mathrm{PCREW}\cdot QK + \eta_{41}\,\mathrm{PLOCO}\cdot QK + \eta_{51}\,\mathrm{PMNGT}\cdot QK
\end{aligned}
$$

where:

C = ln (Cost/Average Cost)
PCAR = ln (Price of Cars/Average Price of Cars)
PFUEL = ln (Price of Fuel/Average Price of Fuel)
PCREW = ln (Price of Crews/Average Price of Crews)
PLOCO = ln (Price of Locos/Average Price of Locos)
PMNGT = ln (Price of Non-crews/Average Price of Non-crews)
Y = ln (Loaded Car-miles/Average Loaded Car-miles)
S = ln (Speed/Average Speed)
QK = ln (FRA Category Four Percentage/Average FRA Category Four Percentage)

There are five prices, two outputs and one fixed factor, thus resulting in forty-five coefficients to be computed (see section B.2.4.1). To improve the efficiency of the estimation process we will append the following factor share equations:

FUEL:

$$\mathrm{XFUEL} = \alpha_{20} + \alpha_{12}\,\mathrm{PCAR} + \alpha_{22}\,\mathrm{PFUEL} + \alpha_{23}\,\mathrm{PCREW} + \alpha_{24}\,\mathrm{PLOCO} + \alpha_{25}\,\mathrm{PMNGT} + \theta_{21}\,Y + \sigma_{21}\,S + \eta_{21}\,QK$$

CREWS:

$$\mathrm{XCREW} = \alpha_{30} + \alpha_{13}\,\mathrm{PCAR} + \alpha_{23}\,\mathrm{PFUEL} + \alpha_{33}\,\mathrm{PCREW} + \alpha_{34}\,\mathrm{PLOCO} + \alpha_{35}\,\mathrm{PMNGT} + \theta_{31}\,Y + \sigma_{31}\,S + \eta_{31}\,QK$$

LOCOS:

$$\mathrm{XLOCO} = \alpha_{40} + \alpha_{14}\,\mathrm{PCAR} + \alpha_{24}\,\mathrm{PFUEL} + \alpha_{34}\,\mathrm{PCREW} + \alpha_{44}\,\mathrm{PLOCO} + \alpha_{45}\,\mathrm{PMNGT} + \theta_{41}\,Y + \sigma_{41}\,S + \eta_{41}\,QK$$

NON-CREWS:

$$\mathrm{XMNGT} = \alpha_{50} + \alpha_{15}\,\mathrm{PCAR} + \alpha_{25}\,\mathrm{PFUEL} + \alpha_{35}\,\mathrm{PCREW} + \alpha_{45}\,\mathrm{PLOCO} + \alpha_{55}\,\mathrm{PMNGT} + \theta_{51}\,Y + \sigma_{51}\,S + \eta_{51}\,QK$$

where:

XFUEL = Price of fuel × fuel purchased/cost
XCREW = Price of crew hour × crew hours purchased/cost
XLOCO = Price of loco hour × locomotive hours purchased/cost
XMNGT = Price of non-crew hour × non-crew hours purchased/cost

By purchased, we mean hours or amounts paid for. This is particularly important with respect to labor and locomotives since not all time paid for is used.

As will be observed from the cost function description, we have divided all variables by their means, i.e. an observation is divided by the mean of the observations before taking the logarithm. This is done mainly to protect the proprietary nature of the data. By so transforming the variables we only affect the intercept term, leaving the important coefficients undisturbed. This way, actual costs for the railroad under study are only predictable by those with a proprietary interest while cost relationships are open to perusal by all. In view of this, we will not be publishing the variable means since they add nothing to understanding the cost functions, and only reveal proprietary information.

4.2 Estimation Results

The above equations were estimated as a system of seemingly unrelated equations [79] where we assumed an additive error structure. Since the factor share equations are derived by differentiation of the cost function, the error term in the cost function does not appear in the factor share equations. As in [15] we assume the disturbances are joint normal and estimate the system using a maximum likelihood technique and thus the results are invariant to which factor share equation is dropped.
Table 2 provides the estimated cost function, while Tables 3, 4, 5, and 6 provide the estimated factor share equations for fuel, crews, locomotives and non-crew labor respectively.

Table 2
Cost Function

VARIABLE         COEFFICIENT   ESTIMATE   STD. ERROR
Constant         α0             .03997     .01123
PCAR             α10            .31748     .00494
PFUEL            α20            .04767     .00086
PCREW            α30            .15185     .00130
PLOCO            α40            .08354     .00084
PMNGT            α50            .39945     .00300
QK               δ10           -.92323     .13530
Y                β10            .08939     .07851
S                γ10           -.04843     .05306
(PCAR)²          α11           -.02637     .02267
PCAR·PFUEL       α12            .00526     .00628
PCAR·PCREW       α13            .00216     .00766
PCAR·PLOCO       α14            .04086     .00765
PCAR·PMNGT       α15           -.02167     .01535
(PFUEL)²         α22            .05928     .01022
PFUEL·PCREW      α23           -.01716     .00835
PFUEL·PLOCO      α24           -.02293     .00706
PFUEL·PMNGT      α25           -.02422     .00133
(PCREW)²         α33            .10596     .01404
PCREW·PLOCO      α34           -.01861     .00729
PCREW·PMNGT      α35           -.07234     .01491
(PLOCO)²         α44            .03863     .00936
PLOCO·PMNGT      α45           -.03794     .01019
(PMNGT)²         α55            .15617     .02461

Table 2 (continued)

VARIABLE         COEFFICIENT   ESTIMATE   STD. ERROR
(Y)²
(S)²
Y·S
PCAR·Y
PFUEL·Y
PCREW·Y
PLOCO·Y
PMNGT·Y
PCAR·S
PFUEL·S
PCREW·S
PLOCO·S
PMNGT·S

Z then x i Q(z), i.e. the isoquants only reflect efficient production.

We assume the following properties for f(x):

1) f(0) = 0;
2) f(x) is continuous with continuous first and second derivatives (unless explicitly stated otherwise);
3) if $x^1 \ge x^2$ then $f(x^1) \ge f(x^2)$;
4) f(x) is quasiconcave [57], i.e. $f(\lambda x^1 + (1-\lambda)x^2) \ge \min[f(x^1), f(x^2)]$ for all $x^1, x^2 \ge 0$.

The first property states that positive production requires at least some positive inputs. The second condition imposes regularity on the function while the third condition means that more inputs will not result in less being produced. The fourth condition means that level sets of f (i.e. combinations of x that provide at least a specified output) are convex sets. This in turn means that the isoquants are convex functions, i.e.
they resemble Figure B-1a rather than Figure B-1b:

Figure B-1: ISOQUANTS

Many firms, and in particular transport firms, produce a vector of outputs rather than a single output. Transport firms, for example, move a variety of commodities to and from various geographical points. Furthermore, associated with the various commodity flows are characteristics of service such as speed of delivery, schedule unreliability, loss and damage, etc. Let the firm's output vector be the non-negative m-vector of flows and characteristics $z = (z_1,\ldots,z_m)'$. A transformation function $T(z,x)$ is a mathematical model of the relationship between the input vector x and the output vector z. The vector z can be exactly produced from x if

$$T(z,x) = 0 \qquad \text{(B-2)}$$

which is the analogous statement to (B-1) above. Typical conditions on T(z,x) are as follows:
1) $T(z,x) \le 0$ for all x that can produce z;
2) $T(z,0) \le 0 \Rightarrow z = 0$;
3) T(z,x) is continuous with continuous first and second derivatives;
4) $\nabla_z T(z,x) < 0$, $\nabla_x T(z,x) > 0$;
5) $V(z) = \{x \mid T(z,x) \le 0\}$ is a closed, strictly convex set.

The first condition simply defines what we mean by producing z from x. This condition allows for both inefficient production ($T(z,x) < 0$) and efficient production ($T(z,x) = 0$). Condition (2) is analogous to condition (1) for production functions. Condition (3) is the regularity condition analogous to condition (2) for production functions. Condition (4), which requires T to be decreasing in z and increasing in x, is a stronger condition than condition (3) for production functions. Finally, condition (5) is analogous to condition (4) for production functions: it will guarantee that a unique joint cost function exists (section B.2).

In what follows we examine various possible characteristics of production and transformation functions. The characteristics concern the way in which inputs combine to produce output(s). The main considerations in characterizing technology are as follows:
1.
Does the technology exhibit economies (or diseconomies) of scale of production for various levels of output?
2. Under what conditions can a vector of outputs be aggregated into a scalar (e.g. ton-miles)?
3. Under what conditions can parts of the input vector be aggregated? For example, must we represent each and every type of labor used, car type used, etc., or can we use models that have aggregate labor and capital inputs?

This report addresses some of the above questions in detail. Others will be addressed more fully in the second year of the research.

B.1.2. Characterization of Production and Transformation Functions

Before proceeding to examine some of the characterizations of production and transformation functions we provide the following definition, which will be of use later in this section:

Definition. Let $H(u,v) = 0$ be continuous and differentiable with $\nabla H(u,v) \ne 0$. The marginal rate of technical substitution (MRTS) of $v_i$ for $v_j$ is

$$MRTS_{ij}(u,v) = \frac{\partial H/\partial v_i}{\partial H/\partial v_j}$$

while the marginal rate of transformation (MRTr) of $u_i$ for $u_j$ is

$$MRTr_{ij}(u,v) = \frac{\partial H/\partial u_i}{\partial H/\partial u_j}.$$

Thus, in the case of the production function we will refer to $MRTS_{ij}(x) = f_i/f_j$, where $f_k = \partial f(x)/\partial x_k$, while in the case of the transformation function we may also be interested in $MRTr_{ij}(z,x) = (\partial T(z,x)/\partial z_i)/(\partial T(z,x)/\partial z_j)$. It is important to note that we have assumed that $\nabla H \ne 0$. While this is a stronger condition than condition (3) on production functions, most production functions satisfy this requirement. In general $\nabla f \ne 0$ will hold for most of the analyses; situations wherein this is not true will be noted.

B.1.2.1 Homogeneity and Almost Homogeneity

A production function f(x) is homogeneous of degree k (H.D.k) if

$$\lambda^k f(x) = f(\lambda x) \quad \forall \lambda > 0$$

where $\lambda$ is a scalar. The above condition states that multiplying all the inputs by a positive scalar multiplies the output by a power of the scalar.
We observe the following classical categorization:
1) k > 1 => increasing returns-to-scale;
2) k = 1 => constant returns-to-scale;
3) k < 1 => decreasing returns-to-scale.

H.D.k functions satisfy Euler's Theorem (see, e.g. [ ]):

$$k f(x) = x' \cdot \nabla f(x).$$

Further it can be shown that the partial derivatives are H.D.(k-1). Thus we see that:

$$MRTS_{ij}(\lambda x) = \frac{f_i(\lambda x)}{f_j(\lambda x)} = \frac{\lambda^{k-1} f_i(x)}{\lambda^{k-1} f_j(x)} = \frac{f_i(x)}{f_j(x)} = MRTS_{ij}(x).$$

Thus the marginal rate of technical substitution is unaffected by changes in scale (i.e. it is H.D.0 in x). This assumes $\nabla f \ne 0$. Since we typically will take $\nabla f > 0$, then $MRTS_{ij}(x) > 0$. Production functions that have regions wherein $MRTS_{ij}(x) < 0$ are said to have non-economic regions, since it will not generally be profitable to operate in such a region.

Homogeneity of degree k for a production function provides the intuition for the following definition of almost homogeneity for the transformation function, namely a transformation function is almost homogeneous of degrees $k_1$, $k_2$ and $k_3$ (AHD($k_1,k_2,k_3$)) if and only if:

$$T(\lambda^{k_1} z, \lambda^{k_2} x) = \lambda^{k_3} T(z,x) \quad \forall \lambda > 0.$$

Lau has shown that such functions satisfy a modified Euler's Theorem [53], in that T(z,x) is AHD($k_1,k_2,k_3$) if and only if:

$$k_1 z' \cdot \nabla_z T(z,x) + k_2 x' \cdot \nabla_x T(z,x) = k_3 T(z,x).$$

In general, since T(z,x) = 0 for efficient production, we will also refer to T(z,x) as AHD(k,1), where $k = k_1/k_2$, if it satisfies either of the above statements. It is also possible to show that $MRTS_{ij}(z,x)$ and $MRTr_{ij}(z,x)$ are independent of scale if T(z,x) is AHD($k_1,k_2,k_3$).

B.1.2.2 Homotheticity

Homotheticity is a very important generalization of homogeneity. Many of the more popular production functions are homothetic, and homotheticity of the production function results in a very special structure for the cost function.
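As an illustration of the homogeneity properties of section B.1.2.1, the following minimal sketch checks the defining condition, Euler's Theorem and the scale invariance of the MRTS numerically. The function f, its exponents (which make it H.D.0.8), the evaluation point and the finite-difference helper are all assumptions chosen for illustration; none of them come from the report.

```python
def f(x):
    """Hypothetical production function, homogeneous of degree 0.3 + 0.5 = 0.8."""
    return x[0] ** 0.3 * x[1] ** 0.5


def grad(fn, x, h=1e-6):
    """Central-difference approximation to the gradient of fn at x."""
    g = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        g.append((fn(up) - fn(dn)) / (2 * h))
    return g


def mrts(fn, x, i, j):
    """MRTS_ij(x) = f_i(x) / f_j(x)."""
    g = grad(fn, x)
    return g[i] / g[j]


k = 0.8            # degree of homogeneity of the assumed f
lam = 1.7          # arbitrary positive scalar
x = [2.0, 3.0]     # arbitrary input vector
scaled = [lam * v for v in x]

# Defining condition: f(lam * x) - lam**k * f(x) should vanish.
homogeneity_gap = f(scaled) - lam ** k * f(x)
# Euler's Theorem: x' grad f(x) - k f(x) should vanish.
euler_gap = sum(xi * gi for xi, gi in zip(x, grad(f, x))) - k * f(x)
# Scale invariance of the MRTS: MRTS(lam * x) - MRTS(x) should vanish.
mrts_gap = mrts(f, scaled, 0, 1) - mrts(f, x, 0, 1)

print(homogeneity_gap, euler_gap, mrts_gap)
```

All three gaps should be zero up to finite-difference and rounding error; replacing f with a non-homogeneous function would make at least the first two gaps non-zero.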
Homotheticity was initially developed by Shephard [73]. A production function f(x) is homothetic if there exist functions d(u) and h(x) with u a scalar, d(u) monotonically non-decreasing and h(x) H.D.1 such that:

$$f(x) = d(h(x)) \quad \forall x.$$

In other words, if f(x) can be written as a rescaling of a H.D.1 function, it is homothetic. All homogeneous functions are homothetic. The reverse is not true; let $d(u) = e^u$ and $h(x) = x$ (x of size one). The resulting function is homothetic but not homogeneous of any degree.

If $\nabla f \ne 0$ then f is homothetic if and only if $MRTS_{ij}(x)$ is H.D.0 in x $\forall i,j$ [52]. Thus independence of scale of the MRTS is a property of homothetic functions. Geometrically, this means that isoquants are radial expansions of the unit isoquant, i.e. Q(Z) can be geometrically constructed by passing rays from the origin through Q(1). This is shown in Figure B-2.

Figure B-2: HOMOTHETIC ISOQUANTS

This again illustrates the rescaling concept behind homotheticity.

This intuitive notion underlies the definitions put forward by Shephard [73, Ch. 10] and Jacobsen [47] and an alternative, more general notion in McFadden [33, Ch. I.1] (which is attributed to Hanoch).

Shephard's definition is a straightforward extension of the single output definition to transformation functions that are input/output separable, i.e. we assume $T(z,x) = g(z) - f(x)$. Further let f(x) be homothetic and let g(z) have properties (1), (2) and (3) of a production function with the added properties of: (4) quasiconvexity (i.e. $-g(z)$ is quasiconcave); (5) if $z' \ge z$ and $z' \ne z$ then $g(z') > g(z)$; and (6) as z becomes arbitrarily large, so does g(z) (i.e. g(z) unbounded for unbounded z). Notice that the function $g(\cdot)$ acts as an aggregation function on z; such a function may not exist. The basic notion of the definition is to place the homotheticity properties in f(x) and use g(z) as a surrogate output measure.
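The single-output statements above (a homothetic function need not be homogeneous, yet its MRTS is independent of scale) can be checked with a small numerical sketch. The two-input H.D.1 core h, its exponents and the evaluation points are hypothetical choices made only for this illustration; the rescaling $d(u) = e^u$ is the one used in the text.

```python
import math


def h(x):
    """Hypothetical H.D.1 core function (exponents sum to one)."""
    return x[0] ** 0.4 * x[1] ** 0.6


def f(x):
    """Homothetic function f(x) = d(h(x)) with d(u) = exp(u)."""
    return math.exp(h(x))


def mrts(fn, x, i=0, j=1, step=1e-6):
    """MRTS_ij by central differences."""
    def partial(k):
        up, dn = list(x), list(x)
        up[k] += step
        dn[k] -= step
        return (fn(up) - fn(dn)) / (2 * step)
    return partial(i) / partial(j)


def implied_degree(fn, x, lam):
    """The k that solves fn(lam * x) = lam**k * fn(x) at this particular x.

    For a homogeneous function this k is the same at every x.
    """
    scaled = [lam * v for v in x]
    return math.log(fn(scaled) / fn(x)) / math.log(lam)


lam = 2.0
xa, xb = [1.0, 2.0], [3.0, 4.0]

# Homotheticity: the MRTS does not change when xa is scaled by lam.
mrts_gap = mrts(f, [lam * v for v in xa]) - mrts(f, xa)
# Not homogeneous: the implied degree differs between xa and xb.
k_a = implied_degree(f, xa, lam)
k_b = implied_degree(f, xb, lam)

print(mrts_gap, k_a, k_b)
```

The MRTS gap should be numerically zero, while the two implied degrees differ, which is what "homothetic but not homogeneous of any degree" means in practice.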
McFadden's definition, on the other hand, does not require input/output separability. A transformation will be input-homothetic if there exists a function $a(\lambda,z)$, $\lambda$ a positive scalar, with $a(\lambda,z)$ increasing in $\lambda$ and $a(0,z) = 0$ such that:

$$V(z) = a(\|z\|, z/\|z\|) \, V(z/\|z\|)$$

where $V(z) = \{x \mid T(z,x) \le 0\}$ (the input requirements set of footnotes 3 and 4) and $\|z\|$ is a norm of z, e.g.:

$$\|z\| = \Big( \sum_{i=1}^{m} z_i^2 \Big)^{1/2}.$$

This definition is most easily understood by examining the single output case. Here $z = Z$, $\|z\| = Z$ and therefore $z/\|z\| = 1$. Then the definition reduces to:

$$V(Z) = a(Z,1) \, V(1).$$

Thus a(Z,1) is the scaling effect on the unit isoquant represented by V(1). In the multiple output case $z/\|z\|$ is a normalized output and $a(\|z\|, z/\|z\|)$ acts as a scaling multiplier.

The two definitions will have somewhat different effects on the structure of cost functions to be discussed in section B.2. The two conditions are the same when T(z,x) is separable in inputs and outputs, which is one type of separability to be discussed below.

B.1.2.3 Separability of the Production Function

The literature on separability (i.e. the ability to construct aggregate variables from disaggregate variables) is extensive; we will not attempt to review it in depth here. Instead we will provide a very basic overview of the area of separability, which will be a primary focus for the second year's work. Issues of functional structure were addressed by Leontief [56] and Sono [74]. An excellent overall reference is Blackorby, Primont and Russell [6]. Two questions addressed by the literature are as follows.

1) Under what conditions can one rewrite the function $Z = f(x_1,\ldots,x_n)$ as

$$Z = f(g_1(x_1,\ldots,x_i),\ g_2(x_{i+1},\ldots,x_j),\ \ldots,\ g_h(x_k,\ldots,x_n))$$

or perhaps as

$$Z = f\Big( \sum_{j} g_j(x_{j_1},\ldots,x_{j_i}) \Big),$$

in other words, form subaggregates (for example a labor variable to represent all different types of labor) of non-overlapping subsets of variables?
2) Under what conditions can one separate inputs from outputs in a transformation function, i.e. when can we write

$$T(z,x) = g(z) - f(x)\ ?$$

Notice that both g and f act as aggregation functions, with an aggregate input being just balanced by an aggregate output. Many technologies are apparently not separable this way (see, for example, [33, Ch. V.1], [10] and [13]).

Two types of separability have dominated the literature: weak and strong (see [6] for others). To define these, let P be a partition of the index set $N = \{1,\ldots,n\}$ of the arguments of a function h(x):

$$P = \{N_1,\ldots,N_p\}$$

with
1) $N_i \cap N_j = \emptyset$, $i \ne j$ (mutually exclusive);
2) $N_1 \cup N_2 \cup \ldots \cup N_p = N$ (exhaustive).

This partitions the x-vector into p parts, i.e. $x = (x^1,\ldots,x^p)$, in correspondence with the partitioning of the indices. Separability will be concerned with the effect of changes of a variable on the MRTS of other variables, i.e. we will examine when

$$\frac{\partial (h_i/h_j)}{\partial x_k} = 0 \qquad (\nabla h \ne 0)$$

where, as usual, the subscript on h refers to partial derivative. Now we can define strong (S) and weak (W) separability:

h is S if $\dfrac{\partial (h_i/h_j)}{\partial x_k} = 0 \quad \forall\ i \in N_u,\ j \in N_v,\ k \notin N_u \cup N_v$

and

h is W if $\dfrac{\partial (h_i/h_j)}{\partial x_k} = 0 \quad \forall\ i,j \in N_u,\ k \notin N_u$.

In words, h is strongly separable (S) in the partition P if, when we pick variables from two parts of the partition (part $N_u$ and part $N_v$) and compute their MRTS, it is independent of changes in variables not in either $N_u$ or $N_v$. If this holds for all variables in all the parts of the partition then h is S. Weak separability doesn't require us to have i and j tested in different parts. In other words, weak separability tests each element of the partition against the elements of other subvectors in the partition. Thus, for example, the following function

$$h(x) = x_1^{a_1} \cdot x_2^{a_2} \cdots x_n^{a_n}$$

is itself strongly separable, and if we form two functions

$$h^1(x) = x_1^{a_1} \cdot x_2^{a_2} \cdots x_m^{a_m}$$
$$h^2(x) = x_{m+1}^{a_{m+1}} \cdot x_{m+2}^{a_{m+2}} \cdots x_n^{a_n}$$

then h is strongly separable in the partition $\{N_1, N_2\}$ with $N_1 = \{1,\ldots,m\}$ and $N_2 = \{m+1,\ldots,n\}$.

Goldman and Uzawa [36] have related the S and W conditions to certain general functional forms. Berndt and Christensen [4] have (for homothetic production functions) related S and W conditions to constraints on elasticities of substitution. The elasticity of substitution attempts to measure the sensitivity of the optimal input factor mix to changes in the MRTS. For example, if production is a function of capital (K) and labor (L) alone, i.e. $Z = f(K,L)$, then $\sigma$, the elasticity of substitution, is defined as [44]:

$$\sigma = - \frac{f_K/f_L}{K/L} \cdot \frac{d(K/L)}{d(MRTS_{KL})} \Bigg|_{Z\ \text{fixed}}$$

When the production function has more than two factors a number of possible measures can be constructed (see McFadden, Ch. IV.1 in [33]). If f(x) is homothetic then the Allen Elasticity of Substitution [1] (called AES) can be written as

$$\sigma_{ij} = \frac{\sum_{k=1}^{n} x_k f_k}{x_i x_j} \cdot \frac{|(\nabla^2_B f)_{ij}|}{|\nabla^2_B f|}$$

where $|\nabla^2_B f|$ is the determinant of the bordered Hessian matrix (see footnote) and $|(\nabla^2_B f)_{ij}|$ is the determinant of the i,j cofactor of $\nabla^2_B f$. It will turn out that the $\sigma_{ij}$ are computable from cost function information. Berndt and Christensen [4] relate restrictions on the $\sigma_{ij}$ to issues of aggregation. This work is extended to non-homothetic production functions by Russell [70].

B.1.3. Examples of Production and Transformation Functions

In this section we will provide some examples of production functions, culminating with the most general forms currently in use.

B.1.3.1 Leontief Production

The Leontief (or fixed proportions) production function is the following:

$$Z = \min\Big( \frac{x_1}{a_1}, \ldots, \frac{x_n}{a_n} \Big)$$

where the $a_i > 0$. Thus inputs are used in fixed proportions (dictated by the $a_i$). Thus the isoquants (curves in the input space of constant output level) are corner or L-shaped as shown in Figure B-3.
There is no substitution between factors: more of any factor will be wasted unless all factors are increased proportionately.

Figure B-3: LEONTIEF PRODUCTION (isoquants with the constant proportions line)

This function is H.D.1 (i.e. constant returns-to-scale). Furthermore, if one views the above as a process and there are other processes available (i.e. other processes that entail different proportions) then there is a possibility of substitution between processes, as shown in Figure B-4.

Figure B-4: SUBSTITUTION BETWEEN LEONTIEF PROCESSES

Thus this production function is not as limited as it seems at first glance. It is the ability to substitute between processes that has made this function so useful: linear programming models are based on Leontief production processes (each column in an L.P. being a fixed proportions production process). It should be noted that this function is not differentiable. Thus, for this case we rely upon our original condition (3) for production functions.

B.1.3.2 Cobb-Douglas Production

This function has been extremely popular for a number of years. It is written as follows:

$$Z = A \prod_{i=1}^{n} x_i^{\alpha_i}, \qquad \sum_i \alpha_i = 1,\ A > 0. \qquad \text{(B-3)}$$

The Cobb-Douglas production function is H.D.1. A more general version is shown below (which is H.D.v):

$$Z = \Big( A \prod_{i=1}^{n} x_i^{\alpha_i} \Big)^{v}, \qquad \sum_i \alpha_i = 1,\ A > 0,\ v > 0 \qquad \text{(B-4)}$$

which is obviously homothetic. In this case any value of returns-to-scale is possible if (B-4) is estimated (see [62]). For what follows, we continue our analysis in the standard Cobb-Douglas, (B-3).

First, all factors are essential, i.e. if any one is zero then the function ascribes zero output to the process under study. This is not always a desirable result. Substitution between factors is possible (though complete substitution is not, since all factors are essential). The elasticity of substitution, $\sigma_{ij}$, is constant and equal to one for all i,j pairs.
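The Cobb-Douglas properties just listed can be verified with a short numerical sketch. The parameter values (A, the exponents, v) and the test points are assumptions made for illustration. The elasticity of substitution is computed here under the convention $\sigma = d\ln(x_2/x_1) / d\ln(f_1/f_2)$, which is also an assumption of this sketch rather than the report's notation.

```python
import math

# Hypothetical parameters for (B-3) and (B-4); the exponents sum to one.
A = 2.0
ALPHA = [0.4, 0.6]


def cobb_douglas(x, v=1.0):
    """(B-4); with v = 1 this reduces to the standard form (B-3)."""
    z = A
    for xi, ai in zip(x, ALPHA):
        z *= xi ** ai
    return z ** v


def mrts_12(x):
    """f_1 / f_2 for (B-3), which equals (alpha_1 / alpha_2) * (x_2 / x_1)."""
    return (ALPHA[0] / ALPHA[1]) * (x[1] / x[0])


x = [4.0, 9.0]
lam = 3.0
scaled = [lam * xi for xi in x]

# (B-3) is H.D.1: scaling inputs by lam scales output by lam.
crs_ratio = cobb_douglas(scaled) / cobb_douglas(x)
# (B-4) is H.D.v: scaling inputs by lam scales output by lam**v.
hdv_ratio = cobb_douglas(scaled, v=1.5) / cobb_douglas(x, v=1.5)
# Essentiality: a zero level of any input gives zero output.
output_without_input_1 = cobb_douglas([0.0, 9.0])

# Elasticity of substitution from two nearby input ratios.
xa, xb = [4.0, 9.0], [4.0, 9.9]
sigma = (math.log(xb[1] / xb[0]) - math.log(xa[1] / xa[0])) / (
    math.log(mrts_12(xb)) - math.log(mrts_12(xa))
)

print(crs_ratio, hdv_ratio, output_without_input_1, sigma)
```

Because the MRTS of (B-3) is proportional to the input ratio, the computed elasticity equals one regardless of which two ratios are compared.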
Christensen and Greene [15] test for unitary elasticities of substitution by constraining certain coefficient estimates in their cost function estimation. We will return to this later.

B.1.3.3 Arrow-Chenery-Minhas-Solow CES Function and the CET/CES Transformation Function

Motivated by certain empirical evidence of the relationship between the log of value added per labor unit and the log of the wage rate, Arrow, Chenery, Minhas and Solow developed a general production function that was H.D.1, had a constant elasticity of substitution (not necessarily equal to one) and satisfied a simple model that explained (to some degree) the empirical results. The production function is called the constant elasticity of substitution production function (CES) and can be written as:

$$Z = A \Big[ \sum_{i=1}^{n} \alpha_i x_i^{\rho} \Big]^{1/\rho}, \qquad A > 0,\ 1 > \rho$$

in which case $\sigma_{ij} = \sigma = \dfrac{1}{1-\rho}$ for all i,j. Special cases are the Leontief (when $\sigma \to 0$), the Cobb-Douglas ($\sigma = 1$), and the perfect substitute case ($\sigma \to +\infty$), where output is simply a weighted sum of inputs.

Powell and Gruen [68] were apparently the first to employ the CES function as a multiple output function to form the CET (constant elasticity of transformation) function. Joining the CET and CES functions, and assuming $T(z,x) = g(z) - f(x)$, we have:

$$T(z,x) = \Big( \sum_{i=1}^{m} \delta_i z_i^{b} \Big)^{1/b} - A \Big( \sum_{i=1}^{n} \alpha_i x_i^{\rho} \Big)^{k/\rho}$$

where k is the degree of homogeneity of the CES function (see note 11). Hasenkamp [42] has estimated such a function using cross-section data on U.S. railroads for 1929 and 1936.

B.1.3.4 Flexible Functional Forms: The Translog and the Generalized Leontief

Within the last ten years a number of reasonably general production functions have been developed. These are called flexible functional forms (see, e.g. [33, Ch.
II.1], [6]). A general representation of such forms is the following:

$$\psi(f(x)) = \alpha_{00} + \sum_i \alpha_{i0}\,\phi_i(x_i) + \tfrac{1}{2} \sum_i \sum_j \alpha_{ij}\,\phi_i(x_i)\,\phi_j(x_j) \qquad \text{(B-5)}$$

where f(x) is the production function, the $\alpha$'s are coefficients and $\psi(\cdot)$ and $\phi_i(\cdot)$ are suitable functions. To be more precise we have the following:

1) Transcendental Logarithmic Production Function [14]

$\psi(u) = \ln(u)$, $\phi_i(u) = \ln u$

and thus we have:

$$\ln f(x) = \alpha_{00} + \sum_i \alpha_{i0} \ln x_i + \tfrac{1}{2} \sum_i \sum_j \alpha_{ij} \ln x_i \ln x_j$$

2) Generalized Leontief [22]

$\psi(u) = u$, $\phi_i(u) = \sqrt{u}$

yielding:

$$f(x) = \alpha_{00} + \sum_i \alpha_{i0} \sqrt{x_i} + \tfrac{1}{2} \sum_i \sum_j \alpha_{ij} \sqrt{x_i}\,\sqrt{x_j}.$$

These functions can either be viewed as exact representations of technology or as approximations (second order) to an arbitrary technology. Lau [54] indicates that two notions of approximation have been used. McFadden [33, Ch. II.2] has viewed the flexible form as a second order approximation if first and second derivatives of the approximation are the same as the true function at the point of approximation. Christensen et al. [14] have viewed the flexible form as an approximation in the sense of a Taylor's series expansion. In both views, the notion of an approximation is a local notion. While certain functional properties are globally inheritable by an approximation, a significant caution must be observed. Simply put, the factors that contribute to a good approximation and the factors that contribute to a good statistical estimation can be diametrically opposite to one another. Approximations which are locally good depend on tightly packed data. On the other hand, good experimental design procedure usually calls for as great a dispersal of data points as is possible. This point is recognized in [33, Ch. II.1].

One of the major justifications for using the flexible forms is that many of the more standard production functions are sub-cases when restrictions are placed on the form. For example, in the translog (transcendental logarithmic) form, setting the second order coefficients to zero (i.e.
$\alpha_{ij} = 0$, $i,j \ge 1$) yields the Cobb-Douglas case. The translog is also an approximation to the CES production function and others (see [14]). A more detailed review of such forms, their advantages and their failings is given in [33, Ch. II.1].

As an example of estimating and testing a model, consider the translog production function above. It is easy to show that the restrictions for the function to be H.D.1 are (assuming $\alpha_{ij} = \alpha_{ji}$; see note 12):

1) $\sum_i \alpha_{i0} = 1$;
2) $\sum_j \alpha_{ij} = 0$, $i = 1,\ldots,n$.

To further restrict the form to be Cobb-Douglas (i.e. unit elasticities of substitution), one sets $\alpha_{ij} = 0$ for all i,j. Thus a procedure would be as follows.

1) Estimate the unrestricted function (with $\alpha_{ij} = \alpha_{ji}$).
2) Estimate the model with the H.D.1 restriction and test the new model against the unrestricted model (using an F-test or a likelihood ratio statistic).
3) If the restricted model cannot be rejected, proceed to the next set of restrictions; otherwise stop.

B.2. Cost

B.2.1. Definition of the Cost Minimization Problem and Related Cost Functions: Long and Short Run Cost Functions

Let $p = (p_1,\ldots,p_n)'$ be an n-vector of given factor prices, i.e. the firm cannot affect p through individual firm actions. Moreover, we assume p > 0. Finally, we assume that the firm attempts to use the factors of production as efficiently as possible, i.e. for any specified output level, the firm chooses the input vector that produces the required output at minimum cost. Since cost is $p'x = \sum_i p_i x_i$, the firm's cost minimization problem (CMP) is as follows:

$$\text{(CMP)} \qquad \min_x\ p' \cdot x \quad \text{s.t.}\quad T(z,x) \le 0$$

where p and z are given. Here we have written the problem for a cost minimization over a transformation function. We shall continue with this form, with the understanding that the production function case is a subcase of (CMP). The conditions on T(z,x) guarantee that the solution to (CMP) exists and is unique: the objective function is linear and the set of x in the constraints is convex.
If we vary z, holding p fixed, a function is traced out relating minimal cost $C^* = \sum_i p_i x_i^*(z,p)$ (where $x_i^*(z,p)$ is the optimal solution to (CMP) for given z and p) to z and p. This is called the cost function (or long run cost function, to indicate that all factors have been allowed to adjust to optimality) and is written:

$$C(z,p) \qquad \text{(B-8)}$$

The conditions on T(z,x) imply that the following characteristics of C(z,p) can be proved (see [73], [33, Ch. I.1], or [81]):

1) C(z,p) is H.D.1 in p, i.e. $C(z,\lambda p) = \lambda C(z,p)$ $\forall \lambda > 0$;
2) C(z,p) is monotonic nondecreasing in p: $p' \ge p \Rightarrow C(z,p') \ge C(z,p)$;
3) C(z,p) is concave in p, i.e. $C(z, \delta p + (1-\delta)p') \ge \delta C(z,p) + (1-\delta) C(z,p')$, $0 \le \delta \le 1$;
4) C(z,p) is continuous in p (p > 0).

The first condition reflects the obvious result that if all prices increase by the same proportion, so will the costs, since the uniform price change will not affect the choice of the factor levels (since relative prices didn't change). The second condition is also straightforward: since C(z,p) represents minimal costs, one should not be able to reduce costs by increasing factor prices. The intuition for the third property takes more effort. Let $(p, x^*)$ be the price and optimal quantity of inputs for some output level. This results in a cost $C^*$. Now if, say, just $p_1$ is increased slightly (to $p_1'$), then a slight reduction in $x_1$ will have to be made, thereby not changing costs in proportion to the slight $p_1$ change. Figure B-5 illustrates the effect (see [33], [81]). Thus C(z,p) is concave. Continuity (property 4) follows from the concavity of C(z,p) (concave functions are continuous, except possibly at the boundary).

Figure B-5: COST FUNCTION CONCAVITY

An observation is in order. The analysis above and that which will follow rests on two important assumptions: (1) the firm faces fixed, known factor prices; (2) the firm minimizes costs. The assumptions cut two ways.
On the one hand, we may develop cost functions for monopolists as well as perfect competitors: no issues about the market(s) for the output were raised. Moreover, as long as the entity being studied is trying to efficiently produce output, we need not concern ourselves with problems of whether the firm is profit-maximizing or regulated to provide "socially optimal" output. However, it is important that the firm face reasonably competitive factor markets, something that may not be true for very large firms.

We can now define some standard related cost functions:

1) Marginal Cost:
$$MC_i(z,p) = \frac{\partial C(z,p)}{\partial z_i}, \qquad i = 1,\ldots,m;$$

2) For single output models, Average Cost:
$$AC(Z,p) = \frac{C(Z,p)}{Z}, \qquad Z > 0.$$

Using Shephard's lemma [73], [81] the optimal factor demand equations $x_i^*(z,p)$ are simply:

$$x_i^*(z,p) = \frac{\partial C(z,p)}{\partial p_i}, \qquad i = 1,\ldots,n.$$

Some manipulation also shows that the elasticity of cost with respect to a factor price is:

$$\frac{\partial C(z,p)}{\partial p_i} \cdot \frac{p_i}{C(z,p)} = x_i^*(z,p)\,\frac{p_i}{C(z,p)} \qquad \text{(B-9)}$$

$$= \frac{p_i\, x_i^*(z,p)}{C(z,p)} \qquad \text{(B-10)}$$

which is simply the factor share, i.e. the percentage of cost spent on factor i. Notice that the left-hand side of (B-9) can also be written:

$$\frac{\partial \log C(z,p)}{\partial \log p_i}$$

which will be especially useful in translog cost function studies (where log C(z,p) is expressed in terms of log $z_i$ and log $p_i$).

If we restrict z to be a single output Z then a graph of a typical C(Z,p) can be drawn as shown in Figure B-6. The cost function illustrated reflects economies of scale (increasing returns-to-scale) for outputs up to $\bar{Z}$ and diseconomies of scale for outputs greater than $\bar{Z}$. The result is a classical U-shaped average cost function with a minimum at $Z = \bar{Z}$. This will be the optimal size of the firm.

The above cost function represents the cost of producing output z given factor prices p, assuming all factors are free to adjust their levels so as to minimize cost. This assumption is not always valid.
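Shephard's lemma and the factor share result in (B-9) and (B-10) can be illustrated with a minimal sketch. It assumes a two-input Cobb-Douglas technology $Z = A x_1^{\alpha_1} x_2^{\alpha_2}$ with $\alpha_1 + \alpha_2 = 1$, whose cost function has the known closed form $C(Z,p) = (Z/A) \prod_i (p_i/\alpha_i)^{\alpha_i}$. The parameter values, prices and output level are hypothetical.

```python
# Hypothetical Cobb-Douglas parameters; the exponents sum to one.
A = 2.0
ALPHA = [0.4, 0.6]


def cost(Z, p):
    """Closed-form cost function of the assumed Cobb-Douglas technology."""
    c = Z / A
    for pi, ai in zip(p, ALPHA):
        c *= (pi / ai) ** ai
    return c


def factor_demand(Z, p, h=1e-6):
    """Shephard's lemma: x_i* = dC/dp_i, here by central differences."""
    x = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        x.append((cost(Z, up) - cost(Z, dn)) / (2 * h))
    return x


Z, p = 10.0, [3.0, 5.0]
x_star = factor_demand(Z, p)

# Factor shares p_i x_i* / C, as in (B-10).
shares = [pi * xi / cost(Z, p) for pi, xi in zip(p, x_star)]
# The demanded inputs should reproduce the required output level.
output = A * x_star[0] ** ALPHA[0] * x_star[1] ** ALPHA[1]

print(x_star, shares, output)
```

For this technology the shares should come out equal to the exponents (0.4 and 0.6) and sum to one, and the recovered output should match Z, which is a quick consistency check on both the lemma and the closed-form cost function.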
Regulated common carriers often cannot adjust their capital stock through abandonment, for example, of service.

Figure B-6: COST FUNCTION

There can be any number of reasons why, at least for a short period of time, a firm cannot optimally adjust certain factors of production as it increases or decreases its output. This is an especially important issue when we try to estimate the firm's C(z,p) function, since this means that some of the observations will lie on C(z,p) but some of them will lie above C(z,p). Notice that no observations could lie below C(z,p), by the definition of the function. Therefore if we attempt to pass a curve through a scatter of points we are doomed to overestimate the cost function. This was recognized over a decade ago by Eads [24], Eads, Nerlove and Raduchel [25] and Keeler [48] and has been employed by a number of investigators since then (see [49], [41], [60], [32], [12]).

To formulate the short-run cost function, we partition x into two subvectors, $x^v$ and $x^f$ (for variable and fixed factors):

$$x = \begin{pmatrix} x^v \\ x^f \end{pmatrix}$$

where $x^v$ is of dimension $n_1 \ge 1$ and $x^f$ is of dimension $n - n_1$. Now (CMP) becomes the short-run CMP (SRCMP):

$$\text{(SRCMP)} \qquad \min_{x^v}\ p' \cdot \begin{pmatrix} x^v \\ x^f \end{pmatrix} \quad \text{s.t.}\quad T(z,x^v,x^f) \le 0.$$

Since the partition of x induces a similar partition on p, and since $(p^f)'x^f$ is fixed, SRCMP becomes the short-run variable cost minimization problem

$$\text{(SRVCMP)} \qquad \min_{x^v}\ (p^v)' x^v \quad \text{s.t.}\quad T(z,x^v,x^f) \le 0.$$

Again, if we vary z the result is a cost function $C(z,p^v;x^f)$. Note that only $p^v$ shows up in the cost function. The notation shows that the cost function is conditioned on the values of the fixed variables $x^f$. Total short-run costs are equal to short-run variable cost plus short-run fixed costs:

$$TC(z,p^v;x^f) = C(z,p^v;x^f) + (p^f)' \cdot x^f.$$

Observe that short-run marginal costs $MC_i(z,p^v;x^f)$ can be calculated from either $TC(z,p^v;x^f)$ or $C(z,p^v;x^f)$. Thus

$$MC_i(z,p^v;x^f) = \frac{\partial C(z,p^v;x^f)}{\partial z_i}, \qquad i = 1,\ldots,m.$$

If there is a single output, average cost is well defined and we have the definitions of short-run average cost, short-run average variable cost and short-run average fixed cost:

1) $AC(Z,p^v;x^f) = \dfrac{TC(Z,p^v;x^f)}{Z}$, $Z > 0$;
2) $AVC(Z,p^v;x^f) = \dfrac{C(Z,p^v;x^f)}{Z}$, $Z > 0$;
3) $AFC(Z,p^v;x^f) = \dfrac{(p^f)' x^f}{Z}$, $Z > 0$.

Again, Shephard's lemma yields the short-run factor demand equations and the short-run factor share equations for the variable factors (now with multiple outputs):

$$x_i^{v*}(z,p^v;x^f) = \frac{\partial C(z,p^v;x^f)}{\partial p_i^v}, \qquad i = 1,\ldots,n_1$$

$$S_i^v(z,p^v;x^f) = \frac{p_i^v\, x_i^{v*}(z,p^v;x^f)}{C(z,p^v;x^f)} = \frac{\partial \log C(z,p^v;x^f)}{\partial \log p_i^v}, \qquad i = 1,\ldots,n_1$$

where $S_i^v(z,p^v;x^f)$ denotes the share of costs attributable to variable factor i. Finally, the short-run functions provide the long-run function:

$$C(z,p) = \min_{x^f} \big( C(z,p^v;x^f) + (p^f)' \cdot x^f \big). \qquad \text{(B-11)}$$

Thus, estimating the short-run variable cost function provides estimates of the short-run marginal cost functions, the factor demand and factor share equations. It should be noted that the estimated share equations from the variable cost function will not be the same as the estimated share equations from the total cost function. In fact the following relationship holds:

$$\frac{\partial \log TC(z,p^v;x^f)}{\partial \log p_i^v} = S_i^v(z,p^v;x^f) \cdot \frac{C(z,p^v;x^f)}{TC(z,p^v;x^f)}, \qquad i = 1,\ldots,n_1.$$

Thus, caution must be used in interpreting the estimated equations.

Furthermore, if $p^f$ is known, the long-run cost function can be recovered by solving the optimization problem in (B-11) above. Thus by specifying a technology and a vector of prices, we can derive functional forms (in some cases explicitly, as will be shown below) that can be estimated.

B.2.2. Cost Functions and Implied Technology

In the previous section we defined cost functions for technologies that were convex in their input factors (e.g. for production functions with isoquants as depicted in Figure B-1a).
If the technologies are not convex, in the sense shown in Figure B-1b, then the cost function will be derived for the convexified technology. This is illustrated in Figure B-7.

This means that all the technologies that have the same convexification have the same cost function. This is really not a problem, as McFadden [33, Ch. I.1] points out, since it can be shown that if the firm is facing given input factor prices and minimizing costs, then the firm never would choose an input mix that would place it in the non-convex region, i.e. it would act as if it worked with the convexification anyway. Thus, while a number of technologies can give rise to the same cost function, the convexified technology is all we need care about, since the firm (if it obeys our assumptions on factor prices and cost minimization) would never be observed operating in the non-convex region anyway.

Now consider instead what information you could draw from a cost function C(z,p). If you were given the function and told that it came from a cost minimizing firm that faced fixed prices, you could form the following set:

$$V^*(z) = \{x \mid p'x \ge C(z,p) \text{ for all } p > 0\}.$$

Geometrically, $V^*(z)$ is illustrated in Figure B-8,

Figure B-8: CONSTRUCTING V*(z) (axes $x_1$, $x_2$; lines $p'x = C(z,p)$ for different values of p)

i.e. it is the set of points lying above all the straight lines. Thus, if we pick an output vector z and vary p and look at all the x that are northeast of the lines p'x, we have $V^*(z)$. There are an infinite number of such lines and the result is a curve that looks very much like an isoquant and the region to the northeast of it, as seen in Figure B-9 below.

Figure B-9

It can be shown that $V^*(z)$ is always convex, irrespective of the technology that gave rise to C(z,p). It also satisfies the properties that we require of a technology (see [81]). In fact we will call $V^*(z)$ the input requirements set (see footnotes 3 and 4) of our implied technology.
We now have the following very important duality results (see, e.g. [81]).

2) If the original technology is convex in its inputs then the implied technology will be identical to it.
3) If the original technology is not convex in its inputs then the implied technology will be identical with the convexification of the original technology.

Therefore, a properly constructed cost function will provide all the information of interest about a technology (if the firm obeys our assumptions on fixed factor prices and cost minimization).

Put more practically, we can estimate either a production (or transformation) function or a cost function and get what we want to know about the underlying technology. We can use a cost function to inform us about the following.

1) Homotheticity.
2) Homogeneity.
3) Returns-to-scale.
4) Separability.

We will consider these in turn.

B.2.2.1 Homotheticity

A very useful and interesting result concerning the structure of the cost function occurs if we let the production function f(x) be homothetic, i.e.

$$Z = f(x) = d(h(x))$$

where $d(\cdot)$ is a monotonic increasing continuous function and $h(\cdot)$ is H.D.1. It can be shown [73] that if f(x) satisfies the above, then there is an inverse function to d (which we'll call $s(\cdot)$) such that $h(x) = s(Z)$. Now s(Z) is simply a scalar, so we have the following result:

$$C(Z,p) = \min_x \{p'x \mid f(x) = Z\}$$
$$= \min_x \{p'x \mid h(x) = s(Z)\}$$
$$= \min_x \{p'x \mid h(x/s(Z)) = 1\}$$
$$= s(Z) \cdot \min_w \{p'w \mid h(w) = 1\}, \qquad w = x/s(Z)$$
$$= s(Z) \cdot \ell(p)$$

To explain: the first line is a statement of (CMP), the second line the result of the transformation discussed above. In the third line we capitalize upon h(x) being H.D.1 and s(Z) being a scalar. Thus $h(x) = s(Z)$ means $h(x/s(Z)) = 1$, i.e. if we divide every element of x by s(Z) then the output is 1. In the fourth line we change variables, letting the vector w be the vector x with every element divided by s(Z). To maintain the equality we must multiply by s(Z).
Finally in the fifth line we recognize that the minimum on line four will be purely a function of $p$ (since $w$ will be optimized out). Shephard has proved a more general version of the above (and its converse) and thus we have the following theorem.

$f(x)$ homothetic $\Longleftrightarrow$ $C(Z,p)$ is multiplicatively separable in $Z$ and $p$, i.e.:

$$C(Z,p) = s(Z) \cdot \ell(p).$$

Note that since $C(Z,p)$ must be H.D.1 in prices $p$ we know that $\ell(p)$ is H.D.1. Furthermore $s(0) = 0$, $s(Z) > 0$ if $Z > 0$, $s(Z)$ is continuous, etc. from the properties of the cost function.

This result can be extended to the transformation function case. Recall that Shephard assumed $T(z,x) = g(z) - f(x)$. In this case we have that $C(z,p) = g(z) \cdot \ell(p)$, i.e. again, $C(z,p)$ is multiplicatively separable if and only if the separable transformation function $T(z,x)$ is homothetic in $x$. Notice also that $g(z)$ is the aggregation function for the vector $z$.

If the transformation function is not separable then input homotheticity gains us somewhat less. McFadden [33, Ch. I.1] shows that in this case

$$C(z,p) = a(\|z\|,\, z/\|z\|)\; C(z/\|z\|,\, p)$$

where $a(\cdot,\cdot)$ is the scaling function discussed in section B.1.2.2 above and $\|\cdot\|$ is the norm function mentioned there also. What is important here is that tests for homotheticity that rely upon the multiplicative separability of $C(z,p)$ are actually testing separability and homotheticity together; a rejection may be a rejection of separability or homotheticity or both.

B.2.2.2 Homogeneity

Let $f(x)$ be H.D.k. Then in a manner similar to that in B.2.2.1 we see the following result.

$$\begin{aligned}
C(Z,p) &= \min_x \{p'x \mid f(x) = Z\} \\
&= \min_x \{p'x \mid f(x/Z^{1/k}) = 1\} \\
&= Z^{1/k} \min_w \{p'w \mid f(w) = 1\}, \qquad w = x/Z^{1/k} \\
&= Z^{1/k}\, \ell(p).
\end{aligned}$$

Thus, in particular, if $f(x)$ is H.D.1 then $C(Z,p) = Z\,\ell(p)$ and vice-versa. In other words if $C(Z,p)$ is linear in $Z$ then $f(x)$ is H.D.1. It should be noted that from the above result we have that if $f(x)$ is H.D.k then $C(Z,p)$ is H.D.(1/k) in $Z$.
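The homogeneity result can be checked numerically without using the closed form. The following is a minimal sketch under assumptions that are not in the report: a two-input production function $f(x) = (x_1 x_2)^{k/2}$, which is H.D.k, and a simple ternary search standing in for the cost minimization problem. The function names are hypothetical.

```python
import math


def min_cost(z, p, k, iters=200):
    """Numerically solve min p'x subject to (x1 * x2) ** (k / 2) = z.

    The constraint is equivalent to x1 * x2 = c with c = z ** (2 / k), so the
    problem is one-dimensional in u = log(x1). The outlay is convex in u,
    which makes a ternary search adequate for this illustration.
    """
    c = z ** (2.0 / k)

    def outlay(u):
        x1 = math.exp(u)
        return p[0] * x1 + p[1] * c / x1  # x2 = c / x1

    lo, hi = -20.0, 20.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if outlay(m1) < outlay(m2):
            hi = m2
        else:
            lo = m1
    return outlay(0.5 * (lo + hi))


def output_degree(z, p, k, lam=2.0):
    """Estimate r in C(lam * z, p) = lam ** r * C(z, p)."""
    ratio = min_cost(lam * z, p, k) / min_cost(z, p, k)
    return math.log(ratio) / math.log(lam)


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        print(k, output_degree(3.0, (1.5, 0.7), k))
```

For each assumed value of $k$ the estimated degree of homogeneity of cost in output should be close to $1/k$, in line with the result stated above.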
Extending this to the multiple output case provides motivation for the following definition: $C(z,p)$ is output homogeneous of degree $r$ (O.H.D.r) if

$$C(\lambda z, p) = \lambda^r C(z,p) \qquad \forall\, \lambda > 0.$$

It is with this definition in mind that we next consider economies of scale.

B.2.2.3 Economies of Scale

Baumol [2] has defined the notion of decreasing average ray cost for multiproduct firms. A firm has decreasing average ray costs if:

$$C(\lambda z, p) < \lambda C(z,p), \qquad \lambda > 1.$$

For example, if we consider the single output case we have:

$$C(\lambda Z, p) < \lambda C(Z,p), \quad \lambda > 1 \qquad \Downarrow \qquad \frac{C(\lambda Z, p)}{\lambda Z} < \frac{C(Z,p)}{Z}, \quad \lambda > 1$$

which simply is the condition of declining average costs (i.e. returns-to- (or economies of) scale). Thus decreasing average ray costs should be associated with returns-to-scale, and they are (see Baumol [2]).

Panzar and Willig [64] extend this notion to provide a measure of scale economies for multioutput firms. They show that the following measure captures returns-to-scale in production (see note 15):

$$S = \frac{C(z,p)}{\sum_i z_i \, \partial C(z,p)/\partial z_i}.$$

Notice that if $C(z,p)$ is O.H.D.1 then $S = 1$ since by Euler's Theorem the numerator equals the denominator. Rearranging terms yields the following:

$$S = 1 \Big/ \sum_i \frac{z_i}{C(z,p)} \frac{\partial C(z,p)}{\partial z_i} = 1 \Big/ \sum_i \frac{\partial \ln C(z,p)}{\partial \ln z_i}.$$

We shall see later that this function is particularly easy to calculate from a translog cost function. The measure above is the reciprocal of the sum of the elasticities of cost with respect to output. The more inelastic the cost function is to output, the greater the returns-to-scale.

B.2.2.4 Separability

Analysis of the separability of production and transformation functions can be performed via the cost function. First, considering homothetic production functions, Uzawa [80] has shown the following for the AES $\sigma_{ij}$ (see section 2.1.2.3):

$$\sigma_{ij} = \frac{C(Z,p)\, C_{ij}(Z,p)}{C_i(Z,p)\, C_j(Z,p)}$$

where $C_i(Z,p) = \partial C(Z,p)/\partial p_i$ and $C_{ij}(Z,p) = \partial^2 C(Z,p)/\partial p_i \partial p_j$. Thus estimating the cost function provides estimates of the AES (the subscripts denote partial derivatives with respect to price).
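The AES formula can be evaluated directly from any cost function by numerical differentiation. The sketch below is illustrative only: the two example cost functions (a Cobb-Douglas form with assumed exponents 0.3 and 0.7, and a Leontief form with assumed coefficients 2 and 5) and the function names are hypothetical, and central finite differences stand in for the analytic derivatives $C_i$ and $C_{ij}$.

```python
def aes(cost, z, p, i, j, rel_step=1e-4):
    """Allen elasticity sigma_ij = C * C_ij / (C_i * C_j), for i != j.

    Price derivatives are approximated by central finite differences, so the
    result is an approximation whose accuracy depends on rel_step.
    """
    def shifted(di, dj):
        q = list(p)
        q[i] += di
        q[j] += dj
        return cost(z, q)

    hi, hj = rel_step * p[i], rel_step * p[j]
    c = cost(z, p)
    c_i = (shifted(hi, 0.0) - shifted(-hi, 0.0)) / (2.0 * hi)
    c_j = (shifted(0.0, hj) - shifted(0.0, -hj)) / (2.0 * hj)
    c_ij = (shifted(hi, hj) - shifted(hi, -hj)
            - shifted(-hi, hj) + shifted(-hi, -hj)) / (4.0 * hi * hj)
    return c * c_ij / (c_i * c_j)


def cobb_douglas(z, p):
    """Assumed Cobb-Douglas cost function; its AES should be close to 1."""
    return z * p[0] ** 0.3 * p[1] ** 0.7


def leontief(z, p):
    """Assumed Leontief cost function; its AES should be close to 0."""
    return z * (2.0 * p[0] + 5.0 * p[1])


if __name__ == "__main__":
    print(aes(cobb_douglas, 10.0, [2.0, 3.0], 0, 1))
    print(aes(leontief, 10.0, [2.0, 3.0], 0, 1))
```

The two cases bracket the familiar extremes: unitary substitution for the Cobb-Douglas form and no substitution for the Leontief form.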
Berndt and Christensen [4] show that weak separability ($i,j \in N_r$, $k \notin N_r$; see section B.1.2.3) implies

$$\sigma_{ik} = \sigma_{jk} \qquad (i,j \in N_r,\ k \notin N_r)$$

which is true if and only if

$$C_j(Z,p)\, C_{ik}(Z,p) - C_i(Z,p)\, C_{jk}(Z,p) = 0.$$

These conditions provide for weak separability of the cost function. Further, Lau has shown that the cost function is weakly separable (strongly separable) with respect to a partition P in prices if and only if the production function is homothetically weakly separable (strongly separable) with respect to the partition P in inputs [33, Ch. I.3]. Again, this area is extensive; for further information see [6], [33, Ch. I.3], and [37] to name a few references.

B.2.3. Examples

As has been indicated above, there is a duality between production and cost: technology descriptions give rise to cost functions (when prices are incorporated) which give rise to implicit technologies. In this section we provide cost functions for some of the production functions in section B.1. Furthermore we discuss some of the flexible functional form cost functions: the Generalized Leontief, the Hall function and the translog.

B.2.3.1 Cobb-Douglas Production and Cost

The Cobb-Douglas production function of section B.1.3.2 gives rise to the following cost function:

$$C(Z,p) = v \Big(A \prod_{i=1}^{n} a_i^{a_i}\Big)^{-1/v} Z^{1/v} \prod_{i=1}^{n} p_i^{a_i/v}$$

where $v$ is the returns-to-scale in (B-4) in section B.1.3.2. Notice that $C(Z,p)$ could be estimated by taking logarithms:

$$\ln C(Z,p) = \alpha_0 + \sum_{i=1}^{n} \gamma_i \ln p_i + \beta \ln Z$$

where $\alpha_0 = \ln\big[v (A \prod_{i=1}^{n} a_i^{a_i})^{-1/v}\big]$, $\gamma_i = a_i/v$ and $\beta = 1/v$. For $C(Z,p)$ to be H.D.1 in prices we would require the constraint $\sum_i \gamma_i = 1$.

B.2.3.2 CES Production and Cost

Referring to section B.1.3.3, the dual cost function would be as follows:

$$C(Z,p) = \frac{Z}{A} \Big(\sum_{i=1}^{n} (p_i/a_i)^{1-\sigma}\Big)^{1/(1-\sigma)}$$

where $\sigma = 1/(1-\rho)$ is the elasticity of substitution. Notice that as $\sigma \to 0$ (the Leontief case) we get a cost function that is simply a weighted sum of prices times the output level, i.e. for Leontief production

$$C(Z,p) = \frac{Z}{A} \sum_{i=1}^{n} (p_i/a_i)$$

where the $a_i$ correspond to the $a_i$ of section B.1.3.1.

B.2.3.3 Flexible Functional Forms

The above cost functions are examples of what are known as self-dual technologies [46]. The coefficients of the production function appear in the cost function and vice versa, and the dual functions are members of the same family. This is not in general the case with the flexible functional forms. A transcendental logarithmic production function may not give rise to a translog cost function. The choice of which to use is thus a non-trivial one since it is possible that, for example, estimating a translog cost function and a translog production function could lead to different results (see Burgess [10]).

In what follows we will briefly describe three cost functions: the generalized Leontief (Diewert [22]), the generalized linear-generalized Leontief joint cost function (Hall [40]) and the translog (Christensen, Jorgenson and Lau [14]).

The generalized Leontief function resembles the production function of section B.1.3.4 above. It is as follows:

$$C(Z,p) = Z \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, p_i^{1/2} p_j^{1/2}, \qquad a_{ij} = a_{ji}.$$

Notice that the cost function represents a Leontief production function if $a_{ii} > 0$ and $a_{ij} = 0$ for $i \ne j$. This is the source of its name. While this function is a second order approximation to any technology, it only admits one output.

The Hall function is an extension of the Diewert function for multiple outputs. It is:

$$C(z,p) = \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{k=1}^{n} \sum_{l=1}^{n} a_{ijkl}\, (z_i z_j)^{1/2} (p_k p_l)^{1/2}.$$

Unfortunately, the Hall function assumes H.D.1 production and has a large number of parameters to estimate ($n^2 m^2$).

Finally the translog is as follows:

$$\begin{aligned}
\ln C(z,p) = {} & \alpha_0 + \sum_{i=1}^{m} \alpha_{i0} \ln z_i + \sum_{j=1}^{n} \beta_{j0} \ln p_j \\
& + \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_{ij} \ln z_i \ln z_j + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \beta_{ij} \ln p_i \ln p_j \\
& + \sum_{i=1}^{m} \sum_{j=1}^{n} \gamma_{ij} \ln z_i \ln p_j, \qquad \alpha_{ij} = \alpha_{ji},\ \beta_{ij} = \beta_{ji}.
\end{aligned}$$

Observe that if all the second order terms are zero, the translog reduces to the Cobb-Douglas.
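A small numerical sketch may make the relation between the translog and the Cobb-Douglas concrete. It assumes the multi-output translog in the form $\ln C = \alpha_0 + \sum_i \alpha_{i0} \ln z_i + \sum_j \beta_{j0} \ln p_j + \frac{1}{2}\sum\sum \alpha_{ij} \ln z_i \ln z_j + \frac{1}{2}\sum\sum \beta_{ij} \ln p_i \ln p_j + \sum\sum \gamma_{ij} \ln z_i \ln p_j$ with symmetric $\alpha_{ij}$ and $\beta_{ij}$, and it also evaluates the scale measure $S$ of section B.2.2.3 as the reciprocal of the sum of output cost elasticities. The function names and all parameter values are hypothetical.

```python
import math


def translog_log_cost(z, p, a0, a, b, A, B, G):
    """ln C(z, p) for a translog with first-order terms a, b and
    second-order matrices A (m x m), B (n x n) and G (m x n)."""
    lz = [math.log(v) for v in z]
    lp = [math.log(v) for v in p]
    m, n = len(lz), len(lp)
    total = a0
    total += sum(a[i] * lz[i] for i in range(m))
    total += sum(b[j] * lp[j] for j in range(n))
    total += 0.5 * sum(A[i][j] * lz[i] * lz[j]
                       for i in range(m) for j in range(m))
    total += 0.5 * sum(B[i][j] * lp[i] * lp[j]
                       for i in range(n) for j in range(n))
    total += sum(G[i][j] * lz[i] * lp[j]
                 for i in range(m) for j in range(n))
    return total


def scale_economies(z, p, a, A, G):
    """S = 1 / sum_i d ln C / d ln z_i, assuming A is symmetric."""
    lz = [math.log(v) for v in z]
    lp = [math.log(v) for v in p]
    m, n = len(lz), len(lp)
    elasticities = [
        a[i]
        + sum(A[i][j] * lz[j] for j in range(m))
        + sum(G[i][j] * lp[j] for j in range(n))
        for i in range(m)
    ]
    return 1.0 / sum(elasticities)


def zeros(rows, cols):
    """Helper: a rows x cols matrix of zeros as nested lists."""
    return [[0.0] * cols for _ in range(rows)]


if __name__ == "__main__":
    z, p = (2.0, 3.0), (1.5, 0.5, 4.0)
    a0, a, b = 0.1, (0.4, 0.3), (0.2, 0.3, 0.5)
    # With all second-order terms zero the translog is a Cobb-Douglas form.
    print(math.exp(translog_log_cost(z, p, a0, a, b,
                                     zeros(2, 2), zeros(3, 3), zeros(2, 3))))
    print(scale_economies(z, p, a, zeros(2, 2), zeros(2, 3)))
```

With the second-order matrices set to zero, the cost equals $e^{\alpha_0} \prod_i z_i^{\alpha_{i0}} \prod_j p_j^{\beta_{j0}}$ and $S$ is constant at $1/\sum_i \alpha_{i0}$; with nonzero second-order terms, $S$ varies with the output and price levels.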
In fact, one could view the Cobb-Douglas as a first order approximation to a function (cost or production) and the translog as a second order approximation, both in logarithms.

A word is in order on factor demand equations. In general, most studies that estimate a cost function estimate it simultaneously with the factor demand equations. This provides increased efficiency in the estimation process. The factor demand equations for the Diewert form are quite simple:

$$x_i(Z,p) = Z \sum_{j=1}^{n} a_{ij}\, (p_j/p_i)^{1/2}.$$

This is similarly true for the Hall function. On the other hand, the factor demand equations are not simple for the translog: they are non-linear in the parameters to be estimated. However, because the translog is expressed in logarithms the factor share equations are linear:

$$S_i(z,p) = \beta_{i0} + \sum_{j=1}^{n} \beta_{ij} \ln p_j + \sum_{j=1}^{m} \gamma_{ji} \ln z_j.$$

Therefore in estimating the translog cost function we can append $n-1$ factor share equations (since the cost function makes the $n$th equation).

In summary the principal advantages and disadvantages of the three flexible forms are as follows.

Generalized Leontief (Diewert [22]):
Advantages:
1) Second order approximation to a cost function;
2) Low number of parameters to be estimated;
3) Convenient form for cost function and factor demand equations;
4) Allows easy test of non-substitution (Leontief) case;
5) Zero level of variables allowed.
Disadvantages:
1) Assumes homogeneous production;
2) Assumes single output;
3) It is separability-inflexible (see below).

Generalized Linear-Generalized Leontief (Hall [40]):
Advantages:
1) Second order approximation;
2) Multiple outputs;
3) Convenient form;
4) Tests input/output separability.
Disadvantages:
1) Assumes constant returns-to-scale;
2) Large number of parameters to be estimated.

Translog (Christensen, Jorgenson and Lau [14]):
Advantages:
1) Second order approximation;
2) Reduces to popular forms (e.g. Cobb-Douglas, CES as limiting case);
3) Reasonable number of parameters;
4) Convenient form for estimating economies of scale;
5) Multiple outputs allowed.
Disadvantages:
1) Zero levels of variables not allowed (due to logarithms);
2) Factor demand equations non-linear in parameters (though factor share equations are not);
3) It is separability-inflexible.

Separability-inflexibility for the translog (see [6] and [19]) and the Diewert [6] means that imposing separability restrictions implicitly imposes significant structure on the aggregation functions themselves. Thus, a test of separability tests not only the separability of different variables but also a specific structural form for the aggregation functions. Rejection of the test may therefore only reflect rejection of the forms, not of the separability itself.

B.2.3.4 An Example of Finding Long-Run and Short-Run Cost Functions

To help tie together some of the notions discussed in the sections above, this section presents a long- and short-run cost model. Assume a firm uses capital ($x_1$), labor ($x_2$) and fuel ($x_3$) to produce a single output ($Z$) following a Cobb-Douglas production function:

$$Z = A\, x_1^{a_1} x_2^{a_2} x_3^{a_3}.$$

The long-run cost function $C(Z,p)$ with $p = (p_1,p_2,p_3)'$ is found by solving (CMP):

$$\text{(CMP)} \qquad \text{minimize } p_1 x_1 + p_2 x_2 + p_3 x_3 \quad \text{subject to } A\, x_1^{a_1} x_2^{a_2} x_3^{a_3} = Z$$

which yields

$$C(Z,p) = \alpha_0\, Z^{1/v}\, p_1^{a_1/v} p_2^{a_2/v} p_3^{a_3/v}$$

where $v = a_1 + a_2 + a_3$ and $\alpha_0 = v\,(A\, a_1^{a_1} a_2^{a_2} a_3^{a_3})^{-1/v}$. Thus the factor demand equation for labor ($x_2$) is:

$$x_2(Z,p) = \frac{a_2}{v} \frac{C(Z,p)}{p_2}$$

which means that the factor share equation for labor is

$$s_2(Z,p) = \frac{p_2\, x_2(Z,p)}{C(Z,p)} = \frac{a_2}{v}$$

which could also have been found by taking logarithms of the cost function and then computing $\partial \log C(Z,p)/\partial \log p_2$.

The short-run cost function is found by fixing one or more of the variables. If we fix capital ($x_1$) at a given level $\bar{x}_1$ then (SRVCMP) becomes

$$\text{(SRVCMP)} \qquad \min\ p_2 x_2 + p_3 x_3 \quad \text{s.t. } A\, \bar{x}_1^{a_1} x_2^{a_2} x_3^{a_3} = Z$$

which yields the short-run variable cost function:

$$C(Z,p^v;\bar{x}_1) = \beta_0\, Z^{1/u}\, p_2^{a_2/u} p_3^{a_3/u}\, \bar{x}_1^{-a_1/u}$$

where

$$p^v = (p_2,p_3)', \qquad \beta_0 = u\,(A\, a_2^{a_2} a_3^{a_3})^{-1/u}, \qquad u = a_2 + a_3.$$

It can be readily shown that we can derive the long-run cost function:

$$C(Z,p) = \min_{x_1} \big( C(Z,p^v;x_1) + p_1 x_1 \big).$$

The short-run factor demand function is:

$$x_i(Z,p^v;\bar{x}_1) = \frac{a_i}{u} \frac{C(Z,p^v;\bar{x}_1)}{p_i}, \qquad i = 2,3.$$

This means that the factor share function is as follows:

$$S_i(Z,p^v;\bar{x}_1) = \frac{a_i}{u}.$$

Calculating returns-to-scale on the long-run function yields the following:

$$S = \frac{C(Z,p)}{Z\, \partial C(Z,p)/\partial Z} = \frac{C(Z,p)}{\frac{1}{v} C(Z,p)} = v$$

which is the returns-to-scale parameter (see section B.1.3.2).

The system of equations to be estimated for the short-run variable cost function is as follows:

$$\begin{aligned}
\log C(Z,p^v;\bar{x}_1) &= \gamma_0 + \gamma_1 \log Z + \gamma_2 \log p_2 + \gamma_3 \log p_3 + \gamma_4 \log \bar{x}_1 + \varepsilon_1 \\
\frac{p_2 x_2}{C(Z,p^v;\bar{x}_1)} &= \gamma_2 + \varepsilon_2
\end{aligned}$$

where $\gamma_0, \gamma_1, \ldots, \gamma_4$ are to be estimated, the $\varepsilon_i$ are error terms, and we have used the factor share equation for labor since there are only two factor share equations (labor and fuel, i.e. $n = 2$) and thus $n - 1 = 1$. The above system can be estimated as a seemingly unrelated equations system [79]. To enforce H.D.1 in $p^v$ we would add the condition that $\gamma_2 + \gamma_3 = 1$. Since $1/\hat{\gamma}_1$ will estimate $u$, we could recover $\hat{a}_2$ and $\hat{a}_3$ from $\hat{\gamma}_2$ and $\hat{\gamma}_3$ (a hat means estimated value).

B.2.4. The General Form of the Short-Run Variable Cost Function to be Estimated and the Associated Conditions to be Tested

In this report we present results of estimating a short-run variable cost function (see chapter 4). In this section we will provide the basic form for the cost model which will be made more specific later. We will also present the conditions on the model for homogeneity in prices, joint separability and homotheticity, homogeneity of degree k and unitary elasticities of substitution in inputs.

B.2.4.1 The Translog Short-Run Variable Cost Model

We have chosen to use the translog model for the reasons described above in section B.2.3.3.
The model is as follows ($\ln$ means natural logarithm):

$$\begin{aligned}
\ln C(z,p^v;x^f) = {} & \alpha_0 + \sum_{i=1}^{m} \alpha_{i0} \ln z_i + \sum_{i=1}^{n_1} \beta_{i0} \ln p_i^v + \sum_{i=1}^{n-n_1} \gamma_{i0} \ln x_i^f \\
& + \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_{ij} \ln z_i \ln z_j + \frac{1}{2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_1} \beta_{ij} \ln p_i^v \ln p_j^v \\
& + \frac{1}{2} \sum_{i=1}^{n-n_1} \sum_{j=1}^{n-n_1} \gamma_{ij} \ln x_i^f \ln x_j^f + \sum_{i=1}^{n_1} \sum_{j=1}^{m} \delta_{ij}^{pz} \ln p_i^v \ln z_j \\
& + \sum_{i=1}^{m} \sum_{j=1}^{n-n_1} \delta_{ij}^{zx} \ln z_i \ln x_j^f + \sum_{i=1}^{n_1} \sum_{j=1}^{n-n_1} \delta_{ij}^{px} \ln p_i^v \ln x_j^f
\end{aligned}$$

with $\alpha_{ij} = \alpha_{ji}$, $\beta_{ij} = \beta_{ji}$ and $\gamma_{ij} = \gamma_{ji}$. This results in a total of

$$\frac{m(m+3)}{2} + \frac{n(n+3)}{2} + mn + 1$$

parameters to be estimated.

The factor share equation for the $i$th variable factor is as follows:

$$S_i(z,p^v;x^f) = \beta_{i0} + \sum_{j=1}^{n_1} \beta_{ij} \ln p_j^v + \sum_{j=1}^{m} \delta_{ij}^{pz} \ln z_j + \sum_{j=1}^{n-n_1} \delta_{ij}^{px} \ln x_j^f.$$

B.2.4.2 Constraints

1) Homogeneity of degree 1 in $p^v$:

(a) $\sum_{i=1}^{n_1} \beta_{i0} = 1$
(b) $\sum_{i=1}^{n_1} \beta_{ij} = 0, \quad j = 1,\ldots,n_1$
(c) $\sum_{i=1}^{n_1} \delta_{ij}^{pz} = 0, \quad j = 1,\ldots,m$
(d) $\sum_{i=1}^{n_1} \delta_{ij}^{px} = 0, \quad j = 1,\ldots,n-n_1$

2) Separability of $T(z,x)$ and Input Homotheticity:

$\delta_{ij}^{pz} = \delta_{ij}^{zx} = 0 \quad \forall\, i,j$

3) Homogeneity of degree k (k given) in production ($\lambda^k Z = f(\lambda x)$) and almost homogeneity of degree (k,1) ($T(\lambda^k z, \lambda x) = 0$):

(a) $k \sum_{i=1}^{m} \alpha_{i0} + \sum_{j=1}^{n-n_1} \gamma_{j0} = 1$
(b) $k \sum_{j=1}^{m} \alpha_{ij} + \sum_{j=1}^{n-n_1} \delta_{ij}^{zx} = 0, \quad i = 1,\ldots,m$
(c) $\sum_{i=1}^{n-n_1} \gamma_{ij} + k \sum_{i=1}^{m} \delta_{ij}^{zx} = 0, \quad j = 1,\ldots,n-n_1$
(d) $k \sum_{j=1}^{m} \delta_{ij}^{pz} + \sum_{j=1}^{n-n_1} \delta_{ij}^{px} = 0, \quad i = 1,\ldots,n_1$

4) Cobb-Douglas Production (i.e. unitary elasticities of input substitution):

$\alpha_{ij} = \beta_{ij} = \gamma_{ij} = \delta_{ij}^{zx} = \delta_{ij}^{pz} = \delta_{ij}^{px} = 0 \quad \forall\, i,j$

The above are necessary and sufficient. Conditions similar to (1), (2) and (3) are discussed in detail in [75].

The standard procedure for testing such restrictions would be the following (see, e.g. [79]).

1) Estimate the unrestricted model and form the estimated covariance matrix, $\hat{\Omega}_u$.

2) Estimate the model subject to some restrictions $R$ and form the estimated covariance matrix $\hat{\Omega}_R$.

3) If the estimation procedure is a maximum likelihood procedure then form the following log likelihood ratio:

$$Q \ln \frac{|\hat{\Omega}_R|}{|\hat{\Omega}_u|}$$

where $Q$ is the number of observations. This statistic is asymptotically $\chi^2$ distributed with degrees of freedom equal to the number of independent restrictions in $R$ ($|\cdot|$ means determinant). A value of the statistic greater than a critical value, pre-set for the chosen level of Type I error, means rejection of the hypothesis implicit in the restrictions.

Notes for Chapter 2

1. Vectors are lower case letters, elements of vectors are lower case letters with subscripts, individual scalars (such as aggregate total output) are upper case letters (unless otherwise stated). Sets and matrices will also be upper case letters. All vectors are column vectors unless otherwise noted. A prime on a vector or matrix denotes transpose; superscripts on vectors are used to refer to different vectors.

2. Weaker properties are possible; see [73].

3. The level sets are sometimes referred to as input requirement sets: $V(Z) = \{x \mid f(x) \ge Z\}$ (see, e.g. [33, Ch. I.1], [81]).

4. There are a number of important related concepts in the literature. Let $V(z) = \{x \mid T(z,x) \le 0\}$, i.e. $x$ can produce $z$ as represented by $T(z,x) \le 0$. Thus $V(z)$ is the input requirements set analogous to note 3 above. The distance function $D(z,x)$ is

$$D(z,x) = \max \{\lambda > 0 \mid \tfrac{1}{\lambda} x \in V(z)\}.$$

$D(z,x)$ is the amount by which a vector of inputs must be scaled down so as to just produce $z$. Thus, efficient production occurs when $D(z,x) = 1$ and therefore the relation between $D(z,x)$ and $T(z,x)$ is seen to be the following identity [40]:

$$T\Big(z, \frac{1}{D(z,x)}\, x\Big) \equiv 0.$$

The concept of a distance function is also used in [73], [33, Ch. I.1], [33, Ch. II.1]; it first appeared in [73]. While the distance function is now becoming a more standard way of representing multiple output/multiple input production, we will continue to employ $T(z,x)$ so as to readily address separability issues (see [40]).

5. $\nabla$ means gradient, i.e. $\nabla g(x) = (\partial g/\partial x_1, \ldots, \partial g/\partial x_n)'$. $\nabla_v$ means gradient only with respect to the variables in the subscript. $\nabla^2$ means the Hessian matrix of second derivatives, i.e.:

$$\nabla^2 g(x) = \begin{bmatrix} \partial^2 g(x)/\partial x_1^2 & \cdots & \partial^2 g(x)/\partial x_1 \partial x_n \\ \vdots & & \vdots \\ \partial^2 g(x)/\partial x_n \partial x_1 & \cdots & \partial^2 g(x)/\partial x_n^2 \end{bmatrix}.$$

Finally the bordered Hessian is denoted $\nabla^{2B} g(x)$:

$$\nabla^{2B} g(x) = \begin{bmatrix} 0 & \nabla g(x)' \\ \nabla g(x) & \nabla^2 g(x) \end{bmatrix}$$

which is an $(n+1) \times (n+1)$ matrix.

6. A slightly weaker condition that would allow making some outputs from other outputs is possible, but of limited interest in the current context. When this is of interest, one could respecify $T(z,x)$ as an explicit production function in terms of one of the outputs (see [33, Ch. II.1]) or define a function in terms of a non-producible input (see [33, Ch. I.3]).

7. If $u = (u_1,\ldots,u_n)'$ and $v = (v_1,\ldots,v_n)'$ then $u \cdot v = \sum_i u_i v_i$ is called the dot (or inner) product of $u$ and $v$. We shall employ the shorthand throughout the report.

8. Here homotheticity is, in fact, a slightly weakened version of the above. See [33, Ch. I.3], [33, Ch. III.3].

9. Here again, for convenience, we have taken slightly stronger properties than are in Shephard (see [73, p. 255]).

10. The capital Greek letter pi ($\Pi$) represents product, just as $\Sigma$ represents sum. Thus $\prod_{i=1}^{n} i = 1 \cdot 2 \cdot 3 \cdots n$.

11. This function is H.D.1. By taking a monotonic transformation $h(u) = u^k$ of $f$ we could have $Z = h(f(x))$ which would be H.D.k, allowing for increasing or decreasing returns.

12. $\ln$ means logarithm to the base $e$ (Naperian logarithms). It should be recalled that
$\log(ab) = \log a + \log b$
$\log(a/b) = \log a - \log b$
$\log a^b = b \log a$

13. In general, symmetry is enforced for the $a_{ij}$, i.e. $a_{ij} = a_{ji}$.

14. McFadden [33, Ch. I.1] considers weaker conditions, i.e. some prices that are zero (free factors).

15. The measure used here is referred to as $\hat{S}$ in [64].

U.S.
GOVERNMENT PRINTING OFFICE: 1980 622-006/1075 1-3