Research Method and Procedure in Agricultural Economics

A Publication of the Advisory Committee on Economic and Social Research in Agriculture of the Social Science Research Council

VOLUME ONE
AUGUST 1928

The material in this volume was prepared under the direction of the Advisory Committee on Economic and Social Research in Agriculture of the Social Science Research Council. The constituent organizations of the Social Science Research Council are as follows:

American Economic Association
American Political Science Association
American Sociological Society
American Statistical Association
American Psychological Association
American Anthropological Association
American Historical Association

The members of the Advisory Committee on Social and Economic Research in Agriculture are:

H. C. Taylor, Northwestern University, Chairman
John D. Black, Harvard University
Joseph S. Davis, Food Research Institute
C. J. Galpin, Bureau of Agricultural Economics
L. C. Gray, Bureau of Agricultural Economics
E. G. Nourse, Institute of Economics
G. F. Warren, Cornell University

PREFACE

The special sub-committee which presents this body of material of research methods in agricultural economics is supremely conscious of its shortcomings. Except for two months devoted to it by the chairman of the sub-committee, it has all been contributed out of the busy lives of those whose names you will find attached to the various sections of it, and a considerable list of others who have helped in ways less easy to acknowledge.

There are important gaps in it. In some cases these have resulted because those assigned certain sections were unable to finish them in time. Smaller omissions have resulted because the members of the sub-committee did not think of them early enough. A good many other gaps have been filled by the members of the sub-committee.

Limitations of space have compelled almost ruthless condensation of many of the contributions. For example, the section on "The Measurement of Time Series Movements" was condensed from 150 pages of manuscript submitted. The equivalent of 120 pages of mimeographed material and a score of charts were deleted in a second cutting in order to keep within the publication budget.

The practice is followed of crediting each contributor in the text or in a footnote. This does not mean that he is altogether responsible for what follows, because the sub-committee has made occasional changes in the interest of consistency. Nor does the sub-committee wish to be held responsible for all of it, since it has in places accepted statements by contributors which it would not accept as its own. The sections which are not credited to any one person are mostly the work of the chairman.

If this publication were the product of the sub-committee, its members would wish to express their appreciation of the splendid assistance which they have had from a large group of co-workers. Inasmuch as it has been a cooperative venture, a more appropriate statement is in terms of the pleasure and benefit which they have derived from the joint effort.

John D. Black, Chairman
L. C. Gray
E. G. Nourse
H. R. Tolley

August 13, 1928.

TABLE OF CONTENTS

Introduction
  What Is Research?
  What Is Science?
  Scientific Method
  Qualitative vs. Quantitative
  Inductive vs. Deductive
  The Scientific Methods
  Social Science Research
  Economics vs. Social Science Research
Part I. Research Procedure
  Choice of Project
  Planning the Project
  Execution of the Project
Part II. Methods
  A. Statistical Method
    (a) What Is Statistical Method?
    (b) Units and Measures - Definition of Terms
    (c) Errors in Data
    (d) Sampling
    (e) Sources of Data and Methods of Securing It
      (1) The Survey Method
      (2) Data of Private and Public Records
      (3) Supervised Records
      (4) Mailed Questionnaires
      (5) Data from Reporters
      (6) Secondary Data
    (f) Preparing Data for Analysis
      (1) Editing
      (2) Checking for Accuracy
      (3) Tabulating
    (g) Analysis of Data
      (1) Summarizing
        Frequency Distribution
        Averages
        Dispersions
      (2) Measurement of Time-Series Movements
        Types of Movements
        Secular Trend
        Seasonal Variations
        Cyclical Movements
        Periodicity
        Lag

RESEARCH METHOD AND PROCEDURE IN AGRICULTURAL ECONOMICS

INTRODUCTION

The objective of the Committee on Social and Economic Research in Agriculture in preparing this volume is to summarize as much as possible of the experience of agricultural economists in planning, organizing and executing research in this field. This handbook is not intended to be a textbook on research method. Formal research method as such is much more completely developed in special treatises devoted to it. But the application of formal method to any problem brings to light numerous details and special considerations. It is these latter which constitute the material of this handbook.

The need for research in agricultural economics needs little emphasis here. From the beginning, teachers in this field have felt keenly the need of more facts as to the economic aspects of farming and marketing activities, and there has been a growing realization of the need for understanding the relationships involved. Economic facts are transitory.
Students who are taught agricultural economics in terms of specific rule-of-thumb practices or economic set-ups may find themselves out-of-date in a few years. What they need is training in economic analysis and in the use of data, especially the data of changing economic conditions.

That extension workers have come to realize the importance of economic research needs no ampler testimony than that which follows, taken from a report made on April 11 at the Joint Meeting of Directors of Extension with officials of the Bureau of Agricultural Economics and the Extension Service of the U. S. Department of Agriculture:

"There is need for a strong research program in economics in the respective states.

"The Bureau of Agricultural Economics of the Department of Agriculture can render valuable service by studying, upon request, the research programs and teaching curricula in the respective states and assisting in coordinating and broadening the work in this field.

"Inasmuch as the last session of the Land Grant College Association was devoted to a consideration of the agricultural problem, its causes and remedies, it is recommended that the session this year deal with strengthening the research, teaching and extension programs in the economic phases of agriculture, especially as referred to in this report."

The recent protracted struggle for farm relief legislation has thus far had no more positive result than making the leaders in this movement and in national affairs realize that better and more complete data adequately analyzed is basic to any program of orderly production or orderly marketing, and to the success of any administrative device set up to help agriculture with its manifold problems of production, land utilization, marketing, credit, insurance, transportation and taxation.
The passage of the Purnell Act is itself testimony to this effect. The ninth recommendation of the Business Men's Commission on Agriculture is as follows:

"The Commission strongly urges the extension and coordination of research work in the field of agriculture by the Federal and state governments and other agencies and the appropriation of larger funds for such work. Extensive research is needed to supply the basis of a comprehensive land utilization policy, for the elimination of plant pests and diseases, for the development of new types of agricultural products and of new uses for existing products, as well as concerning the possibilities of the application of industrial methods and business organization to agriculture, all of which agriculture itself is far less well equipped to further than are other fields of economic activity."

Finally, the Budget Director of the U. S. Government has come to the point of including in his category of "necessary expenditures" some additional appropriations for the U. S. Bureau of Agricultural Economics.

With all this growing emphasis upon research, no argument should be necessary as to the importance of research method. But the whole weight of circumstances is against giving sufficient attention to methodology. The administrative authorities want immediate results. The research workers themselves look anxiously forward to specific results. If the results accord with expectations, the method is likely to be accepted as sound without further testing. New projects outlined take over methods already in use without seriously questioning their validity. The majority of economists and even of research workers themselves are impatient with discussions of research. "The problem suggests the appropriate method", is their answer to questions asked on this subject; or "Good research workers are born not trained"; or "Good sense is a sufficient guide".
But many signs point to a change of attitude toward research method. Directors of state experiment stations have recently become more critical of research methods in agricultural economics. They have too often seen the old methods produce doubtful or inconsequential conclusions. As long as research in agricultural economics was an insignificant part of their program, they left it largely in the hands of their economists. Now that the passage of the Purnell Act has made it the new direction of growth of their activities, many of them have for the time given the major part of their thinking to it. These men were mostly trained in the natural sciences. They bring to the analysis of research methods in economics the point of view and the standards of the natural sciences; and their reactions have constituted a challenge to research in this field which has been and which will continue to be of tremendous value to it. No less valuable is the service of similar nature rendered by Dr. Allen of the Office of Experiment Stations. His analysis of the new research projects submitted under the Purnell Act has been a strong influence in the right direction.

The economists themselves have had more critics of research methods within their own ranks. The result has been that any method proposed for a problem is likely to be challenged today; under these circumstances the proponent of it feels called upon to defend it. This starts him to thinking vigorously about it. In this way, routine procedures in research have been broken down and an era of growth has set in. Projects are being set up increasingly in which development of method is either the sole or an important objective. A contribution to method, it is coming to be realized, is likely to have many times the usefulness of a contribution to the analysis of a problem, particularly of a problem in a given area. Many of the younger workers are bringing fresh points of view to the departments to which they come.
Entirely aside from questions of research method themselves are those relating to the organization and administration of research. Certain large aspects of the organization of research in agricultural economics have recently become important, such as the relationships between the research work of public agencies and institutions of research, between that of publicly supported and privately endowed institutions, that of state experiment stations and the U. S. Department of Agriculture, and that of state experiment stations and state departments of agriculture and markets. All this points to the need of the coordination of research in agricultural economics into something approximating a national program. The final section of this handbook will be devoted to matters of the latter sort.

What Is Research?

At the outset, we shall do well to have as clear a notion as possible as to what research is and the nature of scientific method. It is doubtful if we can set up a final criterion by which we can definitely classify certain things as research and others as not. All will agree that mere fact-finding, or mere census taking, or mere making of records, is not research. But clearly the securing of accurate data or records as in a survey is part of the procedure of research in the social sciences. It is this whether in the organization of research one person or agency gathers the data or source material and also analyzes the data, or under a system of division of labor, one party or agency collects the data and another analyzes it. To be sure, a project which has for its sole object and final end the assembling of the data cannot of itself be called a research project; but if it can properly be said that the data have been collected in such a way, and in the expectation, that they will presently be analyzed by others, or used in related research projects, then such a project can surely be considered a part of a research program.
It is pertinent to remark, however, that when the fact-gathering and analysis are not definitely tied together in one project, or are done by different agencies far removed from each other, then fact-gathering is in great danger of being mechanical or routine.

Analysis itself is of different degrees of research value. The mere computing of averages, for example, has little research significance. Analysis which looks to mere description of an area, or group of marketing agencies, or an historical period, is frequently research of a low order if at all. If the description is so developed as to show relationships between variations as well as the variations themselves, or similarly, relationships between the different changes occurring during a period, then it becomes research of a much higher order. If in addition, these relationships are so analyzed that they can be used as a basis for inferences of more general application, the research value of the research project is still more apparent.

A more difficult question is whether an analysis that looks to providing facts and conclusions needed for a program of amelioration is properly called research - let us say, a study of farm organizations in an area looking forward to an extension program based on the conclusions reached. Undoubtedly such projects stand as research in the minds of some experiment station directors. They have as much right to be classed as research as projects developing methods of control of insect pests or rations for fattening steers. But the devotee of "pure" science is likely to look askance at such ventures and ask if there are no general principles relating to combinations of enterprises the understanding of which cannot be advanced a little by such a project. He expects an experiment in feeding to reveal some relationships of general or more far reaching nature between amounts and proportions of different feeds and gains in weight.
There is undoubted validity in this point of view; but it may be carried altogether too far. Results of most economic analysis in general have a more local and more temporary application than those in the applied natural sciences. This must not be taken as evidence against their essential research nature. It is the inevitable consequence of the essential characteristic of social science research, namely, many interrelated factors affecting the result. Always a good many of these factors are localized or placed in time. This does not mean that there are no factors that extend over large areas, or continue from period to period; but rather that the task of isolating and measuring them is made more difficult. Indeed, the whole task of research is made the more difficult, especially that part of it involving inference.

On the other hand, a project designed to serve a purely local or a purely temporary need, without contributing anything of more general or permanent value, especially the latter, ordinarily can have little scientific value. To call it research would be somewhat akin to calling research the task of the social worker in locating a starving family in order that it may be properly fed, or locating a supply of apples to overcome a temporary shortage in a market.

The point of view of the scientist himself on this question has been as well expressed by Karl Pearson as anybody:

"I want the reader to appreciate clearly that science justifies itself in its methods, quite apart from any serviceable knowledge it may convey. We are too apt to forget this purely educational side of science in the great value of its practical applications. We see too often the plea raised for science that it is useful knowledge, while philology and philosophy are supposed to have small utilitarian or commercial value.
Science, indeed, often teaches us facts of primary importance for practical life; yet not on this account, but because it leads us to classifications and systems independent of the individual thinker, to consequences and laws admitting of no play-room for individual fancy, must we rate the training of science and its social value higher than those of philology and philosophy. Herein lies the first, but of course not the sole, ground for the popularization of science. That form of popular science which merely recites the results of investigations, which merely communicates useful knowledge, is from this standpoint bad science, or no science at all."*

The characteristic of economic research which the natural scientists have most difficulty in evaluating is its dynamic aspect. They are likely to say of a project that if the facts and relationships it discloses are not likely to be true a few years from now, why discover them? This would be a sufficient answer, looking at the matter from a purely research point of view, if such facts and relationships were to be obtained only once and at this time. But if this is the beginning of a series of such analyses made annually or at other stated intervals, looking to the observation of trends and discovering of relationships between variables in time sequence, then the answer would be altogether wrong. The close analysis of year-to-year and month-to-month changes in potato prices from now on is research of the same order as analysis of the factors affecting the price of potatoes in years past. Research basic to outlook analysis can be research of a high order if it is planned so as to provide the basis for a continuing analysis of economic changes. Economic analysis of any problem is never done.
Setting up adequate machinery for collecting data of current changes, as of land values or shifts in crops, so as to make possible continued testing of hypotheses, may properly be considered a part of the research project itself the same as the collecting of survey data. Of course the purpose for which the data are collected is important. If it is merely to enable someone to predict the size of the coming lamb crop or its price, and to tell the farmer what to do under the circumstances, then the collecting of data is a part of service work and not research.

It is in the sphere of dynamics that economic science has been most slow to develop. The laws and principles of economic change offer a rich field for the social scientist. The close study of current phenomena probably has more to contribute to progress along this line than the analysis of the records of the past. Developing the technique of observation and measurement of current changes is one of the important research needs of the day.

* Grammar of Science, p. 9.

What Is Science?

We shall have a better comprehension of what constitutes research if we consider for a moment the nature of science itself and scientific method. Karl Pearson says in his Grammar of Science that "all knowledge is concise description". He is here thinking of the knowledge which constitutes science, not of the isolated fact. Scientific knowledge consists of statements, in unmistakable language, of the uniformities and regularities in the phenomena of the universe. Once such a statement of a uniformity or regularity is made, it largely saves the necessity of special examination of the pertinent aspects of individual cases that arise. It becomes possible to reason from these general statements to statements as to the particular case. This is the economy of science which has made human progress possible.
"In the struggle for existence," says Pearson, "man has won his dictatorship over other forms of life by his power of foreseeing the effects which flow from antecedent causes - not only by his memory of perceptions ---------- science is an intellectual resume of past experience, and mental balancing of the probability of future experience."* One of the principal reasons that only probabilities can be stated is that the new causes are never the same as the old causes. To quote from Pearson further:

"Everything in the universe occurs but once; there is no absolute sameness of repetition. Individual phenomena can only be classified, and our problem turns on how far a group or class of like, but not absolutely same, things which we term "causes", will be accompanied by or followed by another group or class of like, but not absolutely same, things which we term "effects" ------ If the "causes" have such and such a degree of likeness, how like will the "effects" be?"**

* Op. cit., p. 113.
** Op. cit., p. 157.

"Existences are individual; it is a human, a rational process which for economy of thought classifies them. Any variation within the existences in one class is bound to be associated with a corresponding variation among the existences in a second class. Science has to measure the degree of stringency, or of looseness, in these concomitant variations. Absolute independence is the conceptual limit at one end to the looseness of the link; absolute dependence is the conceptual limit at the other end to the stringency of the link. The old view of cause and effect tried to subsume the universe under these two conceptual limits to experience -- and it could only fail: things are not in our experience either independent or causative. All classes of phenomena are linked together, and the problem in each case is how close is the degree of association. Likeness of causes produces likeness of effects; we can measure the degree of likeness, whether we are dealing with a chemical reaction or with the resemblance in any aptitude between parent and child. There is no question of absolute sameness in either case; there is a wide degree of difference in the likeness, but both problems are only variants of one and the same logical problem -- the contingency problem at the basis of modern science."*

"No phenomena are causal; all phenomena are contingent, and the problem before us is to measure the degree of this contingency, which we have seen lies between the zero of independence and the unity of causation. That is briefly the wider outlook we must now take of the universe as we experience it."**

While there is no such thing as absolute causation in the universe, but only probability, nevertheless there is such a large amount of uniformity and regularity that it becomes possible to build up a large structure of laws and principles and thus introduce some very great economies of intellectual effort in human affairs. Some attempts at economies of this sort prove to be false economy. The statements of regularities or uniformities first take rank as hypotheses or theories. Eventually many are accorded the ranking of laws or principles. There is no infallible guide as to when a proposition is entitled to be generally accepted. The test of validity is acceptance by the great preponderance of "normally constituted and duly instructed minds."*** There can be no test outside of the human mind.

While there is no necessity in the world of experience - even the sun may conceivably not rise tomorrow - there is in the world of logic and concepts, if we accept the canons of Aristotelian logic. If we state our assumptions and premises in a certain way, or set up certain concepts, there is no escaping the conclusions. But the conclusions are only as true as the premises or the concepts. Sound logic is as fundamental a part of good science as are accurate data and facts. "The word science", says Pearson, "applies to all reasoning about facts which proceeds from their accurate classification, to the appreciation of their relationship and sequence."****

It will be clear from the foregoing that classification is a highly important part of science. It is a necessary condition to the attainment of that conciseness of description which is the essence of science. It is necessary to any comparing of causes as a basis for forecasting effects.

* Op. cit., p. 155.
** Op. cit., p. [page number illegible].
*** Op. cit., p. 174.
**** Op. cit., p. [page number illegible].

The common statement that "science is classified knowledge" will not suffice, however, unless understood to include also the laws that follow from this classification.

Karl Pearson's definition of classification is important here: "The reader must be careful to recollect that classification is not identical with collection. It denotes the systematic association of kindred facts, the collection, not of all, but of relevant and crucial facts."*

The statement that a scientific law is merely a description may be easily misunderstood to mean that science is only empirical in the usual sense of that term; that is, as consisting merely of simple statements of association between things. The analytical method, which is the true method of science, is not satisfied with simple statement of association, but attempts to get back of this to other associations and back of this to still other associations. To illustrate, an empirical analysis of elasticity of supply of milk would stop with discovering what factors produce change in the supply of milk, and perhaps measuring the amount and time of the response. A scientific analysis would reveal the relationships on the dairy farms which led to these responses and their amount and time. The analysis might be pushed still farther into the social psychology of the group of farmers and other more ultimate factors.
But as Pearson so well remarks, science has no first cause. A cause, as understood in this connection, is one stage in the routine of experience. Back of this stage are others which science is not now able to comprehend. Pearson observed that the problem of the causes of the growth of a particular ash-tree in his garden could be widened out into a description of all the past stages of the universe. A scientific analysis is just a few links in a long chain of cause and effect. The essential difference between empirical and analytical is that the former accepts superficial relationship without inquiring as to antecedents, whereas the latter pursues antecedents a stage or two at the least.

The importance of the analytical method is that it enables us to forecast with greater assurance. If we know what lies behind a given set of relationships, we are in a much better way of knowing whether they will occur again. A study of the factors affecting the yield of cotton disclosed an apparent association between winter weather conditions and yield in the subsequent season. What lay behind this was not determined. When a longer series of data became available, the apparent association of the period appeared to be largely chance. If a reason had been established for this association of winter weather and yields, the association would not have proved to be chance, and a forecast of yields could have been made with some assurance even on the basis of data for a relatively shorter period.

* Op. cit., p. 77.

Professor E. B. Wilson of the Harvard Medical School, in his paper on the use of quantitative method in economics before the American Economic Association in December 1927, wrote as follows:

"Even if progress be slow, the very effort at rationalization is itself a stimulus, and abandoning the effort leads to a decay or relaxation of science even in its empirical parts. You are doubtless aware that around 300 B.C.
great beginnings were made in medical science under Herophilus and Erasistratus, who avoided all dogmatism, relying on careful observation and analysis, but that afterwards came the school called Empiric which confined itself to purely descriptive work and prohibited the inquiry into the general causes of things. Progress stopped.*

"It goes without saying that both (analytical and empirical) methods may be mutually helpful and that both are employed together, albeit with differing emphasis according to the nature of the phenomenon or the individual proclivities of the investigator. The aim of the empirical method is to enable us to recognize a situation in such a manner that we may distinguish between different situations. This is basic to knowledge; we cannot analyze and reason until we have some more or less definite information and some fairly well discriminated notions which we may analyze and upon which we may reason. The aim of the rational method is to analyze, to prove, to forecast with greater assurance."**

Scientific Method

We can now discuss effectively the essentials of scientific method. A few statements from Pearson's Grammar of Science contain most of what we need:

"As to the scientific method, we saw in our first chapter that it consists in the careful and often laborious classification of facts, in the comparison of their relationships and sequences, and finally in the discovery by aid of the disciplined imagination of a brief statement or formula, which in a few words resumes a wide range of facts. Such a formula, we have seen, is termed a scientific law. The object served by the discovery of such laws is the economy of thought; the suitable association of conceptions drawn from stored sense-impressions, permits the fitting exertion to follow with the minimum of thought upon the receipt of an immediate sense-impression.
The knowledge of scientific law enables us to replace or supplement mechanical association, or instinct, by mental association, or thought. It is the forethought, by aid of which man in a far higher degree than other animals is able to make the fitting exertion on the receipt of a novel group of sense-impressions."***

* Op. cit., p. [page number illegible].
** Op. cit., p. [page number illegible].
*** Op. cit., p. 77.

"The classification of facts and the formation of absolute judgments upon the basis of this classification -- judgments independent of the idiosyncrasies of the individual mind -- essentially sum up the aim and method of modern science. The scientific man has above all things to strive at self-elimination in his judgments, to provide an argument which is as true for each individual mind as for his own. The classification of facts, the recognition of their sequence and relative significance is the function of science, and the habit of forming a judgment upon these facts unbiassed by personal feeling is characteristic of what may be termed the scientific frame of mind."*

"The scientific method is marked by the following features: (a) Careful and accurate classification of facts and observation of their correlation and sequence; (b) the discovery of scientific laws by aid of the creative imagination; (c) self-criticism and the final touchstone of equal validity for all normally constituted minds."**

Also the following from a paper by J. C. Cobb entitled "The Significance and Use of Data in the Social Sciences" in the Economic Journal of England, March 1928:

"The scientific method assumes a clearly stated problem, analyzed to make sure that it is a simple problem and not several problems combined. It requires a careful study of factors bearing on the problem and quantitative statement of each. A problem so stated is in form to have logical reasoning produce a solution.
The reasoning must of necessity always remain dependent on the clearness and soundness of the reasoning mind, but if the problem is fully and simply stated it is comparatively easy to follow and check up the logic of the reasoning. The great difficulty in the scientific method is the creation and verification of quantitative data."***

"It may therefore be stated that the purpose of scientific investigation in social science is the production of data which logical reasoning can apply to and use in the conduct of human affairs."****

* Op. cit., p. 76.
*** Economic Journal of England, March 1928, p. 73.
**** Economic Journal of England, March 1928, p. 68.

From the foregoing the following can be summarized as the essentials of good scientific method:

1: Careful logical analysis of the problem, isolating it from other problems and separating out its elements. This means, in some cases, the formulation of an hypothesis, in the proper meaning of this expression - a trial hypothesis that will point the investigation.
2: Unequivocal definition of terms and concepts, and statistical units and measures, so that others will understand exactly, and be able to repeat the analysis and test the generalizations.
3: Collection of cases and data pertinent to the subject in hand.
4: Classification of cases and phenomena and data of same.
5: Expression of factors in quantitative terms whenever possible.
6: Rigorous and exacting experimental or statistical procedure in summarizing the data and in isolating the attributes or variables and measuring their relationships and inter-effects.
7: Statement in unassailable terms of the exact conclusions that are warranted for the cases examined.
8: Sound logical reasoning as to the bearing of these conclusions on the trial hypothesis and in the formulation of generalizations.
9: Statement of conclusions or generalizations definitely and clearly so that others will be able to check them.
10: Complete elimination of the personal equation.
11: Complete and careful reporting of the data and the methods of analysis so that others can check the analysis, or test the generalizations with new sets of data.

Perhaps it is well at this point to repeat that the unassailable conclusions referred to are not statements in terms of absolute cause and effect, but of degrees of contingency and probability. The conclusion may even be in terms of negative results, or of a small degree of probable contingency. What is wanted is statements that are amply supported by the data and analysis.

Quantitative vs. Qualitative

In the foregoing list of essentials of good scientific method, the demand appears for both quantitative and qualitative analysis, perhaps in about equal proportions. There has been considerable discussion recently as to the relative importance of these two types of analyses (x). Most of such discussion is futile. Both methods are necessary. In some fields, only qualitative analysis has thus far been possible. No one denies that analysis should be reduced to quantitative terms whenever possible. Most important of all to realize is that even when quantitative data are available, a large amount of qualitative analysis is needed. It is needed at the very start in isolating the problem and separating out its elements, in stating the variables, in defining the units, concepts and measures, and in selecting the data to be collected. It is even more essential in formulating the conclusions.

(x) See Proceedings of the American Economic Association, 1925, for Professor Mitchell's presidential address, and the proceedings of the two following years for round tables on methodology.

Mr. J. C.
Cobb, in the article previously referred to, distinguishes between the terms qualitative and quantitative as applied to data in the following language:

"Qualitative data are factors, elements or conditions observed or ascertained to exist in phenomena, the relative value or comparative importance of which have not been measurably stated and defined.

"Quantitative data are such factors, elements or conditions which have been subjected to intensive analysis and their relative or comparative force or effect on a phenomenon defined, measured and stated.

"It will be noted that under these definitions qualitative data are necessarily the precursors of quantitative data, as the existence of a factor must be observed or ascertained before its force and effect can be measurably considered." (x)

It follows from the foregoing that qualitative analysis is analysis in terms solely of the factors or attributes that enter into an association, whereas quantitative analysis measures the amounts of these factors or attributes and the degree to which they are associated with each other and with other factors or attributes called "effects", and the rate at which these change in relation to each other - how much of a variation of one is associated with a given amount of variation of the other.

Mr. Cobb makes use of the following illustration: (xx)

(x) Economic Journal of England, March 1928, p. 72.
(xx) Economic Journal of England, March 1928, p. 71.

"In an economic study of the relation between the interest rates on short term notes and the price of long-term bonds it was observed that bond prices seemed commonly to follow the note interest rate, but only after an interval of time. This interval was ascribed to the psychological fact that the bond investor acts more slowly than the note buyer and takes more time for consideration. This fact was noted and the conclusion that bond prices followed note interest rates was modified by the
statement that there was an interval of delay due to psychological "lag". In this case the psychological lag was used as a datum in the consideration of the problem and was a qualitative datum because it was not measurably or comparatively defined. Later, intensive observations were made as to the period of time and a curve was put into the problem indicating how soon after a change in note interest rates the bond market would move. The minute this curve was put down on paper the psychological lag ceased to be a qualitative datum and became a quantitative datum. The quantitative nature of a curve is not dependent on its correctness or the conclusiveness of the investigation which produced it. All that is necessary to make it quantitative is that it shall be so stated and defined that its relation to the problem can be analysed and tested by some standard and confirmed or corrected. The ascertainment of the lag described is a measurement by intensive observation of the psychological average action of many minds considering two different types of transaction from different standpoints. It is nevertheless susceptible of conclusive determination as indicated by the progress made by the economists. The distinction between qualitative and quantitative data is not a difference in the nature of the data. It is a difference in the method of statement."

Some writers on method imply by their statements that quantitative analysis is more "scientific" than qualitative analysis. A better statement is that these condition and complement each other. Quantitative analysis alone is virtually helpless. Qualitative analysis alone is inadequate and unsafe in ordinary use.

Considered as a sole method, the qualitative has the important limitation that it is difficult to state the terms of it and the conclusions with sufficient definiteness so that only one understanding results.
The result is a vast amount of dispute and disagreement occasioned by nothing more than unconscious differences in definition of terms or interpretation of the language of the generalizations. Another limitation is that important factors in the problem may be overlooked; and there is no easy way of matching the conclusions with experience to show that they have been overlooked.

Given qualitative analysis of a high order, with rigorous definition of terms and statement and testing of conclusions, and results of very great value follow. But qualitative analysis of this high order is far from common. It is practiced only by the best minds. Any one can try his hand at qualitative analysis. Nearly everybody does. People indulge in it most freely in the fields in which they are least informed. Consequently for the ordinary run of research workers, it is highly necessary that data be collected and qualitative conclusions checked at every turn. Only after quantitative analysis is there any safety in the conclusions with the ordinary research worker; and even the most gifted are occasionally blind to important elements in problems.

The difficulty of recording qualitative data definitely and of reporting the steps in qualitative analysis also makes it a less secure structure to which to add later. Checking and testing of conclusions is more difficult and uncertain.

The principal inadequacy of qualitative analysis arises from its failure to weigh and evaluate the different elements in a problem. Forecasting requires measuring the quantitative relation between causes and effects.

There is no such clear-cut division between qualitative and quantitative as the foregoing would suggest. Much of what passes for qualitative is informally quantitative; that is, at least some rough approximations are made of the significance of the different factors.
There are minds so nearly qualitative that all factors seem to be weighted nearly alike in their reasoning processes - but they are indeed exceptional. The normal mind at least goes so far as to rank the causal elements as to their relative importance. Such analysis may not be numerical; but it is roughly quantitative. Even in the matter of data, the distinction is not clear-cut. Are grades of students qualitative or quantitative? Can shades of green be reduced to a quantitative basis? Types of soil?

An important question in this connection is as to the desirability of division of labor between qualitative and quantitative analysis. Shall we have certain persons of superior logical ability concentrate on qualitative analysis to the exclusion of all quantitative considerations; and let the more ordinary workers plod through the routine of collecting and analyzing data? Or shall we insist that none shall work without quantitative data? The first condition is the one which comes nearest to prevailing at present; and constant pleas are being made that we change to the second. Clearly there are advantages and disadvantages in both arrangements.

Most careful students of the history of science will agree that it will be well to continue to let a few of the best minds largely concentrate on qualitative analysis. Specialization will be advantageous here as in most fields of endeavor. No doubt, however, it would be well if even these kept more closely in touch with quantitative developments than some of them have done in the past. Moreover, for the usual type of research workers, no doubt the best program is one which combines the two types of analysis.

Though much criticism has been leveled at the largely qualitative analysts in the past few years, they are far from being as deserving of it as the much larger group of workers who have used quantitative methods without a proper qualitative grasp of their problems.
It may be proper to have workers of this sort collect data and go through the mechanics of analyzing it - laboratory research has its horde of common laborers too; but somewhere in the organization of the project, if it is going to contribute to progress, there must be a directing mind which has thought it through in qualitative terms. The greatest weakness of present research in agricultural economics is that a sufficiently competent directing mind of this sort has too frequently been lacking.

Inductive vs. Deductive

It is true that science is more obviously inductive than it is deductive in its logic. The procedure is more conspicuously from a number of individual cases to a generalization. Professor W. C. Mitchell, however, has called attention to the blunder of identifying induction with the quantitative method. "In all thinking", he says, "it is necessary to pass from the confused data yielded by observation, and back to particular facts. (The underlining is ours). In this respect the method of Ricardo and the method of the modern quantitative worker are identical." (x) It would have been even more important to have pointed out that the first analysis of a problem, that which isolates it and separates out its elements and defines terms and classifies, is much more largely deductive than inductive. Its results are obtained by applying principles already formulated to the problem in hand, by passing from premise to conclusion by those rigorous processes which Pearson has described as involving "logical necessity."

The going "back to particular facts" that Mitchell speaks of represents the testing of the generalizations. This may be done as a part of the piece of analysis in question, as when forecasting equations are applied to the individual cases and the residuals examined; but it is more conspicuous in the continuous testing that generalizations receive from the moment that they are made, by workers over the whole field.
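The testing step just described, applying a forecasting equation back to the individual cases and examining the residuals, can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not a procedure taken from the handbook: the supply and price figures are invented, the helper names fit_line and residuals are hypothetical, and a simple least-squares straight line stands in for whatever forecasting equation an investigator would actually fit.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns the pair (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def residuals(xs, ys, a, b):
    """Actual value minus forecast value for every individual case."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Invented illustrative data (assumption): yearly supply and price of a crop.
supply = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
price = [9.1, 8.2, 6.9, 6.1, 5.2, 3.9]

a, b = fit_line(supply, price)
res = residuals(supply, price, a, b)

# Go "back to particular facts": list each case with its forecast and residual.
for x, y, r in zip(supply, price, res):
    print(f"supply={x:5.1f}  price={y:4.1f}  forecast={a + b * x:5.2f}  residual={r:+.2f}")
```

A case with an unusually large residual is one the generalization fits poorly, and it is to such cases that the investigator returns for a closer qualitative look.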
The Scientific Methods

In broad outline, scientific method is the same in all fields. "The scientific method is one and the same in all of its branches," says Pearson, "and that method is the method of logically trained minds." (xx) "The unity of all science consists alone in its method, not in its material. The man who classifies facts of any kind whatever, who sees their mutual relation and describes their sequences, is applying the scientific method and is a man of science." (xxx)

(x) Proceedings of the Thirty-ninth Annual Meeting of the American Economic Association (1926), p. 20.
(xx) Op. cit., p. 10.
(xxx) Op. cit., p. 12.

When these processes of classifying facts and discovering their mutual relations and sequences are observed more closely, however, it will be found that they take different forms, which are commonly designated as different "methods."

1. Analogy. The simplest and most primitive of these is the method of analogy, in which inference is made from an example or two as to other cases or situations. A bad business depression accompanied a Democratic administration in the 90's; it would accompany another one in 1929. The great weakness of this method is that the cases are not likely to be sufficiently parallel.

2. Case Method. When the number of cases that can be used as parallels increases a little, the procedure is likely to be called the case method. In primitive use, the cases used have generally been the striking cases. Much of our ordinary thinking about business cycles even today is based on experience in a few of the more climacteric of them. History of earlier times is largely in terms of outstanding or abnormal events. In modern use, the case method has come to mean a thorough internal study of a limited number of cases - as one might study in detail the affairs of a few mail order houses, or farms, or communities.

3. Informal Statistical. Given still more cases, a mental procedure more nearly approaching the statistical is followed. Professor L. L. Bernard in an article on "The Development of Methods in Sociology" has this to say of the "informal statistical method": (x)

"Informal statistical method involves the rough generalization which is made by approximation from facts, possibly poorly defined and attested, which are picked up in more or less random observation. In this way the man in the street makes up his philosophy of all types of events from the weather to the presidential election. He "guesses" at general tendencies from what he sees or thinks he sees. The sample he uses may not be representative and his method of drawing conclusions may be anything but rigorously exact. However, it was the only method of seeing many separate events together as a unitary or single process, that is, of getting perspective and unity into one's view of social life, before rigorous or mathematically exact statistical method was invented. In spite of the fact that it was the forerunner of mathematical statistical method and was universally used for many centuries and still is used for tentative generalizations, and by the masses almost exclusively in their thinking, it is not ordinarily regarded as a statistical method at all. But such it is, and the only distinction between informal and formal statistics is that the former is merely more or less accurate approximation, usually of a fairly simple sort, while the latter is a process of complex calculation of general principles and processes, of averages and means, on an exact mathematical basis.

"Informal statistical generalization is older than formal science, even than recorded culture. It is the unconscious inductive source of many of the ancient major premises which the early philosophers, such as Socrates and Aristotle, sought to test by means of a more or less rigorous logic."

4. The Statistical Method. Statistical method becomes formal when enough cases are included and measured in a definite way to make possible the use of at least elementary mathematical processes. These cases need to be sufficiently alike so that they possess the same general order of attributes and common measures can be applied to them. Statistical data, says Yule, "are quantitative data affected to a marked degree by a multiplicity of causes." (a) The problem of statistical method is to take data of this sort and determine the relations between variations in them, and to make inferences from these for a larger universe. The "multiplicity of the causes" makes it necessary to isolate the different causes and their effects. Statistical method does this by taking actual phenomena, cases that have actually occurred. It has two phases, description and inference. It may merely deal with numbers of cases, in which case it is numerical; or it may attempt to deal with magnitudes, in which case it is quantitative.

(a) G. U. Yule, "An Introduction to the Theory of Statistics", 1922 edition, p. 10. (Charles Griffin and Co., London).

5. Experimental. In the experimental method in its pure form, a special hypothetical set-up becomes the basis of the conclusions. The effect of the different factors is determined by holding all conditions constant or uniform except the one whose "effects" are to be measured, and a definite amount of change in this condition is balanced against a definite amount of change in the result. Sometimes it is merely the effect of the presence or absence of a condition that is noted; in such case, the method is qualitative experimental in place of quantitative. In practice, it is often not possible to hold all conditions but the one constant, so that the procedure followed is a combination with the statistical method.
Experimentalists are realizing increasingly that the two methods often need to be combined.

In concluding the discussion of these five scientific methods, it should again be pointed out that the logical processes are the same for all of them. The experimental and the statistical method, for example, both go through a process of isolating the effect of a given variable. Although the actual procedures are vastly different, the logic of them is the same.

(x) The Monist, April 1928, p. 310. Also see Black, Introduction to Production Economics, p. 631.

Historical Research

In some classifications of method, the historical and the geographical are named as separate methods. It should be clear from the above that they represent a classification upon another basis than the foregoing. The coordinate classification with historical is contemporaneous. The logical difference between them is in the type of data used. Incidentally, because of differences in the type and sources of data, historical research makes use principally of the analogical, the case, and the informal statistical method, whereas the statistical method is used mostly with contemporaneous data. Professor Bernard would include archaeological and anthropological as also coordinate classifications with historical and contemporaneous.

Geographical Research

Geographical research clearly uses contemporaneous data for the most part - it is like the anthropological in this respect. The special feature of geographical research is that it deals with data in which local differences figure to a marked extent, particularly as distinguished from the data of time sequence.

A vast amount of data, however, is sufficiently independent of both time and place so that local and time differences can be largely ignored in the analysis.

All of these methods and types of research will be taken up in detail in the body of the handbook.
The foregoing is intended merely to serve as an outline of what is to follow. The order which will be followed is statistical, informal statistical, case, analogical, and experimental. Then will follow sections on the historical and geographical as fields or types of research.

Social Science Research

Although scientific method is the same in broad outline in all fields, there are important differences in details of procedure growing out of differences in the data available or obtainable. Obviously the social sciences cannot simply take over the methods used in the natural sciences. Neither, on the other hand, can they develop methods that are entirely foreign to them. In practice, it will seem that social scientists are giving special attention to certain phases of procedure. But this merely follows from the fact that these particular steps are more difficult in the social sciences. The attention being given to measurement of economic value by economists, or to mental measurement by psychologists, illustrates this point amply.

The same statements apply when research in economics and the other social sciences is compared. In broad outline the method is the same, but the details and emphasis at the various steps are significantly different. For instance, the economist observes closely and attempts to measure mass changes over relatively short periods; the historian observes them in a more general way over longer periods and usually has to be satisfied with mere statements by contemporaries or roughly accurate measures. The economist devotes much time to examining the representativeness of statistical data; the historian to the authenticity of explicit statements by participants in historical events or contemporary observers of them.

Research in agricultural economics is confronted by the special difficulty that the units of observation,
the family farm, the local marketing agency, etc., are small and make few records of their activities; also a large proportion of the economic decisions pertaining to agriculture focus around the individual farm as a center and are remote from actual markets where decisions are translated into monetary valuations. The close relationship between the producing and consuming activities of the farm family is a further difficulty.

Research method in the social sciences is still in the early stages of its evolution, and although the problems which it has to handle are essentially of the same logical types as those of the natural sciences, they are usually much more difficult at almost every stage of the analysis, in the isolation of the problem, the securing of the sample for study, the definition of the unit, the exclusion of extraneous considerations, the reduction of the number of variables, the measurement of the varying factors, the measurement of the results or the effects, etc. In fact, so difficult are many of these phases of most of the problems in the social sciences that it is an easy and common assumption of workers in the natural sciences that social problems cannot be studied scientifically, that there is really no such thing as social science.

But it must be admitted that truth is sometimes arrived at with respect to social problems as well as to problems of the physical universe. Then there must be some method by which truth is attained in such matters. It is the task of students in the social sciences to discover and define and classify these methods, observe their essential modes of procedure, their successes and failures and the reasons for them. Perhaps some of the conventional practices of social scientists do not lead to sound conclusions. Perhaps many things in the past have been affirmed as truths following researches in this field which have later been found to be untruths or part truths.
But this is true of biological and physical science; it does not mean that truths can never be successfully established in the social field. Instead it points to the need for a more careful study of methods.

It was the realization of the foregoing that led to the establishment of the Social Science Research Council, and later to the setting up of a special subsidiary committee to consider methodology for problems in agricultural economics and rural sociology.

Economics vs. Natural Science Research

While it is inadvisable that the lines between agricultural economic and other agricultural research should be sharply drawn, nevertheless a clearer understanding of the spheres of each should help greatly in the administration of research. Research projects in marketing can belong either in the field of economics or of applied natural science; but they should not have the same content. The natural science aspects of the marketing of potatoes, for example, include the scientific phases of disease control, standardization of product, sorting and grading, preparation for shipment, care in transit, and storage. The economic aspects include sources of supply, factors affecting supply, consumption, factors affecting demand, factors affecting price, relation of price to production and consumption, relation of price to grade and class, market differentials, market organization, set-up of marketing business units, cost of marketing and the like. All of these latter call for a ready comprehension of the principles of economics, that is, of the principles relating to price and value and the organization of productive effort for the maximum utilization of human and natural resources.

Even such questions as grades have both their economic and their natural science technique aspects.
It is one thing to work out descriptions of market grades that can be effectively used in separating potatoes into grades and classes, and the necessary mechanical devices for putting these specifications into effect. It is an entirely different matter to analyze consumers' preferences and prices in such a way as to determine what system of grades and classes, with what dividing lines between them, will result in selling the crop for the largest net price and satisfy the largest volume of consumers' wants. The Bureau of Agricultural Economics has recently become acutely aware of the fact that its past work in market grades has been too largely on a purely physical basis.

Suffice to say at this point that any organization of research in an institution that results in having the economic aspects of problems handled by men trained merely in the applied natural sciences, or the converse of this, is to be condemned. Detailed discussion of the problems of organization of research arising out of the foregoing will be taken up in a later section.

RESEARCH PROCEDURE

PART I

Choice of Project (x)

(x) E. G. Nourse, Institute of Economics.

The first step, if the effort of the investigator and the research funds placed at his disposal are to be most effectively applied, is to choose the most desirable project. Investigational organizations of many sorts have been set up in response to the general belief that a scientific attack upon the economic problems of agriculture gives the only adequate hope of their solution. Such agencies are immediately confronted with the problem of developing a plan of operation and program of activity. To this work the utmost care should be given, since time, funds, and personnel almost never admit of developing all parts of the field at once, and even were this possible, the problem of utilizing the results of investigational work demands that certain important issues be selected for treatment, given adequate emphasis, and the results of research brought effectively to bear upon a public which cannot be expected to give attention to more than one thing or a very limited number of things at a time.

Bases: 1. Relative importance. The first task then for any research worker in choosing an individual project, or in shaping up the program of an organization of which he has charge, is to weigh as fully as possible the relative importance of the several lines of investigation to which he conceivably might direct attention. Here he may well be guided by four considerations. First, he may seek to determine the magnitude of the problem or extent of interests involved. Within the state or other geographic division to which his activities are related, a very much larger number of people will probably be concerned in one or a very few situations or problems than in any other. Looked at from another angle, the value of the financial interest at stake may be much larger in one case than in another. The writer recalls an occasion in his own experience when his organization was urged to make investigations in connection with the problem of livestock marketing and that of [illegible]corn marketing in one of the Corn Belt states. Very little consideration of the relative importance of the two problems indicated that attention should first be directed to the industry which touched a major source of income for a large part of the state rather than to the problem of limited and local scope. In other cases naturally the decision will be much less easy to arrive at, particularly if complicated by other issues which may very appropriately enter into the decision as to the choice of project. However, the actual work
done both by institutions and individuals shows countless illustrations of effort being expended on problems of comparatively small importance while at the same time issues which involve large interests are neglected.

2. Timeliness. A second consideration which bulks large in the selection of research projects is timeliness. While certain investigations are designed quite properly to add to our basic knowledge of economic phenomena and have a comparatively slender tie to current happenings, a large part of research work in the field of agricultural economics and rural sociology grows out of maladjustments, uncertainties, or unsatisfactory conditions which clamor for prompt diagnosis, and if possible, remedial treatment. While obviously the attention of research workers should not be directed to passing problems and ephemeral interests to the neglect of basic studies and forward-looking programs of research, such agencies need also to avoid any tendency toward other-worldliness and a neglect of current problems while the satisfaction of purely economic inquiry fills their minds.

Timeliness frequently does not mean preparing for a program of relief. The old adage, "strike while the iron is hot" should be considered seriously, though sanely, by research workers in order that they may make examination of important and acute phases of their problem, while they are in progress, when they can be analyzed to best advantage. A single illustration will perhaps suffice. If we were to have really authentic information as to the character of the recent so-called "land-boom" in certain sections of the United States, it was important that research agencies should be on the alert to make first-hand investigation of a situation at certain distinct stages, particularly as the movement reached its crest and broke, and thereafter as it appeared to reach the bottom of the subsequent decline, with values stabilized or turning upwards.
If we are to understand phenomena of this sort, we must accumulate bodies of data taken first-hand at the moments when and in the places where such phenomena develop their most significant phases.

Timeliness may also be considered from the point of view of general public interest, or of the fitting of the work of one investigator or a particular institution into a larger program of work involving other experiment stations, the United States Department of Agriculture, or outside research agencies. It is hardly possible to place too much stress upon the fitting of particular research projects so far as practicable into general research programs if we are to avoid the danger of fragmentary, miscellaneous, and provincial studies.

3. Available resources. A third major consideration in the choice of projects concerns the fitting of plans to available resources. It is lamentably true that in terms of personnel, of funds, and of collaboration with other agencies we are all too limited and must fit ideal schemes to the practical limitations of particular times and places.

Passing the obvious fact that researches must be scaled in accordance with the funds available, or which can be secured by interesting parties particularly concerned in the results of the investigation, a brief word may be said about personnel. The rapid expansion of research work in agricultural economics and rural sociology over a comparatively recent period has not permitted adequate training of all those whom it has been necessary to appoint to positions on research staffs. Neither has it been possible to carry out the process of selection, weeding out, and replacement in such a way as to get best results.
It becomes peculiarly important, therefore, to devote to tasks of planning, direction, and major strategy those persons whose qualities, training, and experience fit them for this, and to assign persons of less maturity and more limited training to routine tasks or special types of work. To this end, it is fortunate that so great a variety of qualifications is called for in the gathering of field data, its tabulation and statistical analysis, and the preparation of a variety of written products designed to make investigational results available in a considerable range of uses. Experience very soon shows that the qualities which make an admirable field man, those which make a good planner and director of a research project, those which make a good statistical analyst, and so on, differ widely, and that too much care cannot be taken in selecting and adjusting personnel to the task or narrow range of tasks in which special qualities can be most effectively used. In this problem, of course, the difficulties of adjustment are enormously greater where the staff is small, since, particularly in state experiment stations, the range of problems pressing for attention is always quite varied. Even in this situation, however, much better results may be secured by careful planning, division of labor, and specific training. But after all this is done, there will still be the issue as to choice of those projects to whose execution the available staff can be most effectively fitted. A small staff cannot afford to dissipate its energies over too wide a field. It may have to reject some projects merely in order to concentrate its attention properly.
To complicate an already difficult situation, stress should be laid upon the choice of projects with reference to the availability of co-operating agencies and the desirability of adapting any program to the working out of a somewhat comprehensive plan in which several agencies will collaborate at the same time, or at times which synchronize sufficiently well so that the results will have their greatest comparability and that a problem of wide geographic range may be surveyed as a whole rather than piecemeal. Little more need be said under this head than to call attention to the desirability of co-ordinating research plans into such comprehensive programs, and the fact that under the leadership of the Association of Land Grant Colleges, the Bureau of Agricultural Economics, the American Farm Economic Association, and the Social Science Research Council, movements of this sort are receiving more attention today than ever before. Good teamwork is prerequisite to the attainment of the best research results, and fitting the efforts of the individual or the particular institution into such team play should be given due weight in the shaping of our research programs.

4. Use of product. A fourth approach to the general problem of selection of projects may be made from the standpoint of the use to which the product may be put. Here we shall echo some of the points made in discussing the question of timeliness above. A large amount of our research work is very properly undertaken with a view to its immediate value as a guide to remedial activities. By and large, it may probably be safely admitted that this will be the first consideration in the choice of research projects under most circumstances. On the other hand, if shaped merely to this proximate end, projects will lack depth and permanent value.
In many cases investigational activities shaped directly toward the thought of remedial measures will prove disappointing in results and perhaps entirely abortive, owing to the fact that the general problem in which a particular issue had its setting has not been sufficiently explored before the more particular line of investigation was undertaken.

For this reason, particular attention must be given in any continuing program of research work toward the laying out and systematic prosecution of studies of an explorative character. The precise nature of issues to be given detailed investigation will frequently not appear at all from a cursory examination of the field, or not be suggested by those who find themselves puzzled by or suffering from certain difficulties with which they have intimate contact but of which they have very incomplete perspective. For any soundly developing scheme of research work, therefore, it is necessary that explorative studies be undertaken not only at the beginning of the work but be carried along in such a way as to check the significance and relationship of more specialized studies and for planning the program of work for a sufficient period in advance.

Finally, the complete development of research activities must include the prosecution of searching studies into fields of knowledge which do not promise to yield a product of immediately discernible practical use, but which are designed to add to our store of basic knowledge about the general phenomena of economic and social life with which immediate problems are concerned. Studies in statistical method as such, with no clear idea of the use to be made of results if secured, would furnish an illustration of this type of research work, as would also both theoretical and statistical exploration into the nature of the price-making process, with ramifications into the field of money and credit, commercial organization, and cyclical movements.
Psychological and even political investigations of a highly theoretical character seem to be clearly required if we are to have an understanding of the general situation and forces within which our group activities develop.

If we are to build wisely and solidly for the future, therefore, some of our resources and the time of some of our staffs who show particular aptitude for the pursuit of truth for its own sake must be allocated to studies whose practical value in any particular instance is problematic but to which as a whole we must look for the development of the basic knowledge on which further progress in the field must depend.

PLANNING THE PROJECT

The next step after choosing the project is to plan it. This will be taken up as a number of smaller steps in chronological order. It will be assumed that these steps all precede the final preparation of the project statement, although it is realized that in usual practice the project statement is couched in such general terms, or is understood as sufficiently subject to revision as to details, that most of the steps here outlined are scarcely needed as preliminary to it. If such is the case, what follows may be interpreted as further and more definite planning that should follow such project statements before any actual assembling or analysis of data is begun.

Altogether too little of such planning is ordinarily done. The tendency is simply to decide to study something, then get the data together and proceed to analyze them. The results of this careless and hasty planning are chiefly as follows:

1. Duplication of work already done elsewhere.
2. New results are not in form to be fitted into the picture of work already done so as to modify it as needs be and extend it.
3. Great waste results from attacking projects that yield little in the way of definite conclusions, even negative.
4. Data are collected that are never used, or are not in definite enough form for analysis.
5. Data that are already available or can be obtained in better form elsewhere are collected from expensive and unsatisfactory sources. This is a common fault of surveys. The farmer is sought as a source for everything.
6. Most important of all, data that are essential to the analysis are not collected.
7. The method of assembling the data is not planned so as to expedite subsequent analysis.
8. The sampling turns out to be bad; or the units or measures prove to be faulty; or the analytical attack breaks down; etc.

1. Examination of literature and other experience in same field. It is accepted as one of the first essential procedures of sound scholarship in the natural sciences that before any new experimentation or analysis is undertaken, all prior work shall be carefully examined. The report of the experiment must contain a summary of this prior work in its very first pages. The same practice is widely accepted in the social sciences also, although not so regularly followed. The failure of agricultural economists to follow it carefully is one of the reasons that fellow-departments in agricultural colleges are often skeptical of their work. It is true that the problem is somewhat different in some types of agricultural economic research. For example, it is hardly possible in the opening pages of a report on a survey to summarize all preceding surveys of the same or related types in the world. Nevertheless, before the project is started, all the surveys in the same general area should be examined and their methods of attack and results compared and differences accounted for if possible. In projects more definitely of the research type, such as of factors affecting prices, the natural science procedure can be followed exactly in this matter.

Dr. E. W. Allen of the Office of Experiment Stations makes an extremely pertinent criticism of many of the projects submitted to him.
He says that one of the major results that will come out of many of them is that those working on them will get informed as to the subject in hand out of facts and analysis already available. Such is not research - it is learning the subject. But such learning of the subject is highly necessary as a preliminary to a real project. The practical program growing out of Dr. Allen's criticism is that most of such preliminary learning should be done before the project is submitted, so that the results of it can appear in the definiteness of the project statement and the keenness of the analysis of the problem it displays. The practical difficulty with such a procedure is that if it is finally decided that the project is not worth continuing, as should often be the case, there is no definite Purnell project to which the time and expense of the preliminary work can be charged. Dr. Allen can no doubt suggest a practical solution of this difficulty. One might be to carry a Purnell item called "Preliminary work on projects". Although this might be abused, it would be better so than to have half-developed project statements continually coming in.

It will always be possible, however, for the senior members of staffs, because of wide acquaintance with the field and methods, to prepare project statements for the younger members which will satisfy all reasonable requirements in the matter of project statements; and there will still be the need of having those directly handling the project work up the field. In any event, something more thorough than is required for a project statement is needed before the project itself is undertaken.

The job of working up the literature and other experience in the field should comprise the following:

1. Discover what results have already been obtained.
2. Learn what methods have been used - units and measures, sources, methods of collecting data, methods of analysis, etc.
3. Assemble all data from other sources already available that bear on the project.
4. Learn the available descriptive and historical facts that constitute the background of the problem and show its relation to other projects and to the rest of the field of knowledge.
5. Understand the deductive analysis of the problem as worked out by others.

2. Preliminary deductive analysis. The foregoing leads directly into the second stage in project planning, namely a careful, rigorous deductive or qualitative analysis of the research problem in hand. Set over against this procedure is that of going out and getting all the facts that may possibly bear on a general subject, in whatever form seems most acceptable, and then seeing what can be made out of them.

It is argued by a few that this latter is the scientific procedure, that the other is likely to lead to certain conclusions anticipated from the preliminary qualitative analysis. The answer is that all students of scientific methodology recognize some sort of hypothesis as necessary as a basis for all purposeful research; that seldom does a scientist not have one even though he may be foolish enough to deny it. The essence of the scientific method lies in the accuracy and objectivity of the measurements and analysis, and not in prowling around in the darkness feeling for mysterious objects. The fact that some great discoveries have been made accidentally does not prove that only objectives of a broad general nature should be set up.

It is also a mistake to describe the deductive analysis which is made of economic problems preliminary to quantitative research upon them merely as the setting up of hypotheses. It is more properly described as getting a clear view of all the factors and forces in the problem, and thinking out possible relationships between them.
There have been at least two centuries of rather intense thinking as to relationships in the economic world - and there is much of it still to do. The applied sciences of economics offer a vast field for discovering more of such relationships. As society becomes more complex, new relationships come to the fore. It is possible for a person with no experience in economic thinking, having had little or no contact with the thinking in terms of economic relationships by the best minds of the present and preceding generations, to envision many of the important economic relationships in a research problem; but he has to be an unusual person. What a research worker poorly equipped in this respect does instead is to single out a few superficial relationships, sometimes already thoroughly discredited in the real economic world, and couch his problem in terms of these. Even the man in the street may be sounder on many points of economic analysis than those who know just a little economics. For example, the answer which an ordinary housewife will make as to how much it costs her to make her children's clothes is likely to be better economics than that of some home economics students who have been taught a way of figuring it out in college. Likewise the ordinary farmer's statement as to what the value of land is based on is likely to be better than that of the economist who has thought no further into the subject than merely that it is based on the income from it.

Sound preliminary deductive analysis is as essential to sound conclusions from economic research as is sound analysis of the data. At the present stage of development of agricultural economics, it is the more important. We have had tons and tons of data collected without sufficient close thinking as to the relationships involved in it. Even those accustomed to deductive analysis find their research powers as often limited on the side of the qualitative as on the side of quantitative analysis.
The research workers on a project should attack this part of the preliminary work as follows: First, get all the help on it possible from other projects and from literature of the subject. Second, follow back the economic analysis suggested into the general literature of economics bearing on the subject. Third, talk over the problem with the best economists on the teaching staff in the department of economics, if there is such a department. Otherwise, seek help from the outside. Fourth, it will pay in many cases to write out an analysis of the problem along the foregoing lines and submit it to competent critics.

3. Statement of objectives.
4. Establishment of boundaries - articulation with other fields.
5. Defining relationships to other studies.
6. Definition of the universe.

If the foregoing is carefully done, the project can be narrowed down so that other steps, as follows, may be quickly taken: First, the objectives of it can be stated in specific concrete terms - in much more definite terms than found in the usual project statement. To illustrate, in place of a statement in terms of analyzing market business practices, one can be made which names the specific practices to be analyzed and the plan of attack on them. The farming system that is being analyzed can be definitely described together with the aspects of it that are to receive special attention. Second, the articulation of the project with other fields of knowledge can be stated definitely and procedure specified for cooperating with or securing the assistance of other departments whose field is related to the project. For example, if the study is of factors determining the price of butter in the local market, plans can be stated under which the dairy product specialists will analyze the butter as to quality. Third, a clear statement can be prepared as to how this project ties in with others already made or being made
and into the whole subject of agricultural economics and the rest of the social sciences. Fourth, the universe which is to be studied can be definitely described and defined. In the case of a study of farming systems, statements can be made as to what sorts of farms will be excluded from the survey, e.g., farms specializing in the sale of pure-bred cattle in a dairy region. In the case of a price study, the grades and classes of the commodity and the period of time covered can be determined.

7. Units and measures. The subject of units and measures will be discussed in detail in Part II. At this point, let it be said that the only safe procedure is to write out a careful definition of all the statistical units used that are special to the problem in any particular - what is to be called a farm, a rented farm, a tenant, a cropper, a family, a household, etc. Measures need to be even more thoroughly analyzed and described. Not a scrap of data, other than for trial purposes, should be taken until there is definite understanding as to what measure or measures are going to be used in the analysis for size of farm, input of labor, input of each of the kinds of feed and ration, for value of land, interest rates, supplies obtained from the farm, volume of business of an elevator or creamery, intensity of production, etc. There is no more difficult problem than that of measures in the whole field of the social sciences.

8. Method of analysis. The leaders of the project should have their plan of analytical attack so clearly in mind that they can visualize the sort of tables and graphs that will come out of it. This does not mean that these tables and graphs will ultimately appear in the report. The results may prove to be entirely negative; or the data may yield only to some other sort of attack. It may not be possible to foresee half of what is likely to happen if the project is one which breaks new ground.
Nevertheless the methods of analysis should be planned in advance as carefully as possible. The planning should include ways of testing the sampling, of testing the units and measures used, of checking the accuracy of the data, of isolating the effect of different variables, of deflating for price level changes, etc.

9. Preliminary exploration of the field of the universe. Before going further, it will frequently be advisable to conduct a sort of preliminary exploration of the field of the universe which it is planned to study. Thus, if the project involves studying leasing systems in detail in a certain area, it may be well to know what all the leasing systems are, and where they are localized. Before studying livestock shipping association problems in an area, it will be well to know where the associations are to be found, and their economic status in general. If this is already available from earlier or other studies, this step may not be necessary. Oftentimes, however, the other study is out of date, and it is important to have such general information as the foregoing reasonably accurate as to the current situation. The three main purposes of this preliminary exploration are as follows: (1) to test the soundness of the decisions provided for above as to the universe, units and measures, etc.; (2) to furnish a general background for the study and fill in the gaps between areas or periods not included in the sample; (3) to furnish a basis for the sampling or for choosing representative areas.

The preliminary exploration can frequently be made from census data and assessors' records and from official records of various kinds; but sometimes recourse has to be had to questionnaires and other reconnaissance methods of collecting data.

10. Sampling; choice of area or period. We are now ready to narrow the study down to a particular area or period of time in which we are to make our detailed examination of the data.
The technique and philosophy of this will be discussed under Research Method in Part II. There is a very close relation between this and the interpretation of the results. It must be done with as full an understanding as possible of all the implications. Sampling may not be possible in many cases. If the method of representative areas is followed instead, all the consequences must be understood and appreciated.

The project is now in form so that the leaders can say it will begin in a certain place at a certain time, and proceed to prepare the schedules and other details of execution of the actual project.

[heading illegible] (x)

Under this head will be included a few observations of a general nature. Certain parts of it, such as the collection and analysis of data, will be considered in detail under research method.

Organization. The choice of projects has been considered in a preceding section of the handbook. There is a close interrelation between the choice of project and the staff organization for carrying it out. It would indeed simplify matters if for every new project it were possible to draw into the research organization men well equipped with both theoretical and practical training and contacts for that particular project. This would involve either a large research staff or that the field of research be limited to a series of closely related projects.

In a smaller organization trying to cover a wide field of work, the research worker who changes from a project in grain marketing to one in livestock marketing of necessity has to spend much time in making contacts and familiarizing himself with the surroundings and conditions of the livestock industry. If the change were to a problem in taxation, the adjustment would be even more difficult to make - so difficult that it is doubtful if it should be undertaken.

In organizing the staff for work upon a particular project two things are essential:
first, familiarity with the field in which the work is to be done, and second, a knowledge of the particular method of study which is to be employed. Lacking either of these, the conclusions are apt to be sterile if not misleading. If the two characteristics can be found in the one worker, so much the better. If not, they should be secured by a combination of workers or advisors.

With capable leadership and direction secured, the remainder of the necessary staff may be recruited from less experienced individuals of aptitude. The double purpose will thereby be served of economy and of providing training for less experienced workers. Aptitude for the various phases should be considered in assigning workers to the various parts of the project. Some of the workers may be especially fitted for the field work, others for the compiling or interpretation of data. An appraisal of the capabilities of the various workers is one of the first essentials of staff organization.

Budget. Past experience is the best guide in making a budget. More often the budget is made too small than too large. The problem is usually that of making the means equal the needs. Unexpected phases will turn up which should be pursued. A frequent error is that of underestimating the amount of clerical help which will be needed. Often a fairly good estimate is made of the amount of time required for the field work but inadequate provision made for the compiling and interpretation of the data. When a specific project is started, a budget should be made both for the current year and for the entire project so far as the extent of the latter can be forecast. Some degree of flexibility in the budget assigned to particular projects is always an advantage.

(x) J. I. Falconer, Ohio State University.

Establishing Contacts and Working Relations. Before undertaking the study, contact should often be made with the leaders
in the group from whom data are to be secured and with those whom the results of the study are especially hoped to serve. Care in this regard will guard against possible future misapprehension as to the nature of and reason for the study, and should yield many useful suggestions to the workers in shaping and developing the study, as well as facilitating the extension of the results when completed. In field studies, contacts should always first be made with the county agricultural agents of the region where work is to be done. If work is to be done among organization members, the leaders of the organization should be fully acquainted with the plan. Establishing and maintaining these contacts is essential to any research group which hopes to serve fully the area or group which it reaches.

Tentative schedules. If the nature of the study requires a schedule, the schedule finally used will probably be the product of an evolution. The first draft should be prepared by the leader of the project, after having made the necessary preliminary exploration of the subject and other research work done on it or in progress. He should consult specialists in related departments about the various details of it and frame his questions according to the suggestions. He should have the final schedule criticized by his colleagues who have had experience with schedules and by subject-matter specialists in related fields. Then he is ready to have it tried out in a preliminary way. A few schedules should be filled out and a preliminary analysis and interpretation made. This will frequently uncover gaps in the data, points which were sought but cannot be secured, questions which are frequently misunderstood by the one interviewed; or the preliminary study will reveal additional data, much desired but not known to be available when the study was originally planned, or that the schedule is covering too wide a range of subjects.
If several workers are using the same schedule, another product of such a preliminary attack will be that each worker will have been taught to give the same interpretation to the various questions of the schedule. Many a field study has been completed only to find that the various workers were giving different interpretations to the same question.

Before revising the schedule, the data secured in the preliminary study should be examined as to their adequacy to serve the purpose needed. It may be found that adequate data are not obtainable on many points in the schedules, or that the answers have been recorded in a way that prevents their being tabulated or interpreted. If the project is a large one, it will pay to take enough schedules in a preliminary survey so that the results can be tested by frequency distribution analysis and the other tests of sampling. Other possible tests may be such as the following: checking expenditures against receipts, checking feed consumption of livestock by feeding standards, checking individual schedules against assessors' reports when available. A conference of all the workers in which each question is analyzed thoroughly will also help.

Collection of data. There may be one or many engaged in the collection of data. If only one, there is the merit that the data will more likely be collected and entered on a comparable basis, and that one person will be more intimately acquainted with all phases of the data. If many are collecting the data, however, the work will progress much faster. In the latter case some one person should be in charge of the group and check each schedule as it is completed. In this way errors can be uncovered and differences of interpretation between the various workers detected. When this plan is followed, workers of less experience can be employed under the charge of an experienced leader. Expense is thus reduced and training given to new workers.
The number of workers which.can with advantage be employed will de- pend upon the nature of the schedule. If the work is largely routine, a large number may be employed; if however, it is a matter of ap— praisal, better results will be secured with fewer workers or more experience. Recognition should be given the fact that frequently the worker who will excel in the interpretation of data may not excel in its collection. This may make desirable a division of labor. Un- tactful work in the collecting of data has diminished the usefuleness of many a study and many a research organization. Appraisal and When all the data that are available or planning 9: analysis. at that time will permit have been as— sembled, an appraisal should be made Of its adequacy. This should engage the attention of both those who collected. the data and of those who- are to analyze them. A thorough appraisal of the data before working them over will save much time and wasted effort in their analysis and interpretation. While frequently the error is made of attempting to get more out of the data than their extent or character would justify, a more frequent error is in not fully utilizing that which is available. _Such an apiraisal may dis— close gaps which- still exist in the data or may reveal possibilities of analysis beyond that which was originally intended. If possible to do so. these gaps in the data should be filled, so that the final results may be more complete. Or as will be more frequently the case, such an appraisal will reveal fields for further work. FollOwing this appraisal, the plan of analysis should be revised as necessary. 35 ._Drawing conclusions. All through the analysis, but especially toward the end, when the results begin to appear. interpretation is called for. This requires vision, imagination and caution as well as a knowledge of the field of work and the methods of collect- ing and analyzing the data. 
The judicious weighing and summarizing of the results is the capstone of a research project. The tragedy of many a project has lain in the collecting of a large volume of facts which were not later adequately analyzed and carefully interpreted. The subject of inferences from data will be handled in a later section. Wherever possible, first inferences should be checked by further analysis of the data, as well as by submitting them to other workers in the same or allied fields. No inference which appears unreasonable should be accepted merely on the basis of the data.

PART ONE

A. STATISTICAL METHOD.

Statistical method is taken up first partly because it is the main resort of agricultural economists and partly because it is much simpler to present the other four methods after than before statistical method.

(a) What is Statistical Method.

Statistical method includes the collection, summarizing, analysis, judgment and presentation of numerical data of mass phenomena.(x)

The first part of this statement names the five main divisions of statistical method. There is a question as to whether this statement includes as specifically as it should the matter of choice of units and measures and measuring. The data of statistics are numerical, although this feature is not conspicuous when the method is applied informally with a limited number of cases.

The phenomenon observed usually involves the behavior or attributes of a large group of individuals. Hence the "mass phenomena." This statement is most clearly applicable to time-series changes. Seldom does any economic change take place that does not have wide ramifications involving whole masses of the social substance. Statistical method is also involved, however, in such problems as summarizing physical measurements or the results of simple psychological experiments. Mass as applied to such situations means simply a number of cases.
(x) Professor Warren M. Persons' statement.

As already pointed out, the essence of such data is that "in the actual universe, nothing is the same, nothing can ever be actually repeated." They are like this because in each individual case there is at least a slightly different combination of attributes or causes. No two sunrises are exactly alike, to take an extreme case. But thus far none of them has been sufficiently different to change the main feature of the event. In social phenomena, these differences are always affecting the main event being observed, sometimes even preventing it altogether.

This characteristic of statistical data furnishes statistics with its essential problem, that of "generalizing imperfectly comparable data." The method of statistics for handling this problem is to give to these "unlike or related items a common denominator and thus a common classification or perspective."(x) The farms in an area surveyed differ in a thousand particulars. Statistical method abstracts certain attributes common to all these farms, and analyzes variations in these and associations between them. The different marketing years for potatoes have many special attributes. Statistical method abstracts certain of these. The problem of statistics is to make some statements on the basis of these abstractions that in the first place describe the sample, and in the second place indicate probabilities for a larger universe. The difficulty of this latter is that the rest of the universe is never exactly like the part of it that has been examined.

The following excerpt from "The Grammar of Science" shows the relation of statistics to logic:

"Such a contingency table as we have schemed above is the numerical syllogism of observational science, which replaces for all its purposes the barren syllogism of the old Aristotelian logic.
We do not say, 'Some of B is A,' but we state numerically how much of each class of B is associated with each category of A. In actual practice, of course, it is impossible to form a table of the whole population or the whole universe of A and B things. We take here as elsewhere a 'sample' to illustrate that universe, and we have to take great precautions not only that this is a true sample, but that our inferences from the sample may be applied to the universe under discussion. The theory of samples - their probable errors and legitimate use - is the chief topic of modern scientific statistics."(xx)

(x) Bernard, op. cit., p. 314, 519.
(xx) "The Grammar of Science", p. 160-161.

There is, however, this difference between the syllogisms of logic and those of statistics - those of logic, once the premises are accepted, lead to incontrovertible conclusions, whereas those of statistics, as already pointed out, predicate only probabilities.

Statistical procedure assumes premises also. These enter in partly by the way of mathematics and partly in connection with the nature of the phenomena. Statistical method uses mathematics as a tool. There are mathematicians who go so far as to identify statistics absolutely with mathematics, to say that statistical analysis leads by incontrovertible steps to the same iron-clad conclusions as formal mathematics. The economic statisticians, however, generally take the position that the mathematics of sampling and error and inference thus far developed, which holds rigorously only for pure chance and "simple samples" of entirely unrelated events, is inadequate for the needs of economic phenomena, and that there is little prospect of mathematical analysis soon being developed which will be adequate. Once the assumptions of pure chance are violated, inference has to proceed along other lines than those based on simple mathematical probability.
The nature and extent of the relatedness between the events, or occurrences of attributes, become significant factors in the generalization for the sample itself and the inferences from the sample for the larger universe. Probable inference gives place to statistical inference.(x) Mathematics may in time set up formulae comparable to those of probable error and probable inference for certain significant typical sets of events other than those of pure chance; but it has a long task ahead if it expects to provide for all the contingencies which social phenomena offer. When mathematics does so provide, it usually sets up so many unreal assumptions that the formulae are of little applicability.

To the extent that statistics makes use of the mathematical formulae of pure chance, it must accept all their assumptions. How the logic of mathematics starts with assumptions and introduces more at nearly every step is not generally realized by the ordinary manipulator of formulae. This is why men trained to a limited extent in mathematics with no understanding of the philosophy of it rely so implicitly upon formulae when they enter the field of statistics.

(b) Units and Measures - Definition of Terms.

The importance of having definite and clearly stated statistical units and measures and terms and concepts has already been sufficiently emphasized. The purpose of this section is to state the principal considerations involved in setting these up and to apply these considerations to the field of agricultural economics.

Problems of this kind are especially difficult in the social sciences because of the great variability of individuals and the complexity of phenomena. This is amply illustrated by the efforts of public agencies to define income and determine the value of public utilities; of economists to agree upon such concepts as land, capital and profits.

Developing such units and concepts is one of the major tasks of any young science such as agricultural economics.
The workers in new fields are always eager to set up units and measures and concepts and to have these standardized and generally accepted. No argument is needed as to the importance of having a large number of these in common use and generally understood. The sooner such a condition is reached in any science the better. But there are also great losses from too early standardization. Progress along this line must come as a process of evolution, each new generation of units and measures and concepts being different from the last, but out of this process coming slowly the stating of certain ones in terms that change but little thereafter. It is unreasonable to expect that the first products of such effort will long prove to be entirely satisfactory. If they should happen to come into wide use at the start, the time of their being discarded or modified is likely to be postponed too long, and progress seriously impeded for a while before and just after this happens. For reasons that need not be discussed, such early standardization of certain concepts and measures occurred in the field of agricultural economics, with the usual consequences. There is always danger also that the need for comparable results over a wide area or a long period will unduly influence decisions in this respect. Although uniform practice may be necessary in a large group of similar projects in many states, this must not interfere with setting up improved units and measures for new groups of projects later.

(x) See article by E. B. Wilson in Science, March 1926. Also note by the same in the Journal of the American Statistical Association, June 1927.

For the present, therefore, the great need in agricultural economics in this respect is a careful analysis of units and measures and concepts whenever a new project is outlined,
especially one which is likely to be repeated in many states, and choosing the ones which are best qualified to serve the desired ends; and along with this an extremely careful definition of all such units and measures and concepts, so that others can follow the results closely. Emphasis upon this latter rather than standardization is the need for the present.

Units of enumeration. Let us consider first the type of statistical unit that is involved when an enumeration or count is made, as when the census is taken, or automobile accidents are recorded. With our increasing efforts to record current developments, this is becoming an important feature of research. What shall be defined as a "sale" when transfers of farm real estate are recorded, or as a "forced sale"? Shall large tracts of cut-over land be classed as farm land in such sales records, likewise lake-shore property? Census enumerations furnish the outstanding example of the need of such units - a farm, a farm family, a mortgaged farm, a tenant, etc. Following are a few important considerations with respect to such units:

1. All possible doubtful or border-line cases must be anticipated before the enumerating and recording is begun, and a definition, and frequently a set of instructions in addition, prepared for the guidance of the enumerators or recorders, and for permanent record. This usually requires that the definitions be tried out rather extensively before they are put into final form. For example, such cases as the following may need to be provided for in the definition of a farm family: a son who lives with his parents and operates another farm nearby; a family that lives in town but has a farm in the country upon which nearly all the labor is hired, the husband having more or less of a city occupation besides.
If the latter is ruled out because the family is "not living on the farm", what will be done with those families which live in a village but drive out to the farm to work? The reports of such enumerations should include a full statement of the procedure followed.

2. Such definitions will often need to be arbitrary to a certain degree. For example, a family which earns its living partly from factory work and partly from farming operations, living possibly either in a village or on a farm nearby, has as much right to be classed as having manufacturing as farming as its occupation. In an enumeration, it must obviously be classified as one or the other, although sometimes double counting or an intermediate classification is possible. Arbitrary demarcations which are established in such cases should be logically consistent with the purposes of the enumeration, follow current usage, and be workable. These several requirements may, however, conflict with each other. For example, current usage may be illogical or inconsistent with the purposes of the enumeration. An example of inconsistency is the exclusion of rented farms from the mortgaged group; but thus far no other arrangement has proved workable.

3. The units should be named according to current usage if possible, so that when the data pass into popular use without the technical definitions attached they will not be used incorrectly. There is a question as to whether the federal census definition of a farm does not include a different sample of establishments than the popular concept, particularly inclining more to those with inconsequential farming operations. The classification "gainfully employed" is commonly recognized as a misnomer. "Income" is a dangerous term to use in any except the currently accepted sense - as witness the extent to which "labor income" data have been used by the general public as evidence of agriculture's well-being.

4.
In planning units and measures, some attention should be given to the probable use of the data, the education and understanding of those who will use them, and their disposition to biased interpretations. If per-capita farm incomes are likely to be compared directly with per-capita urban incomes, the definition of "gainfully employed in agriculture" should take this into account if possible.

5. Comparability is much more important with units of enumeration than with other statistical units and with measures. The use of census data and official records is largely in comparisons between areas and periods, and if the units are different, the data may be nearly valueless. Complete comparability is of course impossible in most cases. A good definition of a dairy cow in one period does not fit as well in some later one. The larger the area included, the poorer the fit of the unit in any part of it. The world census of agriculture will raise many difficult questions of this nature.

Units of samples - definition of the universe. The foregoing discussion assumes a complete count of all individuals or cases. Much of statistical analysis is based on a limited number of cases taken at random or selected from the whole number. This introduces a few special considerations. In the first place, the "universe" set up to be sampled is often less broad than that enumerated. Studies based on samples or representative selections are likely to be confined to a certain class, or to exclude the sorts of cases that are unusual in the area. For example, a study may be limited to dairy farms only, or may exclude certified milk farms. This makes the need for definition of the unit especially urgent. Such studies can be made to conform more nearly to scientific standards; less attention need be given to popular consumption. Comparability with other data is much less significant, but still important in its place.
Definition of the universe is one of the first things to be done in planning an investigation. If the collection of data is to be a part of the study, the universe should be defined before the collection of data is begun. If secondary data are to be used in the study, the universe should be defined before the data to be analyzed are selected.

To be sure, problems may arise in connection with the collection of the data that will necessitate a revision of the definition of the universe; and when secondary data are used, the lack of data drawn from the universe as originally defined may make a revision necessary. For example, in a study of incomes from farming and their relation to various factors undertaken by the Bureau of Agricultural Economics some years ago, the original intention was that the universe should include all farms in the United States, and that the data would be collected by sending questionnaires through the mail to the crop reporters of the Bureau, asking them for the details of their business for the preceding year. An analysis of the replies to the questionnaire showed, however, that practically no reports had been received from tenant farmers, due to the fact that very few tenant farmers are asked to serve as voluntary crop reporters. It was further found that very few replies were received from farmers operating small farms. Since it was impracticable to obtain comprehensive mailing lists of tenant farmers and of farmers operating small farms, it was necessary to revise the definition of the universe and include in it only owner-operated farms of more than ten acres in size.
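The examination of replies described above, to see whether every class of farm in the universe is represented, can be reduced to a simple comparison of shares. The following Python sketch is illustrative only: the class names, every count and the threshold of one-half are invented for the example and are not figures from the Bureau's study.

```python
def coverage_ratios(universe_counts, sample_counts):
    """Share of each class in the sample divided by its share in the universe.

    A ratio near 1 means the class is represented about in proportion;
    a ratio near 0 means the class is nearly absent from the sample.
    """
    universe_total = sum(universe_counts.values())
    sample_total = sum(sample_counts.values())
    ratios = {}
    for cls, n_universe in universe_counts.items():
        if n_universe == 0:
            continue  # a class absent from the universe has no ratio
        universe_share = n_universe / universe_total
        sample_share = sample_counts.get(cls, 0) / sample_total
        ratios[cls] = sample_share / universe_share
    return ratios


def underrepresented(ratios, threshold=0.5):
    """Classes whose ratio falls below the (arbitrary) threshold."""
    return sorted(cls for cls, r in ratios.items() if r < threshold)


# Hypothetical counts, for illustration only.
universe = {"owner, over 10 acres": 5000, "owner, 10 acres or less": 1000, "tenant": 4000}
replies = {"owner, over 10 acres": 940, "owner, 10 acres or less": 40, "tenant": 20}

ratios = coverage_ratios(universe, replies)
weak_classes = underrepresented(ratios)
```

With figures of this kind the tenant and small-farm classes fall far below proportion, which is the situation that would call either for further collection or for a narrower definition of the universe.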
This study might have been begun, as many have been, without having attempted to define the universe to which the results were to apply, but if so, and if the data secured had not been studied to determine if all classes of farms were represented, the investigator would never have known to what extent he was warranted in generalizing from the results of his sample.

To paraphrase a passage from Mr. Bowley's text on the "Elements of Statistics": One of the first things to be done is to determine what is to be measured, and one of the last things to be done is to determine what has been measured.

Sub-classes. Definition of units is obviously closely associated with the problem of classification. In the case of an enumeration, the sub-divisions of a main class must be so defined that: (a) they are mutually exclusive, and (b) they include somewhere all cases comprised under the definition of the main class.

The problem of sub-classes arises frequently in connection with types of land or soil or grades of commodities. The differences between classes may be essentially quantitative, but lack of a measure compels resort to qualitative distinctions. The problem of definition then takes on the form of market grading.

Measures. Statistical analysis usually involves the further steps of measuring some of the attributes of the individuals falling within the classes defined, grouping these attributes into frequencies, and in many cases discovering to what extent and at what rate variations in amounts of these attributes are associated with variations in amounts of other attributes. Perhaps supply and price are to be measured and associated, or rainfall and yield. Some of the least-solved of all problems in economics lie in this sphere.(x)

At least two problems are always involved in such an undertaking: one, the definition of the attribute, and the other, the finding of a unit of measure for it.
Thus efficiency has to be defined as an attribute before a unit of measure can be found for it. In many cases, the attribute itself cannot be measured, but some approximate indicator of it is taken instead - as death-rates are sometimes accepted as a measure of healthfulness. This is the third problem connected with measures.

Let us first consider the problem of definition of the attributes. Take size, for example. Is it to be a physical concept or an economic concept? If a physical concept, is it to relate to mere area or volume, or to power to turn out a physical product, or to ability to utilize a physical input? The answer must depend upon the objective of the analysis. If it is merely description, then the more frequency distributions according to different concepts of size, the more complete the description. If, however, certain relationships are to be analyzed, then the concept must be defined in consistency therewith. If, for example, size is wanted as a measure of capacity to use other factors to advantage, the definition will probably need to be in terms of the input factors. If change in size of the farm unit from decade to decade is desired, perhaps acres or value of output will serve best.

(x) See Professor Irving Fisher's article on the measurement of utility in the recent John Bates Clark memorial collection.

The concept of efficiency is one which is loosely used. Strictly defined, it refers to the output returned per unit of input. An efficiently managed farm household is one which yields a large output in satisfactions per unit of the cash expenditures, farm supplies and household labor that go into it. Productivity is a more comprehensive term, taking account of capacity, or the amount of the input factors that are handled, as well as of the input per unit of output.

Cost is a term that usually needs defining.
Applied to a production problem, is it the value of a certain fixed input that proves to be necessary to call forth production - that is, "necessary expense" - or is it merely the value of whatever inputs happen to occur? Applied to living, is it the value of a certain "necessary" budget, or is it the value of goods actually consumed in any given case? The term cost is used loosely in both senses, often in the same study.

Value in the economic sense is the most difficult of all attributes to handle, and at the same time the attribute which economic statistics is called upon most often to handle. As with many other attributes, it frequently is not possible to measure value with sufficient precision for scientific purposes. Either some other way must be found for handling the problem, or it cannot be solved at all. Public utility commissions and other such public agencies are frequently constrained by law to make valuations, and the courts have under such pressure developed a rough set of working rules and definitions to be used in such situations. But no economist pretends that the values so obtained have even approximate scientific validity in the majority of cases. The ruling political philosophy of the courts has more to do with the valuations than any scientific considerations. It is beyond the scope of this handbook to discuss theories of value. Obviously all that has ever been said that illuminates the subject is pertinent here.

Another attribute which needs defining is "the standard of living" of a family or group, now most acceptably defined in the words of Professor Davenport as the goods which we feel the deprivation of if we do not have them - a sort of customary plane and content of living.

The foregoing illustrate the problem of defining the attribute to be measured. Next is the problem of the measure for it. The efficiency of a producing unit may show itself in output per unit
of input; but the efficiency of the input factors affects the size of the product too, and there seems to be no way of telling how much of the output is due to the efficiency of each. Practice has been to attribute all to the factor whose efficiency was being analyzed at the time - to attribute it all to the land at one time, to the farmer at another, to the cows at another, and to the good varieties of crops at another. Another procedure has been to assign values to the other inputs and use the residual as a measure of the efficiency of the one in question. Thus "labor income" was conceived. It could be used with equal validity to get farm efficiency and cow efficiency if a satisfactory valuation could be obtained for the proprietor's contribution to the product.

"Cost of living," defined as the sum of the values of goods actually consumed, has been used both as a measure of efficiency of rural living and of the plane of living. A low "cost of living" may indeed reflect efficient household management. But it cannot at the same time indicate a low plane of living.

According to the most generally accepted analysis at present, the value of goods entering into production is to be measured in terms of the other alternative uses to which they could equally well be put. These other uses could be working at other employments in some cases, producing other crops in other cases, selling in the market-place in others, leisure and recreation in others, etc. It is highly important that this principle of value be applied in a realistic, that is, scientific way. In the case of cash expenditures for productive goods, there is no question about the reality of the alternative-use value. It can be assumed that the market price reflects a balancing of all the alternative employments.
The other employments, however, are not so real in the case of proprietor's labor. If the problem under consideration is necessary price for farm products to secure an adequate volume of production, the comparison of alternatives centers around the choice of each particular person as to whether he wants to be a farmer or not. The alternative-use value in such a case is what each farmer could earn in other occupations, or perhaps in other systems of farming. In this comparison, however, differences in living conditions, prices of food, farm-produced supplies, preferences as to occupation and a score of other things must be taken into account. Seven hundred dollars on the farm may outweigh twice that in the city.

If the problem under consideration is not whether or not to keep these men farming, but rather the more usual one of the best way to utilize their effort on their present farms, then the only realistic analysis is in terms of other employments while remaining on these farms, and these employments in most cases are in the production of other farm products.

Likewise, family labor - assuming the second type of problem, utilization while remaining on these farms - will find most of its other employments at other lines of activity on the same farms, although occasionally there will be an otherwise equally desired alternative of working for a neighbor or in a factory or mine nearby. Valuing family labor at what it would cost to hire an equal amount of work done is a misapplication of the principle of alternative-use value. It does not represent the value of the alternative employments of these particular workers.
Marketable farm-produced feeds are properly valued at what they will sell for in the local markets if, and only if, first, a system of farming in which these particular feeds were sold for cash would maintain itself over a period - otherwise the requirement "equally well" is not met; or, second, a system would maintain itself in which it was the practice to sell for cash when the market was right and not at other times. Any other scheme of valuation is hypothetical and unreal.

The alternative-use value of farm land in any given use is the net value of its product if planted to the most available other crop for the same place in a crop rotation, differences in effect on plant food supply, in effect on soil texture and weed and disease control, etc., having been included in the net value; or perhaps as based on the same analysis applied to whole crop rotations taken as alternatives.

There are problems of a similar nature in using the alternative-use value as a measure of the value of the labor of chores and field work, of labor at different seasons of the year, and even at different days of the week - in case it is desired to value the inputs of different farm products separately. The alternative-use values on a farm may even vary during the day according to the weather.

Alternative uses are certainly different for each farm. It is more necessary to adjust valuations of agricultural productive agents according to conditions on each farm than it is valuations of factory productive agents according to each city. The individual farm is the immediate market center for valuations of the productive agents used on a farm - except when the problem is that of necessary price to keep up the supply of farmers.

The third problem is that of using indicators of the attributes in cases where the attribute itself cannot be measured. Thus rate at which land is cleared may be used as an indicator of progress of individual settlers.
Obviously it will not describe the progress of those farmers who give more attention to buildings than to land clearing. Number of families having heating systems or bath-tubs in their homes may be equally poor measures for the plane of living of different communities. It is not well to rely upon any one measure of this kind; and sometimes it is possible to use a considerable list of them.

The essential requirements of good measures are that they shall relate closely to the attribute in question, and that variations in them shall be strictly proportional to variations in the attribute in question. Suppose the problem is to compare the climate of areas as to suitability for farming. Average frost-free period is one measure commonly used. But it has several defects - one that it counts too heavily the cool but frost-free springs and autumns of lake-shore areas; another that it does not adequately include effective heat and sunlight. Perhaps the following simple measure would be more adequate in many cases: number of years in twenty in which corn matures before frost.

Composite units. Efforts are constantly being made to set up composite units - such as animal units, productive work units, adult equivalents, total cost per unit. There are great dangers to be avoided in such composites. One is that of confusing or burying the attributes or conditions which it is desired to study - as a total cost per unit may largely confuse the variations in feed rations, which are usually the most essential thing to analyze. Another is that the basis of combination may not relate to the attribute being studied - for example, the adult-equivalent unit used in rural living analysis may be based on food consumption only. A physical basis of combination may not fit in economic analysis, and vice versa.
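The first of these dangers, that a composite buries the very variation to be studied, can be shown with a small numerical sketch in Python. The two farms and all cost and output figures are hypothetical, chosen only so that the composite comes out alike while its parts differ.

```python
# Hypothetical annual costs per farm, in dollars, and milk output in hundredweight.
farms = {
    "Farm A": {"feed": 1200.0, "labor": 600.0, "other": 200.0, "milk_cwt": 1000.0},
    "Farm B": {"feed": 800.0, "labor": 1000.0, "other": 200.0, "milk_cwt": 1000.0},
}


def cost_per_cwt(record):
    """Break cost per hundredweight into its components plus the composite total."""
    items = {k: v / record["milk_cwt"] for k, v in record.items() if k != "milk_cwt"}
    items["total"] = sum(items.values())
    return items


per_unit = {name: cost_per_cwt(rec) for name, rec in farms.items()}

# Print each farm's component costs and composite total, rounded to cents.
for name, items in per_unit.items():
    print(name, {k: round(v, 2) for k, v in items.items()})
```

On these invented figures the two farms show the same total cost per hundredweight, yet one spends half again as much on feed as the other. A study of feed rations based on the composite alone would find nothing to explain.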
Coefficients. The measures used often take the form of coefficients, that is, ratios - such as the birth rates, death rates, suicide rates and the like with which we are familiar. Research workers are constantly setting up new ones suited to the needs of their projects. Even the conventional ones are constantly revealing weaknesses in use; and the new inventions often prove entirely untrustworthy. Statisticians have developed a number of rules that assist as safeguards if observed closely. Every coefficient in effect has a numerator and a denominator. The numerator in a death rate is the number of persons dying in a given period, and the denominator is properly a certain number of persons, taken as a base, who were exposed to the event of death during the period. The rules mostly relate to the choice of the numerator and denominator.

1. Select the quantities used as numerator and denominator, in each case, so that the quotients derived may be legitimately compared.(x) To illustrate, the intensity of dairying in different regions might be measured in terms of coefficients having either number of dairy cows, or milk production in gallons, or butterfat production in pounds, as numerator; and having either land in farms, improved land in farms or crop acres as denominator. No one of these, as a matter of fact, would furnish a legitimate base for a comparison. Percentage of farm income from dairy products would give different results than any of the above, but would not take account of the small total product of rough or sandy areas. When no coefficient can be devised that meets requirements, it is best to use two or three different ones.

In other cases, a reasonably close approximation can be obtained by "refining" the group included in the denominator until it represents conditions closely identical with those specified in the numerator.
Thus per-capita incomes of rural and urban groups become more comparable when reduced to equivalent age and sex distributions, or geographic conditions. Per-capita farm incomes for the nation as a whole reflect southern conditions much more largely than do per-capita urban incomes for the nation taken as a whole. The geographic conditions are not equivalent.

(x) King, Elements of Statistical Method, p. 42.

2. The area or period included in the denominator should be consistent with the subject of the numerator. For example, to compute percentage of land in a certain crop and include considerable land where the crop could not possibly be grown would vitiate any comparisons. A rate of taking land into farms might properly exclude forest reserves or even large sections of swamp, desert or rock outcrop.

3. The scope of the numerator and the denominator should be as nearly identical as possible with the attribute or relationship being examined. Thus if intensity of cultivation is the attribute, the data and measures used should reflect this only.

4. Coefficients which mix several attributes or relationships should be avoided if possible. For example, "crop acres per man" may be high because of growing labor-extensive crops, or of doing relatively little crop work per acre on a given crop (e.g., cultivating the corn only a few times), or of efficient use of man labor (large machinery units, keeping men employed, etc.). Some of these are essentially opposing relationships. Correlations between crop acres per man and output are therefore likely to give inconsistent results.

5. The names applied to these coefficients should be carefully chosen. For example, the "purchasing power" coefficients have often been misunderstood.
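The refinement described under the first rule, reducing per-capita incomes of two groups to an equivalent age distribution before comparing them, can be sketched as follows. This is only an illustration under stated assumptions: every figure, the three age classes, and the function names `per_capita` and `standardized` are hypothetical and are not taken from the text.

```python
# All figures below are hypothetical; only the method is of interest.

def per_capita(groups):
    """Crude per-capita income: total income divided by total persons."""
    total_income = sum(g["persons"] * g["income"] for g in groups.values())
    total_persons = sum(g["persons"] for g in groups.values())
    return total_income / total_persons


def standardized(groups, standard_shares):
    """Per-capita income recomputed on a common (standard) age distribution."""
    return sum(standard_shares[age] * g["income"] for age, g in groups.items())


# Persons and average income per person in each age class (assumed numbers).
rural = {
    "under 15": {"persons": 400, "income": 0},
    "15 to 64": {"persons": 500, "income": 600},
    "65 and over": {"persons": 100, "income": 300},
}
urban = {
    "under 15": {"persons": 250, "income": 0},
    "15 to 64": {"persons": 680, "income": 900},
    "65 and over": {"persons": 70, "income": 400},
}

# An assumed common age distribution applied to both groups.
standard = {"under 15": 0.30, "15 to 64": 0.62, "65 and over": 0.08}

crude_gap = per_capita(urban) - per_capita(rural)
refined_gap = standardized(urban, standard) - standardized(rural, standard)
print(f"crude gap: {crude_gap:.0f}, gap on equal age distribution: {refined_gap:.0f}")
```

In this made-up case part of the crude difference is due only to the rural group having more children, which is the kind of mixing of attributes the rules warn against.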
The foregoing does not introduce discussion of the standardized coefficients which have been developed for general statistical measurement - arithmetic means, modes, index numbers, standard deviation, coefficients of variability, correlation and regression, etc. These will be handled in their proper setting in the handbook.

(c) Errors in Data.

Statistical literature contains altogether too little discussion of the problem of error in the data themselves, of the types of such error and their causes, of methods of testing it, and of the effect of it upon the validity of conclusions. The usual textbook in statistics distinguishes between biased and compensating errors and lets the matter rest except for an occasional warning that "accuracy in the original data is assumed."

The distinction must be borne in mind at this point between primary statistics, those collected by those making the study, or under their direction, and specifically for the purpose of their study; and secondary statistics, those collected by other agencies and for other or more general purposes. Census data, market quotations, and data of carlot shipments are secondary statistics.

Let us first consider errors in primary data and list some of the important types. The whole subject will be taken up more specifically later when the various methods of collecting data are introduced.

1. Misunderstanding of the question or definition - see discussion of this under "Surveys" and "Questionnaires". Data affected in this way must in most cases be thrown out.

2. Closely related to the foregoing, reporting for a different unit or a different set of facts than those intended. Thus an inquiry about taxes may be understood to mean property taxes by some, all taxes by others.

3. Overlooking part of what should be included in the answer.
Inquiries relating to such things as receipts and expenditures, number of livestock, etc., are most likely to be affected in this way. No doubt the census enumerations generally omit important items of production and sales. The usual protection against this is to list all the items entering into the result separately.

4. Confusing different years - reporting 1927 taxes in place of 1926. This becomes serious if data are wanted for a series of years.

5. Failure to remember amounts correctly - a bias upward or downward. For example, crop reporters may generally overstate last year's crops as compared with this year's.

6. Intentional under- or over-statement, because of fear of possible relation to taxes or effect on market prices.

7. Unconscious psychological bias - tendency to exaggerate that which is in the center of attention. For example, the county agents and bankers in a cut-over region where a land-clearing campaign was under way reported almost as much land cleared in two years as the census showed for five.

8. Simple errors of estimate, such as of crop condition, or distances to town, or weight of animals - the kinds which usually compensate.

9. Round number estimates - often compensating in a large sample.

10. Mistakes of actual measurement or counts.

Errors in existing secondary data may have arisen in any of the foregoing ways, and in addition in the following:

1. Recording of transactions or operations.

2. Reporting of observations - illustrated best in the case of price quoting.

3. Computations of totals, averages, etc.

4. Transcription from one record to another.

One who uses secondary data must study them carefully to discover all possible errors in the obtaining and recording of the data in the first place, and in the later work upon the data.
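The difference between compensating errors (such as the simple errors of estimate in No. 8) and biased errors (such as the overstatement in No. 5) can be seen in a small simulation. The sketch below is illustrative only: the yields, the size of the errors, the 10 percent overstatement and the function names are all assumptions made for the example.

```python
import random
import statistics

rng = random.Random(1928)

# Hypothetical "true" yields in bushels per acre for 2000 farms.
true_yields = [rng.uniform(20, 60) for _ in range(2000)]
true_mean = statistics.mean(true_yields)


def compensating_error(value):
    """Error of estimate that is as likely to be too high as too low."""
    return rng.gauss(0, 5)


def biased_error(value):
    """Same scatter, plus an assumed general overstatement of 10 percent."""
    return 0.10 * value + rng.gauss(0, 5)


def reported_mean(values, error):
    """Mean of the figures as reported, i.e. true value plus error."""
    return statistics.mean(v + error(v) for v in values)


mean_compensating = reported_mean(true_yields, compensating_error)
mean_biased = reported_mean(true_yields, biased_error)

print(f"true mean:                 {true_mean:.2f}")
print(f"with compensating errors:  {mean_compensating:.2f}")
print(f"with biased errors:        {mean_biased:.2f}")
```

With a large number of reports the compensating errors nearly cancel in the average, while the bias passes into the average almost undiminished, however many reports are collected.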
(d) Sampling (x)

"The theory of samples," says Karl Pearson, "- their probable errors and legitimate use - is the chief topic of modern scientific statistics." (xx) It is such because most of the data we work with are only a sample out of the whole "universe" in which we are interested. We cannot take time to analyze the activities of all the dairy farms in the world, or even in a single region. Accordingly we study a small number of them. How true of the whole group is what we have in this small sample of it? This is indeed a major problem of statistics.

The case is even clearer with time-series data. All we can possibly have in this case is a sample out of the infinite.

In some cross-section studies using the data of the census enumerations, it is possible to include the whole universe. Such operations are statistical even though no sampling is involved.

Sampling error, like errors in data, is a subject inadequately discussed in the textbooks in statistics. The early American textbooks, like King and Secrist, gave only passing reference to it. One has to go to Yule's Introduction to the Theory of Statistics to find a discussion of it which is at all sufficient for careful scientific work in the field of agricultural economics; and even this discussion leaves large gaps in analysis still needing to be filled. (xxx) The principal difficulty with all these discussions is that they talk in terms of perfect or "random" samples only, and most of the samples which we have to deal with do not fulfill these requirements. We need to know how valid are the conclusions under various practical conditions of sampling. Until measures are set up more applicable than those hypothecated in the assumptions of random sampling and pure chance, we shall have to rely a great deal on empirical tests.
(x) op. cit. p. 161.
(xx) Dorothea Kittredge, University of Minnesota, reviewed this section for the Committee.
(xxx) The following recent textbooks have useful discussion of the problem of sampling:
    Jerome, pp. 13-23, 165-178.
    Bowley, pp. 277-286 (Elements of Statistics).
    Crum and Patton, pp. 118-125, 196-210.
    Mills, pp. 548-561.

As a matter of fact, the term sampling is almost too narrow to include all the practices followed in the matter of obtaining a small number of cases to use in lieu of the whole. Following is a list of the practices followed in cross-section studies of the economic phases of agriculture:

1. Pure random sampling - as when the names or numbers designating all the farmers in the region are written on slips of paper and a given number of slips is drawn (from the hat).

2. Every third, fifth, tenth or some other ordinal of the farms is checked off on a complete list of all farms in the region constituting the universe, and only these are visited.

3. "Typical" areas in the region are selected and all farms in the area marked off are visited except the ones ruled out as outside the universe because abnormal.

4. "Typical" areas are selected and the field men work out from a central point till they think they have enough cases.

5. Like the above, except that a good many farms are omitted because hard to reach or small, or have other activities mixed with farming; or because the proprietor would not give the data.

6. Like the above, except that the area selected is sectioned or banded and only these sections or bands are covered.

7. The whole region is laid off in bands or sections of uniform size and all farms in part of these bands or sections are covered.

8. What the Bureau of Crop and Livestock Estimates calls a "stratified" sample is taken - that is, the area is laid off in districts and a certain number is taken from each district. The districts may be counties or townships.
The number taken may be proportional to the importance of the district, or the results may be weighted on that basis in case the number of returns is not properly proportioned.

9. The stratification may be not in terms of geographic areas, but of the various sets of conditions found in the area. Thus each of the conditions of nationality, type of farming, and soil in a state might be represented in a sample of rural living in a state. The representation may or may not be proportional to the importance of these conditions.

10. What seem like "typical" farms, or marketing business units, or rural villages, are selected.

11. An attempt is made to select a "typical" array of farms or marketing business units - some of all the important types and of all degrees of success.

12. Bankers or other local people are asked to send in lists of "representative" farmers, landlords, etc.

13. The cases select themselves according to whether or not they respond to a questionnaire or keep the records desired or make the desired periodic reports.

It is impossible to say categorically that one of these methods is better than another. The answer depends upon the objective of the inquiry. Some of the objectives for cross-section analysis, classified according to pertinency on this point, are as follows:

1. To get an accurate description of conditions, as of rural living, farmers' incomes or costs, or crops being grown. With such an objective, Nos. 1, 2, 7 and 8 offer the best basis for adequate samples, although some qualifying statements will be made later on these methods. Obviously Nos. 5 and 6 and 10 to 13 will not serve at all; Nos. 3 and 4 will serve poorly; and No. 9 only a little better.

2. To obtain a measure of change in conditions rather than absolute conditions. The absolute level obtained in a sample may be too high, but the change from year to year may conceivably be normal for the whole group.
3. To obtain evidence of relationships between attributes or variables and to measure the extent of this relationship and the regression of one on another.

4. Merely to list the conditions that exist in a region; for example, the leasing systems to be found in it, or the different varieties of corn being grown. One may be interested merely in knowing that certain plant diseases occur in an orchard without knowing exactly the relative prevalence of each. The question may properly be raised as to whether such data are statistical.

The practices in time series "sampling", if the term can be applied to such data, can be briefly stated as follows: When yearly averages are used, ordinarily all the years are taken for which data are available. Sometimes, however, a cut-off as of a given date is used in place of the yearly average. Yearly, monthly and weekly figures may be either averages, or cut-offs on a given day or series of days. Daily measures may be "closing" price or some other cut-off. "Average cost to packers" is an example of a daily average. The quality of randomness is obviously lacking whenever cut-offs are chosen; but this does not necessarily mean lack of representativeness. All of the four objectives for cross-section analysis apply to time series analysis, but not in exactly the same way.

Simple Sampling

It seems advisable for the sake of the later discussion to restate here in as simple a form as possible the requirements of a simple sample as given by Yule. (x) It should be stated first that these requirements relate to the first of the four objectives above named - namely an accurate report of conditions in the universe being studied. A sample adequate for this will meet the requirements of the other three objectives also; but it may be possible to meet some of the other three with less than a "simple" sample.

(x) Ch. XIII, "Simple Sampling of Attributes". Yule prefers the term simple to random because more than mere randomness is required.
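Three of the practices in the list of sampling practices above lend themselves to a short illustration: No. 1 (drawing slips from the hat), No. 2 (checking off every k-th farm on a complete list) and No. 8 (a stratified sample with the number from each district proportional to its size). The sketch below is a minimal one under stated assumptions: the farm numbers, district sizes and function names are hypothetical, and proportional quotas are simply rounded, so that in general they need not add exactly to the intended total.

```python
import random


def random_sample(farms, k, rng):
    """Practice No. 1: draw k slips from the hat."""
    return rng.sample(farms, k)


def every_kth(farms, step, start=0):
    """Practice No. 2: check off every step-th farm on a complete list."""
    return farms[start::step]


def stratified_sample(farms_by_district, total, rng):
    """Practice No. 8: take from each district a number proportional to its size."""
    universe = sum(len(farms) for farms in farms_by_district.values())
    sample = {}
    for district, farms in farms_by_district.items():
        quota = round(total * len(farms) / universe)  # rounded quota (assumption)
        sample[district] = rng.sample(farms, quota)
    return sample


rng = random.Random(7)

# Hypothetical universe: 1000 farms numbered 0..999 in three districts.
all_farms = list(range(1000))
districts = {
    "A": all_farms[:600],
    "B": all_farms[600:900],
    "C": all_farms[900:],
}

drawn = random_sample(all_farms, 50, rng)
checked_off = every_kth(all_farms, 20)
stratified = stratified_sample(districts, 50, rng)

print(len(drawn), len(checked_off), {d: len(s) for d, s in stratified.items()})
```

The other practices in the list depend on the judgment of the field men or of local people, or on self-selection, and cannot be reduced to a mechanical rule in this way.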
The general definition of a random sample is one "found in such a way that every one of the individuals in the larger group has the same chance of being selected in the sample, and that the selection of a particular individual does not influence the chance of selecting some other individual". (x)

(x) Crum and Patton, p. 205.

Yule's three requirements of a simple sample specifically stated are as follows:

1. "We assume that we are drawing from precisely the same record throughout the experiment." (p. 336).

If conditions vary between localities included in the universe being studied, the samples of it must all be so taken as to include all these localities. Applied to periods of time, there can be no difference in the period included in the samples if conditions change in any essential way during the whole period being studied. The effect of violating the requirement is stated by Yule as follows:

    If we do not draw from the same record all the time, but first draw a series of samples from one record, then another series from another record with a somewhat different mean and standard-deviation, and so on, or if we draw the successive samples from essentially different parts of the same record, the standard error will be greatly increased.

That is, "the standard-deviation observed will be greater than the standard-deviation of simple sampling." (p. 288). How much it will be increased cannot be stated - it depends upon the amount and the relevancy of the other variations. Yule states that it "may be increased indefinitely as compared with the value it would have in case of simple sampling." (p. 348). The type of series which violates this requirement is known in the literature of sampling as a Lexis series.

2. "We assume not only that we are drawing from the same record throughout, but that each of our cards at each drawing may be regarded quite strictly as drawn from the same record (or from identically similar records)." (p. 336).

"Consequently, if our formulae are to apply in the practical case of sampling, the conditions that regulate the appearance of the character observed must not only be the same for every sample, but also for every individual in every sample. This is again a very marked limitation. To revert to the case of death-rates, the formulae would not apply to the numbers of persons dying in a series of samples of 1000 persons, even if these samples were all of the same age and sex composition, and living under the same sanitary conditions, unless, further, each sample only contained persons of one sex and one age. For if each sample included persons of both sexes and different ages, the condition would be broken, the chance of death during a given period not being the same for the two sexes, nor for the young and the old. The groups would not be homogeneous in the sense required by the conditions from which our formulae have been deduced." (p. 260).

Yule states the effect of a departure from these conditions of simple sampling as follows: "If the average chances are the same for each universe from which a sample is drawn, but vary from individual to individual or from one sub-class to another within the universe, the standard-deviation observed will be less than the standard-deviation of simple sampling as calculated from the mean values of the chances." (p. 288).

From this Yule derives the statement which applies to stratified sampling (pp. 348-9): "If we are drawing from the same record throughout, but always draw the first card from one part of that record, the second card from another part, and so on, and these parts differ more or less, the standard error of the mean will be decreased.
If, to vary our previous illustration, we had measured the statures of men in each of n different districts, and then proceeded to form a set of samples by taking one man from each district for the first sample, one man from each district for the second sample, and so on, the standard-deviation of the means of the samples so formed would be appreciably less than the standard error of simple sampling. The result is perhaps of some practical interest. It shows that, if we are actually taking samples from a large area, different districts of which exhibit markedly different means for the variable under consideration, and are limited to a sample of n observations; if we break up the whole area into n sub-districts, each as homogeneous as possible, and take a contribution to the sample from each, we will obtain a more stable mean by this orderly procedure than will be given, for the same number of observations, by any process of selecting the districts from which samples shall be taken by chance. There may, however, be a greater risk of biased error. The conclusions seem in accord with common-sense."

The type of series which violates this requirement is commonly referred to as a Poisson series.

3. "We assume that the drawing of each card is entirely independent of that of any other." (p. 336).

Yule's illustration of this is as follows (p. 261): "Reverting to the illustration of a death-rate, our formulae would not apply even if the sample populations were composed of persons of one age and one sex, if we were dealing, for example, with deaths from an infectious or contagious disease. For if one person in a certain sample has contracted the disease in question, he has increased the possibility of others doing so, and hence of dying from the disease. The same thing holds good for certain classes of deaths from accidents, e.g. railway accidents due to derailment, and explosions in mines: if such an accident is fatal to one person it is probably fatal to others also,
and consequently the annual returns show large and more or less erratic variations." (x)

The effect of positive correlation between the drawings is to increase the standard error, that is, make the observed standard error greater than that of simple sampling; the effect of negative correlation the opposite. The application of this to time series data is obvious. Ordinarily the events of one period are closely correlated with those before or after. Independence is almost entirely absent.

If all of these three conditions have been met in selecting the sample, the series is usually referred to as a Bernoullian series. In series of this type, it is possible to apply the formulae which have been derived for samples, and find the limits within which statistical measures of different samples in the same universe will fluctuate. Mills summarizes the result by saying: "This means that we may apply to the population at large statistical measures secured from the study of a sample not with confidence in their perfect stability, but with fairly definite knowledge of the margin of error involved in thus extending our results." (xx) Unfortunately, however, we cannot be certain that the conditions of simple sampling have been fulfilled in an actual case. In fact, in the field of agricultural economic data, the circumstances in connection with the selection of samples of much of the data make it plainly evident that the conditions of simple sampling could not have been met.
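The three kinds of series named above can be compared by a small experiment on counts of an event in samples of 100 individuals. The sketch is illustrative only and rests on assumptions of our own: the chances used (0.5 throughout; 0.3 or 0.7 by sample; 0.1 and 0.9 within each sample), the number of samples, and the function names are hypothetical. In each case the average chance is 0.5, so that the standard deviation of simple sampling is the square root of 100 x 0.5 x 0.5, or 5.

```python
import random
import statistics

N = 100  # individuals in each sample


def bernoulli_chances(rng):
    """Same chance for every individual in every sample (Bernoullian series)."""
    return [0.5] * N


def lexis_chances(rng):
    """Chance differs from sample to sample, uniform within a sample (Lexis series)."""
    p = rng.choice([0.3, 0.7])
    return [p] * N


def poisson_chances(rng):
    """Chance differs between individuals, same mixture in every sample (Poisson series)."""
    return [0.1] * (N // 2) + [0.9] * (N // 2)


def observed_sd(make_chances, n_samples, rng):
    """Standard deviation of the number of events across many samples."""
    counts = []
    for _ in range(n_samples):
        chances = make_chances(rng)
        counts.append(sum(rng.random() < p for p in chances))
    return statistics.pstdev(counts)


rng = random.Random(0)
sd_bernoulli = observed_sd(bernoulli_chances, 2000, rng)
sd_lexis = observed_sd(lexis_chances, 2000, rng)
sd_poisson = observed_sd(poisson_chances, 2000, rng)

print(f"Bernoullian: {sd_bernoulli:.2f}  Lexis: {sd_lexis:.2f}  Poisson: {sd_poisson:.2f}")
```

Under these assumptions the Lexis series should show a standard deviation well above 5 and the Poisson series one below it, in agreement with the statements quoted from Yule.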
It is therefore extremely important that the research worker in this field be alert to these situations, in order that he may avoid the error of employing with confidence the mathematical formulae for probable errors (applicable where the conditions of simple sampling have been met) to situations where these conditions have not been met.

(x) Yule distinguishes between these three requirements of simple sampling by saying that the first means that the dice shall always be thrown in the same way, and the second, that the dice must be all exactly alike. The third requirement would be violated if some of the dice were left down after each throw. See Secrist, pp. 400-406, 1925 edition.
(xx) Mills, page 553.

If a succession of samples is taken from a universe, the means from the samples will ordinarily form a frequency curve of the "normal" type. The mean of this frequency curve of means will very closely approximate the actual mean of the whole universe. It will do this even though the phenomena themselves do not yield an essentially normal curve. The foregoing is what furnishes the basis for using Standard Error and Probable Error of the Mean as a measure of sampling error. A given figure for Probable Error of the Mean is a statement that the mean of the sample taken stands a fifty-fifty chance of falling within this distance, above or below, of the mean of the universe from which it is taken. A larger Probable Error means that the mean of the sample is likely to lie farther away from the true mean.

Throughout this discussion on sampling, it has frequently been necessary to refer to the "observed standard deviation", and a question may arise as to whether this means the standard deviation computed for the one sample taken, or whether it refers to the standard deviation observed from successive samples. Theoretically, it refers to the latter condition; practically, to the former.
A simple derivation of the formula for the standard error of the mean will serve best to demonstrate what this means.

Let us assume $n$ samples to have been drawn from a universe and the standard deviation of each to be represented by $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n$. The standard deviation of the sums of the samples will be

$$\sigma_s = \sqrt{\sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \cdots + \sigma_n^2}.$$

The standard deviations of each sample will tend to be the same in the long run and identical with the standard deviation in an indefinitely large sample, drawn under the same conditions.* Therefore, since all tend to be the same, we write:

$$\sigma_s^2 = n\sigma^2, \qquad \sigma_s = \sigma\sqrt{n}.$$

This is the standard deviation of the sums, and the standard deviation of the mean of these $n$ samples will be $1/n$-th of this amount, or $\sigma_M = \sigma\sqrt{n}/n$, which may be simplified by the following steps:

$$\sigma_M = \frac{\sigma\sqrt{n}}{n} = \frac{\sigma}{\sqrt{n}}.$$

Strictly, the standard deviation appearing in the numerator of the formula is the standard deviation for the fundamental distribution, but as that value is unknown, the value of the standard deviation observed for the single sample is taken in its stead, on the ground that it approximates the standard deviation of an indefinitely large sample.** With the formula thus transformed for practical purposes from the use of a standard deviation for the fundamental distribution to that of the single sample which is being analyzed, it also becomes necessary to replace $\sqrt{n}$, where $n$ is the number of samples, by $\sqrt{N}$, where $N$ is the number of cases in the single sample. The formula actually used for the standard error of the mean is

$$\sigma_M = \frac{\sigma}{\sqrt{N}},$$

where $\sigma$ is the standard deviation of the distribution in the one sample.*** The probable error of the mean is .6745 times the standard error.

Obviously the less the dispersion, measured by Standard Deviation (square root of the mean of the squares of deviations from the mean), the less the Standard Error. That is, the more bunched the distribution around the mean, the less the Standard Error.
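The formula for the standard error of the mean, and the factor .6745 for the probable error, can be put into a short computation. The sketch below is illustrative: the eight observations are hypothetical, the function names are our own, and the standard deviation is taken with divisor N (the root-mean-square deviation), which is how the text describes it.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Standard deviation of the one sample divided by the square root of N."""
    sigma = statistics.pstdev(sample)  # divisor N, root-mean-square deviation
    return sigma / math.sqrt(len(sample))


def probable_error_of_mean(sample):
    """Probable error is .6745 times the standard error."""
    return 0.6745 * standard_error_of_mean(sample)


# Hypothetical observations; their mean is 5 and standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

se = standard_error_of_mean(sample)
pe = probable_error_of_mean(sample)

# The same distribution with four times as many cases.
larger_sample = sample * 4
se_larger = standard_error_of_mean(larger_sample)

print(f"N = {len(sample)}: standard error {se:.4f}, probable error {pe:.4f}")
print(f"N = {len(larger_sample)}: standard error {se_larger:.4f}")
```

Since the error varies inversely with the square root of N, four times as many cases with the same dispersion halve the standard error rather than quartering it.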
The Standard Error also decreases in inverse proportion to the square root of the number in the sample.

* Yule, p. 844.
** Crum and Patton, p. 209; Yule, p. 353.
*** Mills, p. 555.

There is clearly no evidence in these measures as to the universe from which the sample is taken except that of the sample itself. The evidence is all internal. On two assumptions, however, first, that there is considerable uniformity in the world, and second, that the sample is representative and adequate, it can be reasoned that the frequency curve and the mean of the sample should fairly closely approximate the frequency curve and mean of the universe itself. The definite figures for the extent to which they do this, which are derived from the formulae, are however hypothecated in the three further conditions of simple sampling laid down by Yule. The requirement that the sample be not too small (Yule, p. 353) must not be overlooked at this point either. What figures indicate the extent to which the mean of the sample describes the mean of the universe from which the sample is taken when Yule's three conditions are not fulfilled, or the sample is too small, or when it is derived according to each of the 13 procedures in sampling outlined above, we cannot state at present with any degree of precision in most cases. Yule himself says: "It is evident that these (three) conditions very much limit the field of practical uses of an economic and sociological nature to which (the) formulae can apply without considerable modification." (p. 262). Neither do these measures take care of the errors in the observations themselves.

The application of sampling principles to the various methods of sampling will be taken up in more detail under each of the methods of collecting data. But a few general observations are needed at this point.
First, many of the studies of farms, especially those of the route type, and of marketing business units, do not include enough cases to make frequency curves, statistical averages and other similar summary expressions of much scientific value. The methods of analysis to be used are those of informal statistics and the case method, to be discussed later. The same statement must in general be made when the individual units for study are chosen as "typical", or so as to constitute a typical range or array of conditions (Nos. 10 and 11).

There is, however, a definite relation between sampling method and the universe to be studied. It is entirely logical and scientific to plan to study only certain types of farms or families or marketing units in an area, and to select those types carefully. The summary statements obtained in such a case, however, must be clearly made to apply only to those types designated for study, not to the area as such. Besides, there is always the possibility that the types omitted may be needed to explain some of the circumstances of the types selected for study. The small or off-the-road farms omitted from a survey may be needed to explain some of the conditions prevalent in the area as a whole.

When areas are selected as "typical", the sample may be adequate for these areas, but the areas themselves are not likely to represent the whole region properly. The Probable Error of the Mean may be a fitting measure of the immediate universe of the survey, but not for the whole region.

The requirement that the sample be drawn so as not to be taken only from certain parts of the universe is a difficult one to meet in geographical samples in which certain areas are chosen for study. As more farms are included in one area, new conditions may be reached - poorer soil, or smaller farms, or a new foreign stock.
This difficulty and others pointed out above make it highly necessary to define the universe very carefully in advance of the study.

If Objective No. 1, describing conditions as they are in the whole universe, is in view, sampling methods No. 1, No. 2, No. 7 or No. 8 would almost seem necessary in any study where geographical conditions vary much. Any of these may fail to meet the conditions of simple sampling, although No. 7 is more likely to fail than Nos. 1 and 2, and No. 8 definitely fails in part. No. 9, stratifying the sample according to conditions, may meet the requirements of sampling in considerable part; but the dangers of bias with this method are very great. All methods of obtaining cases which result in certain types being more frequently included than others - especially methods Nos. 10, 11, 12 and 13, and to a lesser degree Nos. 4 and 5 - give results which do not describe the universe. Moreover, it is not usually possible to determine with sufficient accuracy the extent to which the sampling is biased.

If, however, the objective is No. 4, it is entirely possible that such methods of sampling as Nos. 10 to 13 may serve. They will even serve for many specific purposes included under Objective No. 3 - that is, there are many sorts of relationships that can be studied by appropriate methods even if the sample is not exactly representative. No doubt, also, a high degree of representativeness is not so much needed for Objective No. 2, measuring changes.

(e) Sources of Data and Methods of Securing it.

(1) The Survey Method.

The subject of the use of the survey method has been handled in the following manner. First, Professor Misner of Cornell sent out an inquiry to all the experiment stations to discover present practices in survey-taking. This inquiry is summarized in the pages following. Professor Misner has also presented a statement of what seems to him the best survey procedure in farm management research.
Since its first survey of Tompkins County in 1908, Cornell has taken a total of 93?3 farm business survey records, over a tenth of the total number in the United States, and has published 32 bulletins based on such records.

Second, Mr. Hawthorne of the Bureau of Agricultural Economics has made certain analyses of data of recent surveys, particularly of two in Ohio and North Carolina in which he participated. He has also assisted the Committee by giving it access to his records and analysis of farm business surveys as reported in the 1925 Yearbook (Table 652) and kept up-to-date in his files.

Third, frequency analyses were made of the answers to certain types of survey questions to see what evidence this would furnish as to the types of question which do and do not secure usable results.

Present Use of Survey Method

Professor Misner has summarized the present use of the survey method as follows:

Of 123 farm management projects reported to the Social Science Research Council in 1927, 51 percent were classed as using the survey method, 40 percent as using route methods, and 9 percent as using secondary data. (Survey method was defined as "the obtaining of information by personal visits, with the use of a definite schedule. The collection of information from voluntary or supervised records when practically no information was obtained in addition to that contained in the records, was not classed as survey method"). Of the 80 marketing projects, 73 percent were using survey methods, 3 percent route methods, and the remainder secondary data or data of private or public records.

Of the funds available for farm management research, 45 percent was used in survey studies, 48 percent in route studies, and 7 percent in the others. Of the funds available for marketing research, the comparable figures were 73, 2 and 25 percent respectively.
Of 50 survey studies reported as being planned by 44 experiment stations, 27 were in the field of farm management, 10 in marketing, 5 in rural sociology, and 8 in land economics, taxation, credit and other miscellaneous fields.

Report upon the Inquiry

Preliminary work: The following phases of such work were reported:

1. Review of related studies and literature on the special problems involved.
2. Analyzing all existing data.
3. Discovering supplementary sources of data.
4. Stating objectives and special problems clearly.
5. Analyzing these problems qualitatively.
6. Planning types of analyses to be made.
7. Listing information needed.
8. Preparing schedule and testing it out in the field.
9. Deciding upon sampling method.

Order of questions on schedule: The conventional farm business survey order as outlined in most of the replies was the following:

1. Name and address first.
2. Order in which the cooperator visualizes his business.
3. Not necessarily in the order of enumeration.
4. Logical sequence with reference to schedule as a whole.
5. Questions about related information placed together.
6. Supplementary information at end of schedule.

The following other suggestions were made by workers with experience with other types of surveys:

1. Put simple questions first.
2. Place first the general and less personal questions, and those not dealing with money.
3. Questions which stimulate interest should come first.
4. Questions likely to be resented should come last.

Arrangement of material on blank:

1. Arrange so that the blank will be as convenient as possible for the field work as well as for the checking and summarizing.
2. Have a number of central headings designated by numbers with the items in tabular form and separated into blocks by heavy ruling.
Supplementary sets of instructions: There were 21 replies to this question, of which 9 said that oral instructions were given to the field men and 12 said that written instructions of varying nature were given field men in addition to the oral drill work that they commonly considered necessary to prepare inexperienced men for collecting data in the field. Some replied that while no written instructions had been given heretofore, it was a mistake and that in the future it was proposed to provide each enumerator with written instructions.

Questions not best to include:

1. Those involving much calculation on the part of the cooperator.
2. Those about his financial status, mortgages, indebtedness, charities, personal habits and other questions which are of ultra-personal nature.
3. Questions to which definite answers can not be made, for which no reasonable answer exists, or which involve estimates for which the informant has no basis to guide him in making the answer.

Some expressed the opinion that a tactful field man can get answers to most personal questions as long as they are sensible and capable of being answered.

Both office and field forms: The 34 replies tabulated indicate that 21 have printed field and office forms and 6 field forms only, 7 not always both. The value of many of the survey records increases with age, so that it is desirable to have, in some instances, printed office sheets on heavy paper to which the data may be transcribed and kept as a permanent record that may be used many years later.

Contacts with cooperators: Some make soliciting trips and arrange for a visit with the cooperators at meetings. Others pay the cooperators a surprise visit. Some institutions send out introductory letters prior to visits. Some use the telephone or other means of making definite appointments with cooperators. The desirability of these methods will depend upon the circumstances.
Size of field crew: The answers were to the effect that one person should supervise the work in the field and check the records as they are brought in, when enough enumerators are employed to require or justify this. If only two or three men are in the field, they may check the work of each other. Ordinarily two men to an auto was considered a most efficient crew for field work. (Cornell recommends three or four).

Preparation of crew for work:

1. Have them prepare and work up summaries of similar data.
2. Have them study publications, tabulations and completed records of similar work.
3. Drill them on method of approach in asking questions.
4. Have them listen to an experienced person taking a record in the field and work with him in completing and working up same.
5. Have them take a record in the field and receive the criticisms of others who listen, including an experienced person who is supervising the work.

Checking in the field: It was generally considered that records should be checked for missing information and errors each day, in a central office, by the leader who is to be responsible for the preparation of the results for publication. A memorandum should be attached to each record indicating errors and omissions so that the enumerator may have the same next day. He will then quickly discover the questions he is getting wrong or omitting and may correct any error or supply any missing information when passing the cooperator in his travels the next day. Promptness in checking records is very desirable as a way of increasing the accuracy.

Sampling: The answers to this question show that little real consideration is given to it at most stations. A few of the significant answers were the following:

One man attempts to get records for one-fifth of the farms in the county in sections midway between the best and poorest. Another prepares a list of cooperators and takes every fifth man on the list.
Another simply aims at 200 or more records. In contrast to this, one takes 20 to 30 records in each of many areas. One person endeavors to obtain enough schedules so that in 6 to 10 intervals the data will show normal frequency distribution.

In some instances comparisons are made with census data, assessors' information, previous surveys or any other existing data that will help in indicating whether the sample is fairly representative of the area to be studied. In most cases an attempt is made to avoid actually "selecting" the cooperators. It is considered desirable at most institutions that hand-picked cooperators be avoided, as these are likely to comprise the best only, and not give a fair sample.

Selecting areas: The answers to this question were even more unsatisfactory. Most simply said that an effort was made to get "typical" or "representative" areas. Another practice is to make surveys where they are asked for or needed.

Disadvantages of survey method: Those mentioned were:

1. Not enough detail for some types of problems.
2. Not accurate enough for close analysis.
3. Not large enough sample for some purposes.

Accuracy of data: Mention was made of checking of survey records against route records, creamery and elevator records and store records and finding them reasonably accurate; but no actual evidence was furnished although it was asked for. As to bias in answers to survey questions, mention was made of the danger of leading questions, and of the bias of some field men. One person who had worked on cost-of-production surveys reported that certain of the field men nearly always brought in the low-cost farms and certain others the high-cost farms. Where considerable experience has been had in survey work, the feeling seems to be that field men can be sufficiently well drilled to obtain detailed information accurate enough for most practical purposes.
Whether this feeling is justified of course depends upon the sorts of purposes that are deemed practical. For example, will surveys provide data accurate enough to show the most profitable rate of feeding?

Farm Management Surveys. (x)

One of the purposes for which surveys are frequently made in farm management research is to obtain a picture of the farming of a region - the size of farms, the income and expenditures of the farmers, the different types of farming represented in the region and so forth - to show the general agricultural economic situation in the region selected. Another use to which farm management surveys may be put is to show features of farm organization for different systems of farming in the same area. When such is the purpose, the make-up of the schedule and the selection of the cooperators will differ from what they are when the first-mentioned purpose is intended.

The Blank

Efficiency in the survey method is gained by the use of well prepared printed field sheets. Such field sheets save time and result in greater accuracy. Mimeographed field sheets are less costly when small numbers of records are to be taken, but the bulk of the mimeographed sheets and extra space required for filing are sufficient disadvantages to justify the expense of printing the field sheets whenever a few hundred records are to be taken. Occasionally the same form is used for both office record and field sheet. This however has the disadvantage of not furnishing enough free space for making notes or calculations when the record is being taken.

Ordinarily records taken in the field should be copied at once on forms that are used in the office. It is very important that these forms be printed on a heavy linen ledger paper to stand the wear and avoid wrinkling that may come from use in the office. Office forms should be in much detail, and provide for plenty of information.
In preparing the blank, sometimes forms may be cut and pasted together, and several trials and adjustments should be made until the best form and arrangement is reached. Some time spent in arranging the blank, deciding on the width of the columns to be allowed, and the arrangement of the information will facilitate transcribing, completing and tabulating the record. Copy sent to the printer should be of exact size required, should be written with india ink on one side only of heavy manila paper, and spaces indicated exactly as desired on the printed form. Copy for each page of the form should be prepared separately. Attention to some of these details will result in a good blank.

(x) Abstracted from paper by Professor E. G. Misner, Cornell University, avoiding duplication with report upon the inquiry.

Maps should be provided for all areas surveyed. Soil maps showing the contour lines are most useful in farm management surveys. On such a map can be left a permanent record of the location of each schedule included in the study.

Organization for Field Work

The number of enumerators desirable in the field party or crew depends upon the type of schedule, the total number of cooperators to be visited, the distance between cooperators, or the proximity of the farms to some convenient community center. Ordinarily three or four men to a car prove an efficient field crew for most types of farm management survey work. When studies are made of businesses that are widely separated and require more travel between, one man with a car is more efficient.

Whenever a large field party is to be used in a survey, it is very important that the enumerators be well drilled and trained in the taking of records before starting work. This means as a very minimum that they should have taken at least one record in drill work, made the transfers to the office sheet, and completely worked up the office form so that they will be familiar with what is desired.
Many enumerators have done survey work in the past with even less preparation than this, with such results as one would expect. The one who is to write up the results or who has had experience in survey work, or preferably both, should be in charge of the party and check the work. The drill work should take up such points as the way to approach the cooperators in an easy solicitous style, the influence of the method of asking questions on the bias of the answer, and ways of detecting errors in making enumerations.

Records should be checked at once for omissions and mistakes. When more than four men are in the field, one man should usually stay in the office, plan the work, check the records and keep things moving. If the information needed is not obtained, the enumerator should be sent back to the cooperator the next day, provided the omission or error is of sufficient importance to justify the expense and trouble. If it is only a minor point, it may be obtained by correspondence. Some system of checking records in the office, such as attaching a sheet of paper with a notice indicating the omission or errors in the field sheet, and checking these off one by one with the enumerator as the corrections are made, should be followed.

Suggestions about Asking Questions

The cooperator should be approached in an easy style. Every effort should be made to get him in a state of mind and so situated that good work may be done. If others, such as neighbors, are present, it may be unwise to proceed with the record. Generally it is desirable to avoid the family circle, although often other members of the family may be able to assist in giving more accurate information. Usually, however, they interfere with the facility of taking the record and its accuracy.

Enumerators should take plenty of notes. A definite system of signs and abbreviations should be adopted to help the enumerators proceed more rapidly with the questioning. In all survey work, the rules of good bookkeeping should be observed. Decimals and figures should be aligned. This is mostly a matter of habit and can and will be followed on the field form as well as in books or on office forms by those so inclined. Persons careless with figures had better choose other fields of work.

The order of questioning need not necessarily be the order on the blank. A good enumerator will soon develop the knack of taking down the information from much of the conversation that is carried on. In fact, the more nearly one can get away from the cold formal method of blunt questions, the more successful one will be in field work of this kind. The best time to approach a farmer is early enough in the morning so that he has not started the morning field work, or just after dinner.

Questions should be stated directly and specifically to speed up the work. The enumerator should assure himself as much as possible that correct answers are being given, that the farmer is neither trying to make too good an impression, nor to understate his earnings.

Tables of supplementary information should be carried in the field to assist enumerators - silage tables with the capacity computed for various silos, and so forth.

Livestock records should all be made to check as to numbers. However, one of the most important things influencing the accuracy of a survey record is the judgment which the enumerator uses as to the amount of time to be devoted to different parts of the record. It is unwise to spend too much time on inventories. One had better spend that time in obtaining expenses accurately from books or in getting the receipts more accurately - items which will have a decided influence on the net returns of the business.

When a study is being repeated in the region, ending inventories for the previous year will serve as the beginning inventory of the succeeding year's record. This makes it necessary to obtain only the end inventory in a continued study.
The depreciation on horses or machinery as well as the changes in the inventories of cattle may be obtained in this manner. Market value changes should be obtained, and indicated separately on the record, but not included in the computation of the net returns for the business as a whole or for the enterprise.

Prices and values should be obtained for each farm. The supervisor of the party should watch out for the tendency on the part of some of the enumerators to get prices and values for one farm and insert this on all succeeding records. This destroys one of the main purposes of the study, namely, to get a mean value for various items on the schedule. To do this accurately, replies from all cooperators should be obtained. The enumerator should guard against the thwarting by the cooperator of his efforts to obtain such figures. He should stick until the last goose is hung, and avoid making any leads or suggestions that will be likely to result in an answer which was really suggested by the enumerator. (x)

Organization for Office Work

When the records are finished in the field, the next step is to check in the office the parts of the record which are necessarily compensating. If the cost of a new barn is included under farm expense, part of it if not all should appear as an increase in capital. If, on the other hand, a barn was destroyed by fire during the year, the insurance received may appear under miscellaneous receipts, but it should also appear as a decrease in the farm capital. This means that usually the person in charge of the survey must check the records in the office for offset items of this nature. If any of the office force are sufficiently familiar with the work and the nature of the items to be checked, it may be desirable to trust it to them.

After the work is thoroughly and completely checked in this manner, the extensions should be made. It is desirable to use a different colored ink for such work.
This means that less of the checking will be overlooked. The work of one person in the office should be checked by another. It is always a good plan to keep the less experienced clerks at calculating and trust the checking only to those who are more experienced. Some have found it best in farm business analysis work not to complete the financial summary of the record until some sections of the record have been tabulated, especially those sections where a slight mistake or an error would directly affect the returns. This may save some time in erasing the figures and making corrected summaries.

Two kinds of checking are always desirable in work of this kind - inspection checking and machine checking. Most persons are fairly good at machine checking. The ability to detect errors by the inspection of the data is less well developed in all of us, but an ability that should be encouraged because it is one of the best means of expediting statistical work. Just running over the data in a thoughtful way will often locate the mistake in a tabulation. Misplaced figures and misplaced decimals are the most frequent of all errors. Many of these can be very easily detected by inspection checking.

Familiarity with short cuts is desirable. Many persons are still calculating interest at 5 per cent by multiplying by .05, instead of the easier and more rapid way of dividing by two and pointing off one place.

Making Tabulations

After the records are completed and ready for tabulation, the first thing of importance is to decide on the records to be included. Any incomplete, freak or inaccurate records should be omitted. Considerable care should be taken in deciding on the schedules to include in the tabulation or final summary.

(x) The question has been properly asked by members of the committee as to whether figures obtained in this way have enough validity to warrant getting them.
Once an undesirable record is included, the averages of various groups in the subsort probably will be greatly disturbed by this one record.

The next step in making tabulations is to decide on the sorts and subsorts to be made if the cross-tabulation method is to be used, or on the items to be punched on cards, if this method is used. Ordinarily most farm management surveys should be tabulated completely in one or possibly two sorts. A trial tabulation should be run before deciding positively on the main sort to be made.

After the sorts and subsorts are decided upon, the tabulation sheets should be prepared or the cards laid out for punching. Prior to the preparation of the tabulation sheets, a list of items to be tabulated should be prepared as a guide. Proper headings for each occurrence should be left on the tabulation sheet. The headings should be in the same order as they appear on the blank. They should be properly ruled and underscored to indicate the columns to which each heading applies. Double columns for loss or gain, or for plus or minus quantities, should be carried to facilitate the summarizing of the data. A good margin of at least two inches should be left at the left of each tabulation for binding. Tabulation sheets should ordinarily not be spliced, but the tabulation carried over to a new sheet on which the heading should be repeated. In drawing off the figures, the transfers should be aligned in neat, clear figures. Carelessness in tabulating is costly and is to be condemned from every standpoint. When the value is zero, a cipher should be put in the column; when there was no report, it should be so indicated by a dash.

Report of Tests of Survey Material.
Sampling

Table I following shows what is generally recognized, that the farms included in surveys tend to be larger than the average. In the South Central group, this is accounted for partly by the including of plantations as farms in many cases; in the Mountain group, partly by the fact that ranches have been selected for survey study more often than farms. (Sugar beet farms have been omitted from the table). The differences in the other sections are about the usual ones. The New York state surveys average a fourth larger than the census figures.

Table I. Comparison of Averages of Survey Farms with Those of the Census for the Same County at Nearest Census Date.

                     : Number : Number      : Average Size of Farms
                     : of     : of          :
                     : Farms  : Surveys (1) : Surveys (2) : Census (3)
Total                : 36879  : 379         : 223         : 148
New England          :  3317  :  49         : 151         : 105
Middle Atlantic      :  8595  :  49         : 120         :  94
East North Central   :  5252  :  78         : 141         : 110
West North Central   :  6504  :  76         : 347         : 265
South Atlantic       :  5474  :  50         : 149         :  86
East South Central   :  1215  :  11         : 267         :  97
West South Central   :  1173  :  11         : 316         : 117
Mountain             :  2736  :  27         : 315         : 269
Pacific              :  1713  :  28         : 201         : 194

(1) Surveys of highly specialized farming, as of sugar beet growing in an irrigated section, were omitted.
(2) U. S. Dept. of Agriculture Yearbook 1925, Table 652.
(3) 1910 Census compared with surveys in 1910-15. 1920 Census compared with surveys in years 1918-20. 1925 Agricultural Census compared with surveys in years 1921-24. Census figure used was the average size of farm of the county in which the survey was taken.

Table II. Comparison of Values of Farm Property of Survey Farms with Those of the Census of the Same County at the Nearest Census Date.

                     : Years     : No. of  : Average Value of All Farm
                     : Inclusive : Surveys : Property - Per Farm
                     :           :         : Survey    : Census (x)
New England          : 1907-1915 : 35      : $ 7,986   : $ 4,742
                     : 1918-1920 :  2      :  14,517   :   9,332
                     : 1921-1924 : 12      :  11,908   :   8,620
Middle Atlantic      : 1907-1915 : 37      :   9,584   :   5,910
                     : 1918-1920 : j2      :  18,188   :  12,918
                     : 1921-1924 :  5      :  14,044   :   8,679
East North Central   : 1907-1915 : 48      :  21,877   :  11,814
                     : 1918-1920 : 15      :  26,757   :  22,284
                     : 1921-1924 : 20      :  27,954   :  13,424
West North Central   : 1907-1915 : 54      :  23,531   :  14,937
                     : 1918-1920 :  6      :  52,022   :  35,452
                     : 1921-1924 : 16      :  41,841   :  27,408
South Atlantic       : 1907-1915 : 13      :  12,925   :   3,877
                     : 1918-1920 : 19      :  15,179   :   4,809
                     : 1921-1924 :  9      :   9,159   :   5,408
East South Central   : 1907-1915 :  3      :  33,655   :   4,388
                     : 1918-1920 :  6      :  28,163   :   3,758
                     : 1921-1924 : 1’3     :  16,469   :   5,994
West South Central   : 1907-1915 :  7      :  18,811   :   5,751
                     : 1918-1920 :  2      :    -      :    -
                     : 1921-1924 :  2      :  27,793   :   6,521
Mountain             : 1907-1915 : 19      :  13,793   :   8,807
                     : 1918-1920 :  5      :  38,008   :  23,600
                     : 1921-1924 :  4      :  26,792   :  10,921
Pacific              : 1907-1915 : 10      :  16,928   :  11,057
                     : 1918-1920 :  1      :  51,200   :  51,810
                     : 1921-1924 :  5      :  22,116   :  12,254

(x) Same comparisons as in Table I.

In Table II, the same comparison is made in terms of values of farm property, with sub-groupings according to the nearest census date so as to make allowance for changes in the price level. The discrepancy is much more pronounced. (That for the 1907-15 surveys is partly explained by the fact that most of the surveys were made in 1914 and 1915, and land values had risen a good deal since 1910). This suggests that the farms covered in the surveys in many areas are better farms with probably a larger percentage of improved land.

Table III gives the same data in more detail for three such surveys. In Clinton County, Indiana, the survey and census figures were closely alike in most items. In the Virginia county, the differences are very large for most items.
Difference in system of farming seems to figure in the Delaware comparison; although the survey farms are practising the more extensive farming, their land is the more valuable.

Table III. Comparison of Survey Data with Census Data for the Same County in the Same Year.

                          : Clinton Co., Ind. : Frederick Co., Va. : New Castle, Del.,
                          :                   :                    : Middletown Area
                          : 1919              : 1919               : 1924
                          : Survey : Census   : Survey : Census    : Survey : Census
No. of farms              :    100 :   2399   :    125 :   1725    :     55 :   1957
Acres per farm            :    130 :    104   :    159 :    133    :    215 :    107
Capital per farm - Total  : $32318 : $27037   : $33676 : $11151    : $22832 : $12785
  Land                    :  25573 :  21074   :  25301 :   7433    :  13360 :   5715
  Buildings               :   3129 :   3054   :   4010 :   2224    :   4944 :
  Machinery               :    484 :    976   :    588 :    545    :   1105 :    94?
  Livestock               :   2575 :   1933   :   1572 :    959    :   2432 :   1044
  Farm supplies           :    750 :     -    :    334 :     -     :    649 :     -
  Cash to run farm        :    106 :     -    :    821 :     -     :    342 :     -
Yield per acre            :        :          :        :           :        :
  Corn, bu.               :     51 :     47   :     34 :   32.3    :     32 :     30
  Wheat, bu.              :     42 :     42   :     14 :     12    :     23 :     21
  Oats, bu.               :     20 :     20   :     22 :     17    :      - :
  Rye, bu.                :     15 :     15   :     11 :      8    :      - :
  Hay, tons               :    1.0 :    1.1   :     .9 :    1.0    :    1.1 :    1.2
Acres in -                :        :          :        :           :        :
  Corn                    :     45 :     31   :     16 :     12    :     38 :     15
  Wheat                   :     14 :     15   :     23 :     15    :     67 :     19
  Oats                    :     22 :     15   :      1 :      2    :      - :
  Rye                     :      2 :      1   :      1 :      3    :      - :
  Hay                     :     29 :     14   :     15 :      9    :     25 :     15

The most interesting comparison is that between the 1115 farms recently surveyed in North Carolina and the census average for the state in 1925 shown in Table IV. This survey was planned to sample the whole state. First the state was laid off in four districts, the Tidewater, the Coastal Plain, the Piedmont and the Mountain regions. Then either three or four counties were marked off in each region as presenting the best examples of certain sets of farming conditions. Within each county, the sampling was done by sectioning.
The survey farms are almost twice as large in total acres, although only a half larger in crop acres. In value of real estate per acre, the two are nearly identical, but the survey valuations of buildings were considerably higher per acre. This is probably due more to the differences in the two methods of getting the valuations than to anything else.

Table IV. Comparison of 1928 North Carolina Survey Data and Questionnaire Data with State Census Averages.

                     : Federal   :         :
                     : Census    : Survey  : Questionnaire
Number of farms      : QBQy884   : 1115    : 380
Crops                :  24       :  37     :   -
Pasture - cleared    :   5       :   9     :   -
Pasture - wooded     :   4.8     :  10     :   -
Woods                :  30       :  54     :   -
Other land           :   6.5     :  18     :   -
All land             :  65.6     : 128     : 157
Land                 : $ 2,421   : $ 4,600 : (
                     :           :         : ( $ 6,931
Buildings            :     845   :   2,059 : (
Livestock            :     273   :     404 :     514
Machinery            :     163   :     190 :     348
Feeds                :      -    :     167 :     344
Total                :   3,704   :   7,430 :   8,137
Farm Expenses:       :           :         :
  Feeds              :  80 (1)   :  71     :  82
  Fertilizer         : 134 (1)   : 203     : 188
  Labor (Wages)      : 116 (1)   : 124     : 296

(1) Per farm for the farms reporting such expense.

The comparison in Table V is between the survey averages and the county averages for each of the four regions. (The questionnaire data will be explained later). The counties chosen average 65.9 acres per farm as compared with 65.6 for the state as a whole. Their land averaged $51.53 per acre in value as compared with $49.80 for the state. In the Mountain region, the survey farms were the more valuable per acre, and in the Tidewater region the reverse was true. The difference in size of farms was greatest in the Coastal Plain Region.

Table V. Comparison of Survey and 1925 Census Real Estate Values for the Same Counties by Districts of North Carolina.

                      : Number of       : Acreage         : Value of        : Value of
                      : Farms           : per Farm        : Real Estate     : Land
                      :                 :                 : per acre        : per acre
Districts             : Survey : 1925   : Survey : 1925   : Survey : 1925   : Survey : 1925
                      :        : Census :        : Census :        : Census :        : Census
Total                 :  1115  : 42541  :  128   :  65.9  : $52.10 : $51.53 : $35.94 : $38.88
Coastal Plain Region  :   294  : 15382  :  136   :  56.6  :  65.70 :  65.95 :  45.06 :  45.94
Mountain Region       :   281  :  9129  :  125   :  76.4  :  40.80 :  54.45 :  30.02 :  26.69
Piedmont Region       :   311  :  9912  :  109   :  74.2  :  55.85 :  50.71 :  36.34 :  37.40
Tidewater Region      :   229  :  8118  :  150   :  61.5  :  43.82 :  51.45 :  30.81 :  38.78

(x) This survey was conducted jointly by the North Carolina Tax Commission, the North Carolina Experiment Station and the U. S. Bureau of Agricultural Economics.

In Table VI is presented a frequency distribution of farm acreages according to the three systems of collecting data. It is the small farms that are omitted, especially the very small farms, in the surveys.

Table VI. Frequency Distribution of Size of Farms in North Carolina According to Three Methods of Collecting Data.

                : Number of Farms                    : Percentage
                : Census  : Survey : Questionnaire   : Census : Survey : Questionnaire
Total           : 283,284 : 1115   : 380             : 100    : 100    : 100
Under 20 acres  :  55,047 :   41   :  10             :  23.2  :   3.7  :   2.6
20 - 49         :  93,817 :  214   :  43             :  33.0  :  19.2  :  11.3
50 - 99         :  58,129 :  333   :  91             :  24.0  :  29.5  :  23.9
100 - 174       :  37,611 :  313   : 125             :  13.2  :  28.1  :  33.1
175 - 259       :  10,543 :  115   :  54             :   3.7  :  10.3  :  14.2
260 - 499       :   5,525 :   59   :  45             :   1.9  :   5.2  :  12.1
500 and over    :   1,710 :   30   :  10             :    .5  :   2.8  :   2.5

This raises the significant question as to whether the census definition of a farm is suited to the purposes of much of the analysis of agricultural problems. It is aimed largely at securing a full count of the crop acreages and production, as no doubt it should be, rather than at describing farming itself.
In most of these surveys, no conscious effort was made to get farms of larger than average size. Instead, everything that looked like a real farm was taken as it was reached. The farms omitted were mostly those of persons who earn much of their living at something else. In the mountain regions, many of such families were living on a largely self-sufficing basis; in the Southern states, largely in the same way, or by occasionally working out. The census frequency distribution therefore includes a large group of people who do not stand as farmers in the eyes of those familiar with the problems of farming. The survey distributions may much more nearly approximate true agriculture.

The lesson to be drawn from the foregoing is not that the survey distributions are not fair samples, but that we do not know in most cases whether they are or not. A more conscious effort must be made in advance of the survey to define the universe exactly, to say for each survey exactly what shall be included as farms and what not, and to publish these specifications in the reports of the surveys along with information to show something about the farms that have been omitted. It would be well to collect a little information about the farms that are omitted - acreage, principal sources of income, both from the farm and outside, etc.

The conclusion is obvious from the foregoing discussion that present sampling procedure in survey-taking is not altogether satisfactory. The method of working out from some central point until a certain number of records has been taken results in getting too few farmers living remote from town. It also raises uncomfortable questions as to whether to work another day or two and reach out into conditions apparently different from those encountered thus far in the survey. Likewise the method of following each of the main roads out from town fails to get enough of the farmers living on cross roads.
There is always danger in omitting farms which do not seem to promise a good record or which are abnormal in any particular.

The proper sampling procedure depends upon the objective of the survey, principally upon whether its purpose is to obtain data describing conditions in a large area, or instead, to study the relationships existing in a certain type of farming. If the objective is the former, it will be well to take definite geographical units such as townships. If a larger area than a single township is wanted, additional townships can be taken according to some checkerboard arrangement. Another possibility is to select sections (640 acres) checkerboard fashion. A still more satisfactory procedure, except that it involves more trouble than the others, is to check off every 5th, 10th, etc., farm from a complete list of farms in the region to be studied.

It is not necessary for such an objective that the study should include every type of farm in the region. It can be definitely agreed in advance that rural mail carriers and others earning most of their living from other than farming operations shall not be included. In studying farm families it can be agreed that bachelors will not be included.

If the objective is of the second sort, namely, to study the relationships in a given system of farming or set of conditions, the important consideration is to locate definitely in geographic terms the area in which this system of farming occurs in the form desired, and then sample this geographic unit. It is entirely proper to mark off a small area on the map and to include every farm in it except those which are discovered in the course of the survey not to be practising this type of farming. When the results of the survey are published, it should be made clear that they relate to the situation as found in this limited area.
There should, however, be enough study of the larger area from which it has been selected so that safe statements can be made as to the more general application of its conclusions.

The method of selecting areas on the basis of some plan of stratification according to conditions is almost necessary in certain types of study; for example, a study of land settlement conditions in a region so large as the Great Lakes states. It calls for a very careful preliminary study of existing data with respect to the region, and probably, in addition, a reconnaissance survey; and all the different conditions and combinations of conditions that occur in the whole area should be listed and placed on the map. Proceeding along these careful lines, it is possible in most cases to sample a large area without any bias, and to include all the different conditions in somewhere nearly the proportions in which they occur in the region. It is not altogether necessary that the proportions shall be the same, but if they are not, an average or summary of the figures must be properly weighted.

Samples need to be large in area studies in order to give all the various conditions existing in them their proper chance of being included. The various tests of sampling, such as probable error of the mean, are not likely to be a very safe guide when applied to most of the data in an area study, because the taking of the cases does not comply with the first condition of simple sampling, namely, that all parts of the record shall be included in each sample. What happens instead is that as the sample is enlarged, the field man reaches out into new sets of conditions.

The practice should be followed more generally than at present of having the field leader test out the sample as to size as the survey proceeds.
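Two of the sampling devices described above, checking off every 5th or 10th farm from a complete list and weighting a stratified summary by the share of each condition in the region, can be sketched in Python. This is only a minimal illustration under stated assumptions: the function names, the list of farms, and the stratum figures are hypothetical and are not drawn from any survey discussed here.

```python
def systematic_sample(farms, step, start=0):
    """Check off every `step`-th farm from a complete list, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return farms[start::step]


def weighted_mean(stratum_means, stratum_shares):
    """Combine per-stratum averages, weighting each by its share of the whole region."""
    total_share = sum(stratum_shares[name] for name in stratum_means)
    weighted_sum = sum(stratum_means[name] * stratum_shares[name] for name in stratum_means)
    return weighted_sum / total_share


# Hypothetical complete list of 50 farms; take every 10th, starting with the 5th.
farm_list = [f"farm_{i:03d}" for i in range(1, 51)]
sample = systematic_sample(farm_list, step=10, start=4)

# Hypothetical stratum averages (dollars) and each stratum's share of the region.
# The sample may over-represent one stratum; the weights restore the true proportions.
sample_means = {"cut-over": 400.0, "improved": 1000.0}
region_shares = {"cut-over": 0.75, "improved": 0.25}
overall = weighted_mean(sample_means, region_shares)

print(sample)
print(overall)
```

An unweighted average of the two stratum figures would give 700, while the weighted figure reflects the fact that three-fourths of the hypothetical region is cut-over land.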
When the records for the 800 families in Vinton, Jackson and Meigs counties in Ohio in 1926 had been worked up in the office, it seemed that too large a proportion of them showed considerably larger cash outgo than cash income. Accordingly 73 of the families were re-surveyed; that is, the same field record was taken back to the farm and checked over item by item. Table VII shows the results of this re-survey. Of the 73 farms, 40 had their balance of income over outgo raised an average of $375. Some of the miscellaneous items were board and lodging, interest on government bonds, dividends from stocks, receipts from operating a bus, notary's fees. The extra poultry receipts were largely from turkeys, mostly omitted in the first survey.

Table VII. Results of Re-survey of 73 Ohio Farms.

                                    Decrease                        Increase
                            No. of    Total                 No. of    Total
                            farms     amount     Average    farms     amount     Average
Cash income                    1       $120       $120        38      $15,059    $396.20
Cash outgo                 [illeg.]     135     [illeg.]       2          285     147.50
Cash income less outgo         1        120        120        40       15,008     375.20

Sources of additional income.

                            No.     Amount
Work off farm               14      $5,960
Insurance                    3       8,230
Pension                      3       1,440
Cattle                       5       1,127
Coal mined on farm           2       1,000
Poultry                      7         759
Misc.                        5       1,068

How a re-survey of the rest of the farms would have changed the totals can only be conjectured. It would appear from the nature of the changes in the 73 that it would have raised the cash receipts of many of the remainder, though probably nowhere nearly so large a proportion of them. No doubt also some of the larger surpluses of receipts over expenditures are due in some measure to omissions on the expenditure side.

The omissions found in the Ohio re-survey were mostly in the group not classified as farm business income, with which most field men have had little experience. No doubt Mr.
Hawthorne's long experience taking strictly farm business surveys may have inclined him to overlook this other group. It must also be added that Mr. Hawthorne is one of the most experienced and most careful field men in the United States. He has worked on many thousands of survey records in the office and knew in advance what errors field men make on the regular farm business schedule. The majority of field men are far less prepared for the regular farm business survey taking than Mr. Hawthorne was for the work on other sources of income.

In surveys of several farm families made by Dr. C. C. Zimmerman in Minnesota in 1925 and 1926, in which total receipts of the farm family from all sources were balanced against expenditures of all kinds, even though for single years, only a small percent showed expenditures appreciably in excess of receipts, although several of the areas were depending upon cash crops to a considerable extent. The balancing was roughly done in the field when the schedule was being taken, and checked over carefully later. Following is a classification of the reasons for the excess of expenditure over receipts in the 56 cases. The excess in most of them was rather small. The agricultural depression figured in many of them:

Farms still in process of development .......................... 10
Interest on heavy mortgages and poor or only fair incomes ...... 10
Crop failure or low prices or both, and inability to reduce
  expenditures to low enough level ............................. 10
Purchase of automobiles, trucks, lighting plants, or house
  improvements, out of one year's income ....................... 14
Wheat crop held unwisely ........................................ 3
Slump in prices of purebred cattle .............................. 3
Extravagant living, not yet adjusted to lower income ............ 3
Miscellaneous ................................................... 3

It seems apparent that the checking of complete farm family expenditures against complete receipts would add to the accuracy of the farm business data and would be highly desirable if it did not make the schedule so long.

No doubt the item which one would expect to get most accurately in a survey is taxes. Table VIII summarizes a comparison of the survey returns with the reports in the auditor's books for the 1115 North Carolina farms. The survey reports run about 5 per cent higher. A total of 405 farmers reported taxes averaging $25.58 more and a total of 224 reported taxes averaging $19.90 lower. It is significant that 233 were too high or too low by 20 percent or more, and 157 by between 10 and 20 percent.

Professor G. W. Forster makes the following report upon these discrepancies:

1. In most instances the farmer reported with his tax the poll tax and dog tax, while the auditor gave us only the tax on property. In checking over the results in certain counties, we found that by eliminating the poll tax and dog tax the differences were reduced from 5 per cent to 2 per cent.
2. The land upon which the farmer reported tax was often different from that reported by the auditor.
3. Occasionally farmers pay two years' taxes at the same time.
4. In some cases the farmers reported the taxes incorrectly. For example, in one instance which I have before me, the farmer reported $133 in taxes when the auditor reported $99. In explaining this, the farmer stated that the $133 was for 1926 and the $99 for 1927.
5. A drainage tax was often reported by the farmer.
6. In a few cases the records were taken from individuals who knew the farm business but were not informed in regard to the tax.

The accuracy of the survey results varies according to the education of the farmers and their practice in the matter of record keeping. In many areas, relatively few keep records of any sort.
In Hillsboro County, Florida, 44 out of a hundred, representing 54 percent of the farm receipts, kept records of receipts and expenditures; in Polk County, Florida, 53 out of a hundred, representing 64 percent of the crop receipts. These farmers were fruit growers. In Frederick County, Va., 28 kept records. These are recognized as exceptionally high percentages.

Table IX. The Deviation of Farmers' Tax Reports from County Auditors' Books, North Carolina

              Average      Farmers' reports below                    Farmers' reports above
              taxes,       No. of   Total     Average   Largest     No. of   Total     Average   Largest
County        all farmers  farms    deviation deviation deviation   farmers  deviation deviation deviation
Total         $101.51       224     $4,459    $19.90    [illeg.]     405     $10,361   $25.58    [illeg.]
Ashe          [illeg.]       22        187      8.50       38         40         726    18.10      118
Bladen         128.11        10        534     53.40      203         10         268    26.80    [illeg.]
Catawba         75.19        23        267     11.60       64         25         204     8.15    [illeg.]
Cumberland     117.19        24        555     23.12      243         24       1,569    65.37    [illeg.]
Davidson        56.42        35        434     12.40       62         31         715    23.06    [illeg.]
Hoke           114.60         5        119     23.90       65          9          79     8.77    [illeg.]
Jackson         68.91        10        108     10.80       20         27         232     8.59    [illeg.]
Lenoir         185.63        10      1,069    106.90      573         48       1,554    32.37    [illeg.]
Macon           62.10         0          0         0        0          0           0        0        0
McDowell        65.01         6         44      7.33       20         32         645    20.17    [illeg.]
Moore           88.30        11        171     15.50       91         13         110     8.46    [illeg.]
[illeg.]       277.30        12        150     12.50       65          6         218    36.30    [illeg.]
Perquimans    [illeg.]        7         83     12.42       31          9         187  [illeg.]   [illeg.]
Person          82.14         9         90     10.00       26         47         925    19.68    [illeg.]
Pender          85.87        23    [illeg.]    17.07      176         50         953    19.06    [illeg.]
[illeg.]      [illeg.]        9        100     11.11       40         14         213    15.21    [illeg.]
Rowan          166.80         8        149     18.62       51         20       1,763    88.15    [illeg.]

There is need for more concern over the accuracy of survey results than is generally felt. The tendency is to assume that since the method has been widely used it is dependable. There is abundant [...] most favorable conditions. Granted that the method is dependable when used by institutions with long experience with it on standardized schedules, this does not mean that it is when used in other institutions or in new fields of research. Inexperienced field men always let many mistakes slip into their records. More attention must be given to drilling field men in advance. They must be watched closely to see that they do not ask leading questions, and that they do not handle the questioning in such a way as to let personal bias influence the answers. The leaders of field crews must check the schedules more rigorously, and compare the returns of the different field men. Doubtful interpretations or applications of questions should be settled once for all as they arise, and adhered to thereafter. Less stress must be placed on number of schedules per day and more on accuracy of results. Some field men become "record hounds". All these precautions are highly important with new types of surveys, especially those relating to interpretation of questions and instructions.

The accuracy of survey data, however, is largely a matter of the nature of the data. The accuracy is relatively high on such items as taxes and major crop receipts and major expenditures, and relatively low on a large list of minor items. No doubt much individual bias enters into questions involving judgment. Consequently, the further discussion of accuracy is best taken up in the next section in connection with the different types of questions.
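The comparison of farmers' tax reports with the auditors' books, as summarized above by the number reporting above and below and by the number off by 10 to 20 percent or by 20 percent or more, can be sketched in Python. This is a hedged illustration only: the function names are hypothetical, and of the four pairs of figures only the first ($133 reported against $99 on the books) comes from Professor Forster's report; the rest are invented for the example.

```python
def classify_deviation(reported, official):
    """Return (direction, percent band) for one farmer's report against the auditor's figure."""
    if official <= 0:
        raise ValueError("the auditor's figure must be positive")
    diff = reported - official
    if diff == 0:
        return ("same", "none")
    pct = abs(diff) * 100 / official
    if pct >= 20:
        band = "20 percent or more"
    elif pct >= 10:
        band = "10 to 20 percent"
    else:
        band = "under 10 percent"
    return ("above" if diff > 0 else "below", band)


def summarize(pairs):
    """Count the reports above and below the books and average the deviation of each group."""
    above = [reported - official for reported, official in pairs if reported > official]
    below = [official - reported for reported, official in pairs if reported < official]
    return {
        "above": (len(above), sum(above) / len(above) if above else 0.0),
        "below": (len(below), sum(below) / len(below) if below else 0.0),
    }


# (reported by farmer, shown on auditor's books); only the first pair is from the text.
pairs = [(133, 99), (50, 50), (45, 50), (88, 80)]

for reported, official in pairs:
    print(reported, official, classify_deviation(reported, official))
print(summarize(pairs))
```

Keeping the direction and the size of each deviation separate mirrors the layout of the county table, in which reports below and above the books are tabulated side by side.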
Analysis of Questions:

There is no doubt that questions can be asked which call for a definite answer that will be very nearly correct. The following questions proved, upon use, to work out this way:

The first payment on his farm by the settler in the cut-over region.
The amount of his cash on hand at time of settlement after making the first payment on the farm.
The amount of money contributed to church during the year.
The amount of the average weekly meat bills.

The first two of these represent events standing out very clearly in the memories of the persons involved. Church contributions also seem to make a definite impression on the brain substance. No doubt in some cases the amounts were calculated on the basis of weekly contributions of small amounts. A table of weekly meat bills shows that these were frequently estimated rather closely. Most housewives will be able to make a very good estimate on such an item.

Frequency Distribution of Church Contributions of 125 Families:

Nothing          42 families
$ 1.00            8   -do-
  2.00           20   -do-
  3.00           11   -do-
  4.00           11   -do-
  5.00           21   -do-
  6.00-10.00      6   -do-
 11.00-15.00      2   -do-
 16.00-20.00      2   -do-
Total           123 families

Frequency Distribution of Average Weekly Meat Bills of 190 Families:

$  .75            4 families
  1.00           14   -do-
  1.50           19   -do-
  1.75            8   -do-
  2.00           21   -do-
  2.50           20   -do-
  3.00           24   -do-
  3.50           17   -do-
  4.00           19   -do-
  4.50            9   -do-
  5.00           12   -do-
  5.50            3   -do-
  6.00            8   -do-
  6.50            3   -do-
  7.00            4   -do-
  8.00            2   -do-
  9.00            2   -do-
  9.50            2   -do-
 12.65            1   -do-
 17.43            1   -do-
 11.00            1   -do-
Total           190 families

A question of somewhat similar nature that obtained useful and significant results was the following: "How much has the sale price of wild land risen immediately around you since you settled here?" There were 48 settlers who specified definite amounts ranging from $2.00 to $25.00 per acre, about an equal number who specified no increase at all, 10 who said that it had doubled, and 10 who said that they did not know.
Upon correlating these answers with the number of years that the settler had been in the new country, the curve of increase in wild land checked very closely with the price actually paid by the settlers locating in the same areas. The related question, "If pieces of wild land are now being held for sale in your neighborhood, what is the price?", gave equally satisfactory returns, except that there was more of a tendency to state a range rather than a definite price. It was found that settlers know very definitely the price at which wild land around them is being held for sale.

When the housewives were asked to break up their general answers about the meat bill into expenditures for different classes of meat (beefsteak, roast beef, smoked and cured meats), their answers were not so certain. They have pretty definitely in mind the amount of their meat bills from their paying of them, but the distribution of this amount between kinds of meat calls for the exercise of judgment and not memory.

The housewives' answer to the question, "In buying meat, do you shop between stores?", was divided into three categories: "much" 28, "none" 112, "some" 57. The chances are that these answers described the buying pretty much as it was. The information asked for was a kind that could be furnished. The only indefiniteness about it was in the definition of "much" and "some".

The question as to receipts from labor off the farm in a cut-over land survey in many cases called for totalling in one figure a good many separate amounts, and the settlers simply made a lump-sum estimate, as is made clear in the following table:

Labor Off Farm by Cut-over Farmers.

Frequency Distribution of          Frequency Distribution of
Closeness of Estimates             Amounts
$100's        57                   Less than $149      14
  50's        20                   $150 - $299         31
  25's         5                    300 -  449         21
  10's         6                    450 -  599         11
   5's         1                    600 -  749          5
   1's         0                    750 -  899          2
                                    900 and over        5
Total         89                   Total               89

Fifty-seven of the 89 settlers gave their estimates in hundreds, 20 in fifties, etc. The frequency distribution for the question "Sales of timber products" looks almost exactly like that for "Labor off the farm". The estimates of amounts spent during the year for groceries were to the nearest $100 in 28 percent of the cases and to the nearest $50 in 25 percent of the cases.

Information as to expenditures and receipts and the like can in a large majority of cases be obtained with sufficient precision for the purposes of survey analysis if time enough is taken to get fairly complete information so that receipts and expenditures can be checked against each other. The first set of answers to the questions will very frequently show glaring discrepancies, but if the balancing can be done on the spot, a number of these discrepancies can be smoothed out and explained. This method was followed with fair success even with information as to receipts and expenditures during the first three years of settlers' experience in the cut-over country. The statement of assets at the time of settlement, combined with receipts and expenditures and borrowings during the three years following, was made to appear consistent with the statement of assets at the end of the period. The experience which farm management men have had with checking opening and closing inventories amply illustrates this point. The difficulty is that this checking oftentimes takes a long time and becomes annoying to the person interviewed.

Certain other questions of a definite type did not work out as satisfactorily in the settlers' survey because they were too personal in their nature.
About 40 percent of the settlers who were asked the amount of interest payments made on their land contracts or mortgages specified sums to the nearest dollar, but only one-half of the remainder were willing to admit that they had made no interest payments; the others either evaded the question, or indicated by their manner that they did not wish to answer it, or made an unreasonable statement. A similar question as to the payments made on the principal of the contracts yielded even less satisfactory results.

The question, "What cash have you on hand or on deposit in banks at present?", was answered definitely in about 80 percent of the cases. The rest evaded by saying that they did not know or in some other way. Enough of the settlers who answered had only a few dollars on hand so that it was not difficult to understand the embarrassment of the others.

Another type of question is that which distinctly calls for the exercise of judgment. An outstanding example of this is the question as to the value of the farm. Experience has demonstrated that in some cases such judgments are dependable and in other cases not. In the study of land values in Indiana by the Bureau of Agricultural Economics, the actual sales prices were available in some cases and the farmers estimated the values in other cases. So far as available tests could demonstrate, the latter were nearly as satisfactory as the former. However, when the farmer is asked to separate the value of the buildings from that of the rest of the farm, he may go very far astray. In one set of data of this sort, an inverse correlation was discovered between value of the land per acre and of buildings per acre, attributable to the fact that the buildings had been relatively overvalued on many farms and undervalued on others. This result is very likely to happen in a region where land values are declining.
The farmer assumes that the building values should be based on some theory of cost of reproduction now less depreciation, which of course is entirely erroneous in such a case.

The following table showing the results of an inquiry as to the value of household goods on farms in Northern Minnesota suggests that the estimates are not very close, and also a very general tendency to underestimate all items of this kind:

Value of Household Goods.

Frequency Distribution of          Frequency Distribution of
Closeness of Estimates             Amounts
$100's        70                   Less than $99       24
  50's        48                   $100 - 199          53
  25's        17                    200 - 299          38
  10's         7                    300 - 399          15
   5's         1                    400 and over       13
Total        143                   Total              143

The value of results of questions of this kind depends a great deal upon the way in which the question is asked. It usually is not safe to put a mere heading on the schedule, such as "Value of farm real estate", and expect to get satisfactory or uniform returns from the different field men. There is usually one definite way of stating a question which, if presented clearly, and repeated if necessary, will get the kind of judgment which is wanted. Either this must be done in the schedule itself or in the accompanying instructions.

As a result of trying to cover too much ground in a survey, oftentimes questions are included in bald outline which cannot possibly get worth-while information in that form. If the information is really wanted, it is necessary to go after it more thoroughly and systematically. This may be illustrated by a question asked in the land settlement survey, namely, "What live stock should the settler keep during the first few years?" No doubt the carefully considered judgments of actual settlers on this subject would be worth obtaining, but at least ten minutes would probably be needed to get them from the usual settler. First, there should be separate questions as to each of the various classes of live stock.
The answer will depend upon many circumstances such as the condition of the land at time of settlement, resources of the settler, the forage available in the community, the price of feed, the local market for farm products, etc. The answers will be worth much more if the reasons for them can be ascertained also.

In many cases the best practice is to list a number of items which are to be checked off. An inquiry as to vegetables grown in the farm garden should no doubt take this form. An inquiry as to sources of outside income should probably list all the reasonably possible sources. This is the only way to prevent omissions.

Inquiries for qualitative data should in most cases list all of the various answers that are to be expected, and the field man should stick with the subject long enough to feel certain which one of the answers most nearly fits. The alternative to this is to put the question in a uniform way and to record exactly the answers obtained. There are no doubt cases in which this is the preferred procedure. But ordinarily any classifying of answers can be done best by the field man while he has his man in front of him. An example of a question illustrating this point is the following: "What is the attitude of the settler toward the price which he paid for his land?" The answers obtained were not capable of being classified satisfactorily. The inquiry should have been handled in such a way as to have definitely classified the various settlers as to whether they thought the price was "too high", "all right", or "too low". Any qualifications which they wished to attach to their answers could have been included by the field men in addition to the foregoing and would have been of some value in the office work later.

Another example is an inquiry as to leasing arrangements.
It would be advisable to have the various leasing systems prevailing in the territory definitely understood in advance of the survey and listed on the schedule and described in the accompanying instructions. If the field men actually found a leasing system different from any of these or representing modifications of them, they should of course record these. The information obtained, however, would be much more accurate if as much as possible of it were reduced to definite leasing systems in the field.

Another inquiry illustrating this point related to whether the land company had misrepresented the land to the prospective settler. The answers obtained would have been more usable if it had taken the following form: "Was there misrepresentation? What was it?"

A type of inquiry of this same general sort is that which relates to attitudes. The most satisfactory procedure which has thus far been developed with respect to questions of this kind is as follows: First a preliminary survey is made of the subject and a list is made of the answers representing all the different shades of attitude. The various attitudes are arranged in order from one extreme to the other. The person interviewed is asked to read the list carefully and check the one that fits him best. Where this procedure is not possible, a substitute for it is asking one general question and having the field man read two or three of the attitude descriptions which seem most nearly to fit the answer given, in any case in which the field man is unable to classify it himself. In the land settlement study, an inquiry as to the attitude of the settler toward his settlement undertaking gave only partly satisfactory results because the various field men soon dropped into special ways of their own of classifying attitudes. It was possible to say, however, that a certain proportion of the settlers were optimistic and the rest were not.
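The practice recommended above, of listing the expected answers on the schedule in advance and having the field man classify each reply on the spot while noting any qualification, can be sketched in Python. The sketch is purely illustrative: the three categories are those suggested in the text for the question on the price paid for land, but the class names, field names, and the three sample replies are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class PriceOpinion(Enum):
    """The fixed list of answers printed on the schedule, in order from one extreme to the other."""
    TOO_HIGH = "too high"
    ALL_RIGHT = "all right"
    TOO_LOW = "too low"


@dataclass
class Reply:
    """One settler's answer as classified in the field, with any qualification he attached."""
    settler_id: int
    opinion: PriceOpinion
    qualification: str = ""


def tally(replies):
    """Count the replies in every listed category, including categories nobody chose."""
    counts = Counter(reply.opinion for reply in replies)
    return {opinion.value: counts.get(opinion, 0) for opinion in PriceOpinion}


# Hypothetical replies recorded by a field man.
replies = [
    Reply(1, PriceOpinion.TOO_HIGH, "bought before land prices fell"),
    Reply(2, PriceOpinion.ALL_RIGHT),
    Reply(3, PriceOpinion.TOO_HIGH),
]
summary = tally(replies)
print(summary)
```

Because every reply must fall into one of the listed categories, different field men cannot drift into private classifications of their own, while the free-text qualification keeps whatever detail may be useful in the office work later.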
An example of questions which do not specify definite qualitative answers, but obtain replies which are easily placed in a few classes, is the one reported following respecting quality of land:

Rough                5        Level           65
Rolling             40        Flat            24
Slightly rolling    21        No answer        3

One of course has a right to raise a question as to how significant these replies really were, in view of the fact that different settlers had different ideas as to what constitutes "rolling" or "slightly rolling" land. This came out even more clearly in an inquiry as to soil types. What many settlers called "loam", others called "sandy". This difficulty can be obviated in part by the method of having definite categories closely defined in the set of instructions and requiring the field men to follow up their questions more closely.

Many questions asked fail of their purpose because stated too indefinitely; for example, "Was the land inspected before purchase?" Although this got a definite "Yes" answer in 106 cases and a definite "No" answer in 9, it did not go very far toward supplying the information wanted because the term "inspection" means so many different things. Likewise, the question "What foods are most important in your household?" must surely have failed of its purpose because of the vagueness of the phrase "most important". The question "What supervision does a tenant receive?" is likewise altogether too indefinite.

Questions are frequently asked which the persons interviewed are not really qualified to answer; for example, the question "How many acres in your farm are permanently not fit for cultivation?" Most of them answered "none", when, as a matter of fact, the great majority of farms had tracts too stony, sandy, or swampy to make profitable cultivation possible. Similar questions are those which relate to the value of different sorts of goods and services on farms and in homes.
The inquiry in a consumers' study, "Are your purchases influenced by the price?", is probably open to the same objection. There were 57 who answered "Yes", 106 who answered "No", and 35 who said "Partly". The situation probably is that few of us really know, except in a vague general way that is meaningless, whether or not our purchases are influenced by price.

This same objection can usually be raised to most questions which ask reasons for conduct; for example, "Why did you become a settler?" There is no difficulty about getting an answer to a question of this kind, but how correct is it after it is obtained? Most of us have fabricated a reason for every step in our past records; but in more cases than not, some other explanation would be a better one. The question "Why do you eat meat?" brought forth answers which illustrate this point very well. The answers, in the order of their importance, were as follows: "Like it", "Want it", "Need it", "Habit", "Should have it", "Does not know", "Make a meal", "Everybody else does", "Health", "Nothing else to eat", "Nourishing". It would be interesting to put beside these answers a psychologist's statement as to why we probably eat meat.

Other questions probably get answers which are not significant because they do not fit the situation. For example, the question "What kind of meat do you prefer?" does not fit the usual situation, for the reason that most of us prefer no one kind of meat at any and all times. The answer (beef 137, pork 41, lamb 13, veal 12) does probably indicate in a very rough way the prevailing preferences of the population for beef, but there are many occasions when some form of the other types of meat is preferred.

There are many other types of questions which get answers which are interesting but whose value is questionable in that they misrepresent the situation or do not furnish a basis for a usable answer.
The number of persons in a thousand interviewed who used "Sunmaid" raisins does not tell us how many are using them because they have been advertised, nor whether the advertising campaign has been profitable. Not remembering specific points about advertisements of meat does not prove that meat advertising has not been profitable.

DATA OF PRIVATE AND PUBLIC RECORDS

An increasingly important source of data is the records of farmers, private business firms and cooperatives, and departments of governments. Most readers of this handbook are familiar with the work which Dr. Stine is promoting in assembling price and related data from records of farmers, rural newspapers and buyers of farm products.(x) The following reports upon projects at Cornell University outline the nature of the problem:

1. Data from milk dealers, by H. A. Rose:

Professor Pearson has requested me to write you relative to the method, cost, and other details of obtaining data from milk dealers. I can only give you my experience in this work, which has been limited to two major studies completed, one under way, and some minor work largely for specific information essential to the solution of some pressing problems such as rate cases, milk shed extension, etc.

In general, I have found the use of forms for the tabulation of data from dealers' books useless, because no two dealers maintain the same type of record and it is cheaper to make the necessary computations and changes in the office than in the field. Obviously the aims of the study must be kept well in mind to avoid transcribing useless information, but so far I have found blank tabulation paper the most desirable survey form.

Only among the larger dealers does one find complete records extending over any great period of time, with the exception of milk purchases, which are frequently kept for many years by even the smaller plants.
The business is so concentrated, however, that records from two or three dealers will frequently cover the majority of the business in a given market. Ordinarily, dealers are willing to cooperate in such studies, although when first approached they generally fail to see any advantage to themselves. After results of one study are available, they are usually eager to supply data for another. The chief difficulty in many cases is the inadequacy of records of the smaller dealers whose aggregate volume is an important part of the market supply. It is customary for milk dealers to cross-check quantities and values in their accounting work. For example, purchases of milk by plants must check with utilization. The percentage of fat as shown by the plant test must check with the drip test at the city plant.

(x) See paper on "Research in Prices of Farm Products" and Dr. Stine's discussion following, Journal of Farm Economics, Jan. 1928.

The cost of such work is fairly high. For the New York demand study, the cost of travel was approximately $600. Clerical help in New York cost about $150. In addition, three clerks here worked on the study for about a year. This time could have been shortened considerably by the use of a tabulating machine, which we now have.

The deliveries of milk by producers is a figure which is kept by most dealers and we hope to get it for practically every plant in the New York milk-shed and nearby territory. The information on each plant will consist of the following items: month, year, dealer, plant, months operated, state and county, railroad, freight zone, pounds of milk, number of dairies, pounds per dairy, ratio of November to June, test and grade.
We already have this information copied, or it is being copied for us, on about three-fourths of the milk supply, and we started work less than three weeks ago. Obviously, the other fourth of the supply will take very much longer, because most of the companies left operate but one or two plants.

2. Data from country milk plants, by C. K. Tucker:

I have given most of my time since August 1925 to the collection and compiling of data on the costs of operating 38 raw milk plants, 18 pasteurizing plants, 15 bottling plants, and 10 cream plants in the New York milk shed for one year. I visited the administrative offices of the several milk companies furnishing us these data, copied complete cost records for each plant, and got whatever other general information I could upon the operations of the plants. Then I visited each of the plants studied and, by means of a questionnaire, completed the collection of the data. The most important information obtained at the plants was the use of labor and supplies and a diagram of the buildings with the layout of the equipment.

3. Data from apple dealers, by Leland Spencer:

I might say that we obtained from each of 15 firms a detailed record concerning each car of apples shipped during the past three to five years. The nature of the information obtained is shown by the enclosed tabulation card.* Each of these cards is intended to give the information concerning a single lot of apples in a car. In the majority of cases, several lots of apples are shipped in one car, i.e., several different varieties, grades, sizes and the like. We have not as yet made a count of the exact number of cars for which we have records, but my impression is that the number will run up to between 5,000 and 10,000 carloads for the 1926-27 season, with somewhat fewer carloads for the previous years. I would estimate that there is an average of at least four lots of apples in each car.
In addition to the information indicated on the tabulation card, we also obtained data concerning the costs intervening between the shipping point and the first sale in the city on carloads of apples that were shipped on consignment or joint account. It will be necessary for us to use a second card for the tabulation of these data.

Regarding the uses of such information, I would say that we plan to determine the effect of variety, grade and size of fruit, the kind of package, method of sale, place of sale, accompanying varieties or products in the same car, and other factors upon the net price to the shipper. We are particularly interested just now in the kind of package, as there is a strong tendency to shift from the barrel to the bushel basket in this territory. We intend also to determine to what extent the differentials according to variety, grade, size, etc., are reflected back to the growers.

I should have stated that we also obtained data concerning the prices paid by the shippers to the growers. These latter data are the least satisfactory of all. The other data were obtained from the actual records kept by the shippers and seem to be entirely dependable. The prices paid to growers are not usually recorded in detail, and in only a few cases was it possible to obtain satisfactory records of this sort.

With regard to the laying out of the work, we first visited several shippers in order to ascertain what records were kept by them, then listed the items of information that we desired and proceeded to copy on ordinary tabulation paper the detailed records concerning each carload shipped.

* Date of shipment, consigner and his address, varieties, grade, size, number of packages, kind of package, gross and net price, kind of sale.

The copying of the data required the equivalent time of myself and two assistants for three months; the tabulation of the data, punching of cards, etc., will require something like the time of two girls for six months, plus the supervision on my own part.

4. Data on marketing of cabbage, by E. G. Misner:

Most of the tables are from the books of one produce dealer. These were a very complete, carefully kept set of records, extending back to 1894. We have another set of such books in the Department giving much data for different farm crops, both as to prices paid and prices received, extending back also to 1890. We have had these books in the office for three years, but due to pressure of other work have been unable to get them worked up. I think much useful information about the marketing of farm crops can be obtained from the study of such records. It is a cheap method of collecting information too, because the only expenditure is for clerical help.

5. Data from tax records, by M. Slade Kendrick:

It is my belief that here as elsewhere in statistical studies, the price of accuracy is eternal vigilance. This is particularly true in the case of those public documents which are assembled by men of little training. May I illustrate by speaking of an experience which I had today? In going through a supervisor's report for a county in this state, I found a table which purported to give the total compensation received by supervisors for their services in 1926. One item in this made me suspicious that not all compensation was included. Thereupon I searched through the report and found $3,200 which had been left out of the table. But errors sometimes creep into a public report compiled by a competent statistical staff. For example, in one of the tables of general property taxes published by a good state tax commission, local school taxes are included for only a few years and then dropped altogether. There is nothing in the heading of this table or the footnotes attached to it to give one an inkling of the treatment given school taxes.
There is, though, a significant difference in the tax totals which marks the division between the inclusion and exclusion of school taxes. This should make one suspicious.

Personally, in spite of the difficulties with public records, I believe that they are worthy of use for statistical purposes. There are abundant possibilities of error in any data, and I do not think that public reports are more liable to error than other statistical material. Often there is the difficulty that the material is not quite in the form which the investigator would prefer. But this is true of most statistical material except that gathered by the survey method.

6. Data from farmers' diaries and journals, by James E. Boyle:

We discover and salvage these old journals and diaries chiefly by means of an educational campaign put on thru our fifty-five Farm Bureau papers and the local and metropolitan press of the state. Twice I have conducted state-wide contests offering small cash prizes for descriptions of material of this kind which individuals happen to possess. When we locate an old document, we submit three options to the owner: (1) that he loan us the material long enough for us to copy it; (2) that he deposit it in our library subject to call when he wants it; (3) that he give it to us outright. In this way we have had several hundred old record books pass thru our hands, and we have perhaps twenty-five or thirty books which have been given to us. Rarely do we actually buy material of this kind. We did recently pay $25 for a farmer's diary covering 55 years, daily entries, and filling five large volumes.

Our first aim is to salvage this material before it is lost or destroyed. It is very rapidly disappearing forever. A great many of our records consist of journals or ledgers of small country stores. We assume that such data are almost 100 per cent dependable.
Also such records as farm wages, salaries paid school teachers, and usually prices received for products sold.

How this material is compiled depends, of course, on the use we wish to make of it. A recent doctor's thesis on the history of agriculture in this country, for instance, was based largely on something like 100 of these old account books gathered from this one county. The material was compiled both topically and chronologically. This perhaps gives you some idea of the work. I am satisfied that the six New England States should be combed in order to salvage and preserve these fast disappearing records.

Using Business Records of Farmers' Elevators.*

This report deals with the use of business records in connection with a 5-year study of farmers' elevators in the Spring Wheat Area begun with the 1924-25 season. For the first season, records were obtained from 40 farmers' companies; for the second season, from 61 companies; and for 1926-27, from 96 companies. North Dakota, South Dakota, Montana and Minnesota have cooperated with the Bureau of Agricultural Economics in this study. About 80 percent of the elevators included may be considered as operating on a cooperative plan, since they have provisions which limit the percentage of dividends payable on capital stock and permit the payment of patronage dividends if funds are available.

* By W. J. Kuhrt, Division of Cooperative Marketing, U. S. Department of Agriculture.

The elevators studied were selected on the basis of the completeness of their records, the attitude and interest of their manager and other officials, and their geographical location. An effort was made to obtain a sample stratified according to types of grain growing. Data on costs of operation, variation in protein content, and general information were obtained from practically all of the 96 elevators studied in 1926-27, but satisfactory records of hedging operations were available and usable at but 46 of them.

Two men in each state did the field work in 1926-27. The forms for recording the data were prepared by the Department of Agriculture. In the Spring Wheat Area it has been found expedient to begin field work about June 1, and an attempt is made to complete the field work each season before the new crop movement begins. At the conclusion of the field work some time is spent in copying audit data. Much of this material is available at the offices of auditing companies located within the area. Forms have also been prepared for taking off these data.

Complete detailed information is obtained covering organization and operating practices. The data are planned so as to permit special study of certain major problems confronting the group as a whole, such as those of incomes and costs of operation, storage, hedging, and variation in the protein content and premium values of wheat and durum.

For analyzing incomes and costs of operation, data are obtained to cover: (1) volume and sales of each grain and each sideline handled; (2) expenses of operation, including an estimated allocation between the grain and sideline business handled; (3) gross trading profit or loss from each grain and sideline handled; and (4) general information with regard to equipment, services rendered and other related factors. These data permit rather accurate and comprehensive determination of the costs of operation of each elevator, net income from the grain and sideline business handled, and costs and income per unit handled.

Storage operations have been analyzed in detail only for the 1924-25 season. Records of all storage transactions were obtained showing the bushels involved, length of storage period, and storage charges earned, collected and waived.

In the analysis of hedging operations, an attempt has been made to determine how closely each elevator actually kept hedged in each grain throughout the entire season.
This necessitated building up daily "long and short" statements for each elevator in each grain as at the close of each day during the season. From these statements, it has been possible to compute: (1) the losses or gains from price changes on all grains not hedged completely, (2) the actual financial outcome to the elevator, (3) the probable financial outcome if the elevator had kept completely hedged, and (4) the probable financial outcome if the elevator had used no futures for hedging. It has also been possible to compute the actual losses or gains of elevators from "spreads" between cash and future prices and between futures used for hedging purposes. All of this information is necessary if a complete and accurate analysis of the hedging operations by farmers' elevators is to be made. In obtaining field and audit data to permit such an analysis, it has been found necessary to obtain: (1) amounts of actual grain purchased each day; (2) amounts of actual grain sold each day; (3) complete records of future trading transactions, including open trades at beginning and end of each season; (4) market positions at beginning and end of each season, and much other general information.

Beginning with the 1924-25 season, data have been collected which permit the following analyses to be made: (1) extent of variation in protein content between wagon loads delivered by certain farmers, (2) extent of variation between cars sold, (3) premiums or discounts paid for each car sold, (4) computation of the changes in importance of certain of the important quality factors affecting premiums and prices paid at terminal markets from time to time throughout the season, and (5) methods used by elevators in reflecting premiums or discounts to growers. Special attention has been given the last-named feature during the last two seasons, since it has become one of the most serious problems with which farmers' elevators are at present confronted.
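
The daily "long and short" statement described above for the hedging analysis can be illustrated with a short sketch in Python. It is an illustration only and not the procedure used in the study: the record layout, the function names and the figures are hypothetical, and it assumes for simplicity that cash and future prices move together by the same amount each day, so that "spreads" are ignored.

```python
from dataclasses import dataclass


@dataclass
class Day:
    """One day's entries for one grain at one elevator (hypothetical layout)."""
    purchased: int           # bushels of actual grain purchased
    sold: int                # bushels of actual grain sold
    futures_sold: int        # bushels of futures sold (hedges placed)
    futures_bought: int      # bushels of futures bought (hedges lifted)
    price_change_cents: int  # price change per bushel from this close to the next


def long_short_statement(days, opening_cash=0, opening_futures_short=0):
    """Return (cash long, futures short, net long) as at the close of each day."""
    cash_long = opening_cash
    futures_short = opening_futures_short
    rows = []
    for d in days:
        cash_long += d.purchased - d.sold
        futures_short += d.futures_sold - d.futures_bought
        rows.append((cash_long, futures_short, cash_long - futures_short))
    return rows


def unhedged_gain_cents(days, rows):
    """Gain or loss from price changes on the grain not hedged (item 1)."""
    return sum(net * d.price_change_cents for d, (_, _, net) in zip(days, rows))


def no_futures_gain_cents(days, rows):
    """Probable outcome from price changes had no futures been used (item 4)."""
    return sum(cash * d.price_change_cents for d, (cash, _, _) in zip(days, rows))


# Made-up figures for two days of the season.
days = [
    Day(purchased=10000, sold=0, futures_sold=8000, futures_bought=0,
        price_change_cents=-2),
    Day(purchased=5000, sold=12000, futures_sold=0, futures_bought=6000,
        price_change_cents=1),
]
rows = long_short_statement(days)
print(rows)
print(unhedged_gain_cents(days, rows) / 100, "dollars on grain not hedged")
print(no_futures_gain_cents(days, rows) / 100, "dollars had no futures been used")
```

Under the same simplifying assumption, an elevator kept completely hedged would show neither gain nor loss from price changes, which is the comparison intended in item (3).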
Costs of obtaining such records vary of course with the number of records obtained, the quantity of material to be obtained at each place, the completeness and accessibility of the records, and the salaries as well as experience and efficiency of field men. It has been found economical to have two men travel together by automobile. Two experienced men can usually obtain a complete record from an elevator handling around 200,000 bushels of grain in about two days. If certain of the time-consuming records, such as, for example, hedging data, are omitted for any reason, the time may be materially reduced. A "short" record may be obtained oftentimes in less than a day. The copying of the audit data necessary for a complete record usually requires about two hours. In copying certain types of records, it has been found very expedient to have the two men work together, one reading off the data while the other copies. This works to advantage because of the difficulty experienced in keeping one's place in both account books and research forms at the same time and without errors.

Table 1. Estimated schedule of expense, 1926-27

                    : Bureau of    : Four State   :
                    : Agricultural : experiment   :
  Items of expense  : Economics    : stations     :   Total
  -----------------------------------------------------------
  Salaries          :   $ 1,100    :   $ 2,800    : $  3,900
  Subsistence       :       575    :     1,775    :    2,350
  Transportation    :       525    :       900    :    1,425
  Clerical          :     3,475    :       200    :    3,675
  -----------------------------------------------------------
    Total           :   $ 5,675    :   $ 5,675    : $ 11,350

The estimated schedule of expense for the 1926-27 season is shown in Table 1. This was the first season in which all four of the Spring Wheat states participated. Of the 96 records, 45 were complete and 51 were "short" records. The approximate average cost of complete records, including office work, in 1926-27 was $146, and of the "short" records $36. With a heavier run of grain, the costs would be higher.
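
The row and column totals of Table 1 can be checked with a few lines of arithmetic. The sketch below, in Python, simply re-adds the four items of expense as given for each agency; the variable names are ours and carry no significance.

```python
# Items of expense from Table 1, in dollars:
# (Bureau of Agricultural Economics, four State experiment stations)
table_1 = {
    "Salaries": (1100, 2800),
    "Subsistence": (575, 1775),
    "Transportation": (525, 900),
    "Clerical": (3475, 200),
}

# Total for each item of expense (the right-hand column of the table).
row_totals = {item: bureau + stations for item, (bureau, stations) in table_1.items()}

# Total for each agency (the bottom row of the table).
bureau_total = sum(bureau for bureau, _ in table_1.values())
stations_total = sum(stations for _, stations in table_1.values())
grand_total = bureau_total + stations_total

for item, total in row_totals.items():
    print(f"{item:<15}{total:>8,}")
print(f"{'Total':<15}{grand_total:>8,}")
print(f"Bureau: {bureau_total:,}   Stations: {stations_total:,}")
```

A check of this kind is worth making whenever a schedule is copied, since a single misread digit in a total is otherwise easily passed over.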
One of the most serious of the problems encountered is that many managers of farmers' elevators hesitate to give out information concerning certain phases of their business, such, for example, as their hedging policy or their methods of meeting competition. This condition is aggravated if the interviewer is a stranger to the manager. In some cases, the desired information has not been obtained, or statements made have been unreliable. In some cases this difficulty has been handled by obtaining the data necessary to check verbal statements made by managers.

Another difficulty is that many of the records kept are not complete and accurate enough for some parts of the analysis. It has been found difficult to stimulate the interest of managers enough so that they will keep the special records needed. Calling a meeting of managers and directors, at which the plans and methods of the project have been discussed, has helped somewhat. The interest of the managers has increased as the study has progressed. This is probably partly due to interest in the preliminary findings and partly to personal contacts and discussions.

Considerable difficulty has been experienced in obtaining complete audit data. The attitude of public accountants in the area has been unusually favorable, and for the most part their reports furnish the necessary information. But not so the so-called "local" audits, made by local accountants, bookkeepers or committees appointed by boards of directors. Such audits are seldom complete, are lacking in uniformity, and are often inaccurate because of inexperience or lack of proper training in accounting. Whenever possible, elevators having local audits have been omitted from the study.

The final problem encountered is that in order to obtain complete and accurate records, elevators must be selected which no doubt are above the average of elevators in the area in many phases of their operations. This applies particularly to bookkeeping and accounting methods.
But greater efficiency in these activities is often accompanied by greater efficiency in other respects. Thus the study will probably set up a standard of efficiency which is well above the average of the entire group in the Spring Wheat Area.

Using Commercial Audits as Sources of Data.*

The primary purpose of an audit is generally to show the financial condition of a business. Hence this part of an audit is generally the most complete. Experience with audits of Minnesota business firms (which have been chiefly local marketing enterprises) shows that the standard classifications indicate the methods of financing, but give little information as to the sources of capital. They frequently give only scant information as to the method of valuation of the assets, and are particularly lacking in detailed information that explains the character of the assets.

A comparative analysis of balance-sheet information is helpful, however, in pointing out the fundamental policies in financial administration, and sufficient supplementary information may often be obtained from an audit to explain variations in the distribution of assets or liabilities among the major groupings of items. Relatively large figures for "accounts receivable" may be explained, for example, by supplementary exhibits indicating a large sideline business and a policy of selling sidelines for credit. Likewise deferred charges may be explained by a supplementary statement of insurance policies. Or the relation of the "fixed" to the "current" assets may be explained by the nature of the commodities handled as given either in the weight statement (as of a farmers' elevator) or in the detailed statement of income.
Such comparative analyses, although they do not explain financial practices in great detail, are nevertheless worthwhile in pointing out important considerations in developing a financial program, and they do give a sufficiently fundamental index of financial policy as to provide much information that is useful in extension programs. (See Minnesota Bulletin 224, Management Problems of Farmers' Elevators.)

The "operating" statement of standard audits gives detailed information of costs and incomes (both gross and net), the two principal tests of the efficiency of a business firm. But like the financial data mentioned above, they are generally not sufficiently complete to permit anything like an adequate economic analysis of the factors affecting the efficiency of operation. The data on total costs and incomes are accurate, and the items that make up the total are also generally given. It is therefore generally possible to ascertain the reasons for variations in costs and incomes to the extent that they are made up of component items of outgo and income and these various parts vary between different business units.

* H. B. Price, University of Minnesota.

The itemized costs and income of a farmers' elevator business for a given year, as presented in a commercial audit, is likely to be about as follows:

        Operating Expense of a Farmers' Elevator

  Manager's salary ......................... $2400.00
  Directors' fees ..........................   188.00
  Repairs ..................................   129.00
  Rent .....................................    55.00
  Depreciation .............................   799.00
  Auditing & bookkeeping ...................   172.00
  Postage ..................................    33.00
  Stationery & printing ....................    45.00
  Telephone, telegraph .....................   172.00
  Legal fees ...............................    31.00
  Depreciation, furniture ..................    35.00
  Licenses and bonds .......................    10.00
  Market reports ...........................    90.00
  Income tax ...............................   309.00
  Donations and publicity ..................   342.00
  Clerk salary .............................  1200.00
  Extra labor ..............................  1389.00
  Power, light and heat ....................  1850.00
  Taxes ....................................   708.00
  Insurance ................................   744.00
  Interest .................................   357.00
  Miscellaneous ............................   284.00

        Operating Income of a Farmers' Elevator

  Wheat .................................... $ 217.00
  Flax .....................................  1689.00
  Rye ......................................  [illegible]*
  Oats ..................................... 11542.00
  Barley ...................................   245.00
  Corn .....................................  6026.00
  Fuel .....................................  1553.00
  Flour, feed, & seed ......................  2097.00
  Grinding .................................  3091.00
  Interest and discounts ...................   459.00
  Miscellaneous ............................   574.00

  * Loss.

Similar statements for other units will give us a basis for comparing the expenses and pointing out some of the reasons for variations in efficiency. One obstacle in the way of this is that different classifications of expenses and incomes are frequently used, and occasionally different kinds of information are included under the same items for different business units, with not enough supplementary data to enable the research worker to make the necessary adjustment of data between items.

But such information as the foregoing will not permit a full explanation of the reasons for variation in an item between business units.
Take the item of labor, for example. One may be reasonably certain that the figures for this expense are comparable, but one needs to go back of the figures to discover whether it varies because of differences in wage rates, amounts of labor used, or adjustments of labor supply to a seasonal business. An audit report is generally silent on such subjects. Even though adequate information for purposes of analysis may not be included in these audits, what they do contain is so easily secured as to repay well the effort.

Such data analyzed each year may also at small expense indicate the historical changes taking place in the industry or field of business. The accompanying summary shows over a 4-year period the change in the important tests of efficiency of elevator operation, namely expense, gross income, net income and volume. That important test of financial administration, net worth, is also given. Grouping these reports by districts would give geographical significance to the results.

Volume of business varies from year to year, and with it the per unit cost of operation, the net income and the net worth. Gross income (or operating profit) tends to remain fairly constant from year to year, fluctuating much less than cost of operation, and therefore than net income. Net worth seems to be slowly but surely increasing, which indicates an ability to meet competition successfully, and a conservative financial policy.

Such conclusions amply justify the use of such data in projects based solely upon them. As to their value as a source of data to supplement field records, no evidence is needed.

Summary of Expense and Income of Farmers' Elevators in Minnesota by Years.
(in cents per bushel)

                         Gross     Net
  Year       Expense    Income   Income    Volume    Net Worth
  1923-24      5.6        6.2      0.6     166,566    $15,868
  1924-25      4.3        7.5      3.0     205,800     19,703
  1925-26*     4.7        5.8      1.1     223,252     22,985
  1926-27*     6.2        7.9      1.5     139,622     24,299

  * Excludes elevators handling more sidelines than grain.

Sampling: Frequently the business units included in a study of this kind constitute nearly a complete universe. For example, it may be possible to obtain business records for as much as three-fourths of the total volume of business handled in the area. Certain sorts of generalizations can be very safely made with such a large proportion of the volume of business, but other sorts cannot. For example, data upon daily and seasonal consumption of milk can be generalized safely on such a basis, but probably not the data on delivery costs, since the very firms which were omitted are likely to be following very different practices in this respect.

Where a considerable number of business units are involved, then the principles of sampling apply in their usual way. The danger has already been pointed out that only the better units will be included because only these have usable records. In many cases there will be no escape from this situation, and the only procedure possible is to make clear that the conclusions apply only to units of this grade. If a certain amount of information can be obtained from the rest of them, it will usually be possible to indicate to what extent the conclusions apply to the whole group. It is advisable in many studies of this kind to select the business units in a much more nearly random fashion than at present, and obtain some sort of information from all of them, working it up in complete form only for those which are able to give it completely. The plan of sectioning the whole region and covering all the units in certain portions of it should also be followed in some cases.

There will be more situations in which the method of stratifying according to sets of conditions as above outlined will be desirable, than when farms are being surveyed. (See the discussion of this under the Survey Method.)
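
The suggestion of selecting units in a more nearly random fashion within sets of conditions, and working up complete records only for those units able to give them, might be sketched as follows. This is a minimal illustration in Python under assumptions of our own: the list of units, the grouping by type of grain, the number drawn from each group and the function name are all hypothetical.

```python
import random


def stratified_sample(units, stratum_of, per_stratum, seed=1928):
    """Draw up to `per_stratum` units at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so that the draw can be repeated
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    chosen = []
    for name in sorted(strata):
        group = strata[name]
        chosen.extend(rng.sample(group, min(per_stratum, len(group))))
    return chosen


# Hypothetical list of elevators, grouped by the principal grain handled.
units = (
    [{"name": f"W{i}", "grain": "wheat", "complete": i % 2 == 0} for i in range(6)]
    + [{"name": f"D{i}", "grain": "durum", "complete": i % 3 == 0} for i in range(4)]
    + [{"name": f"F{i}", "grain": "flax", "complete": False} for i in range(2)]
)

chosen = stratified_sample(units, lambda u: u["grain"], per_stratum=3)

# Some information is obtained from every unit drawn; only those with
# complete records are worked up in full, the rest as "short" records.
complete = [u for u in chosen if u["complete"]]
short = [u for u in chosen if not u["complete"]]
counts = {}
for u in chosen:
    counts[u["grain"]] = counts.get(u["grain"], 0) + 1
print(counts, len(complete), len(short))
```

Because every unit drawn contributes at least a short record, the worker can afterwards compare the complete-record units with the rest and say to what extent the conclusions apply to the whole group.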

Accuracy: In working with records of this sort, no one consideration is more important than that of definition of classes. Those taking off the records must be absolutely certain as to the practice followed in classifying various expense and receipt items. As pointed out by Dr. Price in his discussion of grain elevator records, different concerns follow different systems of classification. Working out a uniform system of classification into which the records of the various firms can be made to fit usually proves to be one of the most difficult parts of such a study. A certain amount of arbitrary splitting of items often proves to be necessary.

In such work the mistake is often made of grouping the items into certain large classes when the records are being transcribed and not recording the details of what has been so included. This frequently destroys the value of the data. The best practice to follow in transcribing records of this sort is to bring all the details back to the research laboratory.

Mere figures in dollars and cents of receipts and expenditures oftentimes have very little research value. It is the relationships which can be established between these and other factors that are significant. Some of these other factors may relate to physical volume of commodities handled, physical facts about layout of the plant and equipment, details as to organization of the work and the personnel of the business, and distribution of labor among various activities. Oftentimes much of this sort of information cannot be obtained with a high degree of accuracy. More effort spent upon getting such information with reasonable accuracy is often more to the point than getting expenses and receipts accurate to the last cent.

(3) SUPERVISED RECORDS

Because farmers, small marketing agencies and the like rarely keep financial and operating records which are adequate for the purposes of research, workers have frequently resorted to the practice of assisting them in the keeping of such records or even keeping the records in large part for them.

The most common procedure in this respect is to introduce a standardized system of records. Regulatory agencies, with power to require uniform records, have long followed this practice. Trade associations are following this practice increasingly. The plan was early accepted in the field of marketing. In the beginning, however, rarely did enough marketing units adopt these systems to furnish any considerable body of research material. Recently it has met with much more success in this field. Iowa, for example, has a considerable number of livestock shipping associations keeping such records, and a group of about fifty creameries has just completed one year of records. Illinois and Ohio have also been successful in getting uniform systems used by various types of marketing agencies. No doubt this method of obtaining data for marketing units will be used a great deal in the future as more funds become available for research in this field.

The difficulties encountered are easily imagined. One of the most important is that a system which will satisfy the largest and best managed units will not suit the weakest and poorest managed. It is feasible in some cases to introduce two different systems. Also, generally speaking, the units which will keep any kind of record are considerably above the average. The results may have research value for the last three of the four objectives named earlier, but surely not for the first objective, namely, to present an accurate description of conditions. Another difficulty is that managers or secretaries are constantly changing.
It is usually necessary to visit all of the units adopting a uniform system at least at the time of its installation and at the close of the first year. Additional visits may be needed early in the first year, and there is likely to be a considerable body of correspondence for a while.

An example of the use of this method in Iowa with household activities is described by Dr. Elizabeth Hoyt in the Journal of Farm Economics, April 1927 (pp. 218-20). At that time, 62 families were helping to keep such records.

In the field of farming operations, the more usual practices along the line of supervised records take the form of route studies, the farms being visited at intervals during the year as well as at the beginning and end, sometimes as often as once a week. Mr. J. R. Hutson discusses this in his article just below. The subject is not complete without including the sort of farm financial and production records that Illinois and a number of other states are trying to get large numbers of their farmers to keep. These records are supervised to the extent of having them checked over carefully in a central office. No doubt the major errors are detected in most cases. Mention must also be made of the poultry and other similar records that are being kept in a number of states, usually under the inspiration of extension specialists. The research value of these is narrowly restricted to certain objectives, but nevertheless real for these objectives.

Route Records of Farming Operations.*

Supervised farm records of the route type may be divided roughly into two groups: (1) those closely supervised and (2) those not so closely supervised. The kinds of information obtained and forms used, the frequency of visits in supervising the records, number of records included in the unit of study, the cost, the kinds of information best obtained and the uses to which they are best adapted will be discussed briefly for each group.

* J. R. Hutson, Bureau of Agricultural Economics, U. S. Dept. of Agriculture.

Closely Supervised Records

1. Information obtained and forms used:

The information usually obtained by closely supervised records, and the forms used, include the following:

(a) An inventory of the farmer's resources including real estate, equipment, work stock, other livestock, feed, materials and supplies on hand, the farm labor supply and notes on the managerial ability of the farm operator. The inventory is filled out by the supervisor when the record is started, usually from January 1 to March 1, with the assistance of the farmer-cooperator. The following details are shown:

Real Estate: The number of acres of land and its estimated market value are recorded. All buildings are valued separately. The water and lighting systems are valued separately from the buildings to which they are attached.

Livestock: The name, age, approximate weight, and estimated market value of each head of workstock, each dairy cow and each beef cow is recorded. The other livestock is usually grouped by classes, ages, weight and breeding, and the value of each group recorded.

Machinery: The number of pieces, size and value of each of the larger pieces of machinery is recorded. The value of small tools, such as carpenter's tools, blacksmith's tools, etc., is usually recorded by groups.

Feed, Materials and Supplies: The amount and farm values of all feeds, materials and supplies are recorded.

Growing Crops: The number of acres of each of the crops for which labor has been performed the previous year, the amount of this labor, and the amount and value of the seed and fertilizer used for each of these crops is recorded separately. The man labor and horse work is reported by operations, and the times-over and approximate dates for each operation shown.

Managerial Histories: The managerial history and labor supply card is filled out at the end of the year by the supervisor.
(b) A clear day-to-day picture of the work performed on each farm, including the amounts of man labor and horse or tractor power used by operations, with the kind and size of the implement used, for different crops and kinds of livestock. The complete time of each regular workman is accounted for.

This information is recorded daily in a labor book by the farmer-cooperator and checked and taken up by the supervisor at each visit.

(c) The amounts and market value of the different kinds of feed and pasturage consumed by the different classes of livestock, including rations and feeding practices.

Ration changes for each class of livestock are recorded when made by the farmer-cooperator on ration cards or sheets. The ration reports are checked by the supervisor at each visit. In determining the rations used, in some cases the concentrates and concentrate mixtures are weighed by the farmer-cooperator and in some cases they are weighed by the supervisor. In other cases, careful estimates are taken. The rations of hay and rough feeds are weighed in some cases and in other cases the weights are carefully estimated at the time of feeding.

Usually a monthly statement is filled out by the supervisor with the help of the farmer-cooperator. These statements show the amount of each kind of feed used on each kind of livestock during the month. In addition to the yearly feed inventory mentioned above, often two other inventories are taken, one shortly before hay is harvested, and another shortly before corn is harvested. The ration reports and feed statements are adjusted on the basis of these feed inventories and the purchases and sales of feed during the period.

During the pasture season the supervisor with the assistance of the farmer-cooperator makes a record of the pasture changes at each visit.
These records show the field on which each class of livestock is pastured, the number of days on pasture and the number of days of pasturage furnished by each field during the year.

(d) Crop histories including the amounts and costs of seed, twine and other materials used by different crops, the amounts and market value of yields of the different crops and the times-over for the different labor operations with the size and kind of implement used and the number of acres covered for the different operations.

A crop history sheet is filled out for each crop, and when radically different practices are used on different fields of the same crop, a separate report is made for each field. In the latter case, the labor for each field as well as the crop history data are also reported separately. These sheets are filled out by the supervisor with the assistance of the farmer-cooperator. The amounts of seed and fertilizer used are recorded at the first visit to the farm after a crop is planted. The number of times-over for each operation, together with the size and kind of implement used, is recorded at the first visit after the particular operation is completed. The yields, likewise, are recorded at the first visit after the crop is harvested. An effort is made to weigh the production of some crops on some farms each year. Usually in the case of small grain the bags are counted and a few weighed. In the case of corn and hay, a few wagon boxes or loads are weighed and a record kept of the total number of loads.

(e) A weather and soil condition report including the amount of rain falling from 6 A. M. to 6 P. M. and from 6 P. M. to 6 A. M. each day and a general statement as to weather and soil conditions during the week.

This record is filled out by the supervisor at the end of each week from notes made by farmer-cooperators on the labor records.

(f) Livestock histories, including the methods and practices followed in handling the different classes of livestock, the calving, foaling or farrowing dates for the different classes of livestock, the weights and market value of animals when transferred from one class to another or when radical changes are made in feeding practices, the weight and cost of the different classes of animals when bought, the weights and receipts of the different classes of animals when sold, and explanation of variations in gains and feed consumption.

Monthly livestock history sheets are filled out by the supervisor with the assistance of the farmer-cooperator at the first visit to the farm after the first of each month.

(g) Expenses and receipts, including the amount of all items that can be reduced to quantities and the particular enterprise to which all expenditures or receipts belong.

An ordinary cash journal is kept by the farmer-cooperator, who makes the entries at the end of the day during which transactions take place. This cash record is checked by the supervisor at each visit.

(h) The farm layout including a farm map showing the size, shape and location of all the different fields and location of all buildings and sources of the water supply.

This map is made by the supervisor and assistants. An ordinary plane table is used and the acreage is calculated with the use of a planimeter. When the map is completed it is checked over with the farmer before the acreages are accepted.

(i) The source, quantity and value of miscellaneous products furnished to the home by the different enterprises on the farm.

Cards showing the amounts and value of these products are kept by someone in the household. The different items are recorded when they are taken into the home rather than when they are consumed. The number of eggs produced each day is also usually recorded on these cards. A card is provided for each month, which is checked at the end of the month by the supervisor.

2. Frequency of visits: In obtaining the information indicated above, each cooperator is visited from 2 to 4 times each month.

3. Unit of study: The number of records that one field man can handle is generally the unit of study. One man can handle from 20 to 30 of these records. Cooperators for these records usually are selected within a radius of 15 miles of some central point.

4. Cost: The cost of the field work in supervising from 20 to 30 records as described above usually ranges from $2000 to $2500 per year. The expenses for travel usually are about $50 per month and the remainder is salary. In cases in which a man is employed as supervisor with the view of having him become familiar with the study and conditions in the section so that he may do more effective extension work after the research study is completed, the expenditures for the field work usually will exceed $2500 per year.

5. Kinds of information best obtained: Of the records listed above, labor, crop history, weather and soil and livestock ration reports require most careful supervision. The other records may be obtained accurately enough for most uses with less supervision than is indicated.

6. Uses for which best adapted: Closely supervised records are especially useful in showing the amounts of seasonal man labor and horse work used for crop and livestock enterprises and systems of farming, the amounts of man labor, horse work, materials and cash used in growing crops and the amounts of man labor, horse work, feed, materials and cash used in handling livestock, and the results obtained when different practices are used in growing crops and feeding livestock. They may be effectively used in connection with experimental and price data in providing a basis for comparing enterprises and systems of farming.
They are used in working out normal production relations for the different crop and livestock enterprises, in outlining systems of farming adapted to typical sets of resources and conditions, and in estimating the returns that may reasonably be expected from different enterprises and systems. In this connection the chief function of the records is to localize the experimental data.

Because of the detail obtained, closely supervised records are especially adapted to illustrative uses. They are used to show how successful farmers have combined different enterprises into systems of farming that have proved profitable and how the complementary, supplementary and competing relations have been taken into account in working out these systems. Such illustrations are extremely useful to farm management workers.

They are used to illustrate variations in production inputs and practices that have resulted in low inputs. The data made available by the labor, crop history and weather reports for crops and the ration reports for livestock in particular are adapted to these technological extension uses.

Other closely supervised records: In some cases, detailed data as described above will be obtained for only one crop or livestock enterprise, and inventories, expenses and receipts obtained for the other enterprises. In such cases one supervisor usually will handle from 50 to 75 cooperators and visit each cooperator from 1 to 2 times each month. When detailed enterprise records are obtained for a class of livestock, the visits are from 3 to 4 weeks apart with additional visits at critical times, such as the farrowing season for hogs or the calving time for cows. In the case of crop enterprises, visits are made every two or three weeks during the season when work is being done on crops and at the end of the year.
In addition to the uses indicated above, usually enough data are obtained from enterprise records of this kind to permit an analytical study of the results obtained when different practices are followed.

Less Closely Supervised Records

1. Information obtained and forms used: The information usually obtained by the less closely supervised records, and the forms used, are as follows:

(a) An inventory similar to that described under closely supervised records and obtained in a similar manner.

(b) The amounts and market value of the different kinds of feed and pasture consumed by the different classes of livestock. Notes are made by the farmer-cooperator as to feeding practices. Detailed rations are not usually shown. The quantities of feed used over a period of two or three months are usually recorded by the supervisor at each visit. Inventories are used in checking the feeds as indicated under closely supervised records.

(c) Livestock history records similar to those described under closely supervised records are kept. Usually records for two or three months are filled out by the supervisor at one time.

(d) Crop histories include the amounts and cost of seed, twine, and other materials used by crops, and the amounts and market value of the yields of the different crops. These data are usually obtained by the supervisor at the end of the year.

(e) The expenses and receipts are recorded by the farmer-cooperator in a manner similar to that described under closely supervised records. The records are checked less frequently than in the case of the closely supervised records.

(f) The source, quantity, and value of miscellaneous products furnished to the home by different enterprises on the farm. These data are usually obtained by the supervisor at the end of the year on forms similar to the monthly forms used for the closely supervised records.

2. Frequency of visits: Each cooperator is visited from 3 to 6 times during the year.

3. Unit of study: One field man can handle 125 to 150 records such as are described above. This number, together with a few handled by someone in charge of the study from the office, is generally the unit of study.

4. Cost: The cost of the field work in supervising from 125 to 150 records as described above usually ranges from $3000 to $3600 per year. If more records are included, the cost will be proportionally larger. Cooperators for these records are usually selected within 3 or 4 counties in the same type of farming area.

5. Kinds of information best obtained: Loosely supervised records are especially well adapted for getting cash expenses and receipts and quantities of feeds used by different classes of livestock. Labor record data showing the number of times-over for each operation for crops and ration records for livestock are not easily obtained by such supervision.

6. Uses for which best adapted: The records described above are especially valuable when used in connection with experimental data in providing a basis for comparing enterprises and systems of farming. They are useful in working out normal production relations for different enterprises, in outlining systems of farming adapted to given sets of resources and conditions, and in estimating the returns that may reasonably be expected from different systems of farming. They do not make available detailed data showing the seasonal amounts of man labor and horse work used by the different enterprises.

In a complete program of farm management research it will often be advisable to use some closely supervised records, some less carefully supervised records, and some survey records. Generally enough closely supervised records should be included to provide a basis for showing the seasonal distribution of man labor and horse work for the principal enterprises under usual conditions in the section.
These may be used to supplement a larger number of less closely supervised records showing the feed used for the livestock enterprises and the cash expenses and receipts from different enterprises. These records may be supplemented by survey data for still more farms in the area in order to round out the sample and provide data for some enterprises not covered in the route records. Production data obtained from these different sources, when used in connection with experimental and price data, will provide a basis for comparisons between enterprises and for working out profitable systems of farming.

Use of the Method: The prospects are that research workers in the future will resort much more frequently than in the past to various forms of supervising the keeping of financial and statistical records. Such practices are expensive, but frequently they are the only way to get records which are sufficiently accurate. As research work becomes more refined, the tendency will be to resort more and more to methods giving more accurate data. Early work in any field very properly uses macroscopic methods, but the time is reached when the important problems require a more microscopic examination.

This does not mean that some particular type of supervised route method will come to be standardized. It means instead that practices will be developed which are appropriate to a large assortment of needs and situations. In some cases, narrowly restricted subjects will be studied for brief periods over a large sample. In other cases, similar subjects will be studied over a longer period with a very small group. There will no doubt be extensive developments along the line of having the participants keep records for part of the information needed. Public and private agencies will in more cases be induced to adopt systems of records which will furnish data in the form wanted, so that the route feature of the method will largely disappear.
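The cost figures quoted above for the two groups of route records can be put on a per-record basis, which is perhaps the plainest way to see what the closer supervision costs. The sketch below is only illustrative arithmetic in Python. The function name is hypothetical, the lower cost figure for the less closely supervised group is taken as $3000, and the pairing of the lowest cost with the largest number of records (and the reverse) to obtain the extremes is an assumption, not something stated in the text.

```python
def cost_per_record(records_low, records_high, cost_low, cost_high):
    """Return the (lowest, highest) annual field cost per record.

    Assumption: the extremes are found by pairing the lowest cost with
    the most records and the highest cost with the fewest records.
    """
    return (cost_low / records_high, cost_high / records_low)


# Figures as quoted in the text (records per field man, dollars per year).
closely = cost_per_record(20, 30, 2000, 2500)
# The $3000 lower figure is an assumed reading of the quoted range.
less_closely = cost_per_record(125, 150, 3000, 3600)

print("closely supervised:      $%.2f to $%.2f per record" % closely)
print("less closely supervised: $%.2f to $%.2f per record" % less_closely)
```

On these assumptions a closely supervised record costs several times as much in field work as a less closely supervised one, which bears on the statement that cost is the limiting circumstance.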
Sampling: The route method can be conceived as collecting data to be analyzed by the methods of either formal or informal statistics, or by the case method. Most of the material thus far collected by such methods has represented too small a group to make the methods of formal statistics applicable. More than this, the determination of the units to be included in the group has been so at variance with the principles of scientific sampling that the statistical summarizations have only limited validity as such. Analyzed, however, according to the methods of case procedure, they offer many conclusions of great usefulness. This will be better understood following a reading of the section on "The Case Method" later in the handbook.

There is no reason, however, why samples of sufficient size and representativeness cannot be handled by various forms of supervised records, except that for many types of studies and in many situations the cost is prohibitive. In due time this will not be so limiting a circumstance as at present.

Accuracy: While there can be no doubt that a system of supervised records gives more accurate data than the survey method, it is very easy to exaggerate the accuracy of the data obtained even by this method. Even though a farm is visited once in two weeks, much of the data recorded is in the nature of estimates and it is very easy for bias to creep into them. The system of recording utilization of time affects the results. Estimates of quantities of feed consumed reflect the field man's ideas of measures. Other sorts of data cannot be recorded with accuracy short of an unreasonable amount of painful effort. A good example of this is the consumption of forage by livestock on pasture. The mere recording of the number of days of pasture use leaves room for a wide range of variation.
The digestible nutrients consumed by heavy feeders are likely to have a bias in one direction, and those by light feeders a bias in the opposite direction. Finally, a statement of such things as the value of vegetables consumed by the farm family, of the value of the labor of children, of the value of the labor of the housewife in making clothes for the children, of the proportion of the supervision of the proprietor which is given to the various departments of the business, is likely to be no more valid than that obtained by the field man using the survey method.

(4) MAILED QUESTIONNAIRES

There are two types of uses of mailed questionnaires which need to be carefully distinguished: those in which the questionnaire is sent to a list of regular reporters, and those in which it is sent to a fresh list each time. The first is the method of the Division of Crop and Livestock Estimates and will be discussed in a later section under the head of "Data from Reporters".

The method of approach to this subject of mailed questionnaires has been as follows: First, Professor J. D. Pope of the Alabama Polytechnic Institute sent out an inquiry to all the divisions of agricultural economics in the United States asking for a report upon their experience with the use of the questionnaire. Professor Pope's summary of this inquiry is published as he made it. Second, a comparison was made of the 380 North Carolina returns from the Federal questionnaire on farm income with the 1115 survey returns previously mentioned. The returns upon the cost inquiry of the Bureau of Agricultural Economics were also analyzed to a limited extent. Finally, Mr. J. A. Becker of the Division of Crop and Livestock Estimates was asked to summarize his experience with the form and contents of mailed questionnaires.
The Use of Questionnaires*

* Professor J. D. Pope.

A questionnaire may be defined as a set of questions to be answered by the informant without the personal aid of an investigator or enumerator. Usually the questionnaire is sent by mail, but it may be distributed in person; in either case it is filled out by the one supplying the information.

An inquiry sent to departments of agricultural economics shows that the questionnaire is used to a rather limited extent in agricultural economic research in the state experiment stations. Some of the departments have not used it at all, and most of them apparently have used it only occasionally. Very few may be said to have used it extensively.

Proper Use: Those who have used the mailed questionnaire agree that it has a place in research, though a limited one. This statement of course applies to first studies, not to the continuing type of research in which reporters are used. In general, it appears to have been useful mainly in supplementing data obtained in other ways.

The mailed questionnaire is a convenient method of obtaining a limited amount of information from a large number of persons or from a small selected group which is widely scattered. Since the questionnaire is necessarily limited in the scope and amount of information it can carry, it is obvious that it cannot be considered a satisfactory substitute for the more elaborate and comprehensive survey and supervised record methods. It is also obvious that with the questionnaire there is no opportunity for explaining personally to each individual the objectives of the study and the precise information wanted in answer to each question. In spite of these limitations, however, those who have used the mailed questionnaire have found it useful in economic research for the following purposes:

(1) Supplementing a study at certain points.
(2) Providing preliminary information necessary in outlining a research project or preparing a schedule.
(3) Obtaining primary data in certain types of projects in which it is the chief source of information.
(4) Enlarging the background of a problem and filling in the gaps between areas.
(5) Checking up on changes.
(6) Checking secondary data already available.
(7) Disclosing new problems.

Questionnaires on most problems have their ideal function as a means of supplementing and giving a geographic generalization with reference to points on which the use of the interview method has previously revealed the situation in a more localized and intensive manner. The mailed questionnaire is also a means of reconnoitering for the sake of getting geographic differences.

Representativeness: If a questionnaire is sent to a list of farmers selected at random, the better educated and more intelligent types of farmers tend to reply, so that the returns do not represent a random sampling or a cross section. Besides, many, perhaps most, of the mailing lists used in sending out questionnaires, such as the mailing lists of colleges, membership lists of cooperatives, and names furnished by county agents, are composed of selected farmers. Those replying represent a still further selected group. The farmers who would reply to a questionnaire on egg production would likely have larger and more successful poultry enterprises than those not replying. Replies to questionnaires sent to bankers, wholesalers or leading business men may be more representative of the entire group than replies from farmers.

While the returns from questionnaires usually represent a selected group, sometimes a selected group is desired. For example, an inquiry dealing with fruit growing may be answered chiefly by the more successful commercial growers, which may be just what is desired in the particular inquiry.

Certain types of questions referring to the community rather than the individual farm may give a cross section regardless of the type of farmer replying. For example, data on the age of horses or changes in the acreage of a crop in a community might be determined satisfactorily from estimates made by farmers as to the situation on surrounding farms.

Accuracy: In the few instances reported, those who had checked the accuracy of questionnaires had found the returns were substantially correct. In a poultry marketing study, the trade practices reported by questionnaire were found to check fairly accurately with the data obtained from personal interviews. In some cases the returns had corresponded with census data, in others with data on hand collected in other ways. Returns on an inquiry as to factors affecting the value of land in cut-over farms gave as good returns as the individual census schedules for similar farms.

Quite frequently, however, it is apparent that while the data may be fairly accurate as an aggregate of the farms or concerns they represent, they are not accurate enough for any one return to permit much analysis of variations. With a large number of returns, with no bias likely to be present, the errors may be compensating so far as averages are concerned, but the data are not accurate enough for individual record analysis or for use in correlation studies.

Percentage Return: The number of replies which may be expected from a given mail inquiry varies with the type of individuals to whom it is addressed and with the nature and number of the questions, from practically none in extreme cases to practically 100 percent in case of a small highly selected list, such as extension specialists or managers of cooperatives. Probably an average percentage return from a general mailing list of farmers is about ten percent. A short inquiry to bankers or a selected business group may bring as high as 50 percent returns or more.
The better educated and more intelligent the group addressed, other things equal, the larger will be the returns. The more readily understood and the more readily answered the questions, the greater will be the returns, other things equal.

According to several replies to the inquiry, from three-fourths to 100 percent of the questionnaires returned to them are usable. Careful preparation and trying out the form in advance will greatly improve the usefulness of the replies.

Form: The questionnaire should be brief, usually all on one page including the letter. A formidable appearance should be avoided. It is perhaps best not to label it "Questionnaire". The questions should be numbered for convenience in tabulation.

While no arbitrary rule can be given as to the optimum number of questions, it may be stated as a general rule that the returns from most questionnaires could be greatly increased by reducing the length. The ease with which the questions may be answered, however, is important as well as the number of questions. A relatively long list of questions each of which could be answered with "yes" or "no" or a check mark might require much less time for filling out and bring larger returns than a shorter form with questions calling for more thought.

Questions: The questions should be short and should contain only one clear-cut idea. The words and phrases used should be readily understood by those to whom they are addressed. Local usage of words should be kept in mind. Trade terms should be used when appropriate and they should be used with accuracy.

Above all, each question should admit of only one possible interpretation, leaving no doubt either as to meaning or as to how it is to be answered. The following is an example of an ambiguous question: "Which is more profitable, the production of apples or [illegible]?" As was to be expected, the answers to this question were worthless.
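The percentages quoted earlier in this discussion, a return of about ten percent from a general mailing list of farmers and from three-fourths to 100 percent of the returns usable, can be combined to judge how many forms must be mailed. The following Python sketch is illustrative only. The function names, the mailing of 1000 forms and the target of 300 usable returns are hypothetical, and the percentages are treated as fixed although in practice they vary widely with the list and the questions.

```python
def expected_usable(mailed, return_pct, usable_pct):
    """Expected number of usable returns from a mailing.

    return_pct: percent of forms mailed that come back.
    usable_pct: percent of the returned forms that are usable.
    """
    return mailed * return_pct * usable_pct / 10000


def mailing_needed(target_usable, return_pct, usable_pct):
    """Smallest mailing expected to yield the target number of usable returns."""
    yield_per_10000 = return_pct * usable_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_usable * 10000 // yield_per_10000)


# Hypothetical mailing of 1000 forms at a ten percent return.
print(expected_usable(1000, 10, 75))   # three-fourths of returns usable
print(expected_usable(1000, 10, 100))  # all returns usable

# Hypothetical target of 300 usable returns.
print(mailing_needed(300, 10, 75))
print(mailing_needed(300, 10, 100))
```

A calculation of this kind is only a planning aid; a trial mailing gives a better estimate of the return to be expected from a particular list.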
Another illustrates the extreme care that must be used in laying out questions: "Do you grade eggs for size ___? color ___?" A few of the answers to the latter part of the question were "brown" or "white".

It is also useful when possible to organize and draw up the questions in such a way that they furnish a check on each other. Questions are preferable which are to be answered "yes" or "no", or by a single figure, or check mark, or by checking or underlining one word of a listed series.

When a percentage distribution is called for, as what per cent of the money borrowed for crop production is used for fertilizers? for hired labor? for other purposes? it is well to have the per cent figures recorded one under the other and a total of 100 per cent written at the bottom of the column. This will help the one replying to check his own figures.

Questions of a personal nature, as for example, the amount of cash on deposit in bank, may be practicable in a personal interview, but are to be avoided in a mailed questionnaire. Similarly, questions involving statements of failures or unfavorable returns from an enterprise are likely to reduce the percentage of returns. Poultry departments have found that reports on egg production and mortality fall off when the answers do not make a favorable showing. If the data are thought by the one questioned to involve a check-up on his tax payments or business affairs, he is likely not to reply unless he understands the object of the inquiry and the confidential nature of the information. In some cases heading the page with the word "confidential" may prove helpful. Questions involving a great deal of bias should be avoided unless there is some way of measuring and allowing for it.

Cost: Most of those who supplied data on their use of the questionnaires have used the frank. Such studies have usually been conducted in cooperation with the United States Department of Agriculture.
The use of the frank of course reduces the cost considerably. If four cents postage is needed for each questionnaire mailed out, including a stamped envelope for reply, and a ten per cent return is received, the cost of postage alone per questionnaire returned will be forty cents (four cents divided by one-tenth). Such a cost seems excessive when it is remembered that in some states the federal agricultural census has been taken by local workers at less than that cost per schedule. The estimates of the costs of questionnaires varied from 1 to about 5 cents per questionnaire sent out where the frank was used.

Mailing Lists: The sources of mailing lists for farmers are usually the college mailing lists, names supplied by the county agents, and cooperative membership lists. Names of wholesalers, merchants and other business groups may be obtained from secretaries of trade associations. Names of consumers and dealers are often obtained from city and telephone directories.

General: It is important to plan the questionnaire carefully. Its objectives should be clearly outlined. Its relation to larger problems should be understood. The general make-up of the form should be tested in advance. The methods of analyzing, tabulating and digesting the data to be obtained should be carefully worked out in advance.

An excellent plan which is used by some is to try out the preliminary draft of the questionnaire in a small way before sending it out in final form. If 2000 forms are to be sent out, 100 might be sent out in advance, and the returns from these examined carefully to see that the information sought is being supplied and that no questions are being misunderstood. A trial tabulation can then be made of these few returns as a check on the readiness with which the results will lend themselves to tabulation. The form is then revised if necessary, and it usually is.
Such a procedure usually results in making the returns from the inquiry much more fruitful than they would otherwise be.

Form of Questionnaire*

* By Joseph A. Becker, Division of Crop and Livestock Estimates.

Experience of the Division of Crop and Livestock Estimates in securing information from farmers by use of schedules sent and returned through the mails has indicated the unquestioned superiority of the printed schedule over multigraphed, mimeographed or otherwise duplicated schedules. The printed schedule has a distinctly neater appearance, gives an impression of brevity, and can be written upon either in ink or in pencil. The percentage of returns to printed schedules has been uniformly higher than to duplicated schedules.

Some experiments have been made by the Division with respect to the variation in the percentage of returns to different colors of paper. There is apparently no great difference in the return to a regular inquiry by regular reporters, but in intermittent inquiries and in inquiries to new reporters, the color of the paper has some effect. The regular inquiries sent out from the Division office in Washington are always printed on white paper and those sent out from the field offices are ordinarily printed on green paper. The "pulling power" of these two colors is not greatly different. If these be taken as standards, it is found that light pink is more effective with new reporters and intermittent reporters, and light yellow slightly better than light pink. However, the constant repetition of light pink or light yellow to intermittent reporters apparently results in a considerable loss of the "pulling power," so that alternation of these colors gives better results than the use of either color continuously.

Where the inquiry relates to a particular economic condition or situation, it is often worth while to include a brief statement of the situation up to the time of the inquiry.
For example, in the special inquiry made to crop reporters concerning the number of horses of various ages on crop reporters' farms, a statement was included to the effect that "current information, and the trends of horse numbers on farms in the United States, indicate that the number of colts at present being produced on farms of the United States is greatly below replacement requirements."

If the schedule is sent to a new reporter, with the intention of asking regular cooperation, it has been found effective to accompany the first schedule by a more detailed letter concerning the service, with the request that he return a blank on which he agrees to make regular reports whenever he is in position to supply the information requested. This agreement to cooperate, which the farmer is asked to sign, should not be made too binding. It should read "I will attempt to" or "I will try to comply with the request," etc. If made too binding, or in the form of an absolute promise, he may balk at signing it.

The inquiry should be brief, or give an appearance of brevity, particularly to new reporters or to intermittent reporters. The regular inquiries which are sent out each month from the Division of Crop and Livestock Estimates in some cases contain as many as fifty questions, but the horizontal type of schedule utilized by the Division gives an appearance of brevity.

The experience of the Division indicates that it is impossible to secure accurate answers to questions which are too closely defined. The reporter becomes involved in the phraseology of a closely defined question and often misinterprets the question by placing emphasis upon one particular phrase rather than upon the question as a whole.
For example, if information is desired concerning the number of milk cows, and too exact a definition is attempted, although this definition may coincide with the framer's definition of a milk cow, it may differ entirely from the reporter's definition of a milk cow, and the answer may or may not be given in terms of the framer's definition. Our experience has indicated that it is much easier to interpret the results of an inquiry expressed in general terms, making adjustments from collateral data, than to attempt to pin the farmer down to the framer's concept.

If the farmer finds the first question involved or difficult, he is inclined to stop immediately and not make a return. Acreage schedules of the Division, for example, invariably ask the farmer to report the total acres of all land in his farm first, because of all acreages on his farm the acreage in the whole farm is the easiest for him to determine.

Within the limitations of asking easy questions first, the most important question should be asked next, since our experience indicates that the first questions are ordinarily answered most accurately and most thoughtfully. If the most important crops in a given State are corn, wheat, and oats, these should be asked next to the total land in farm question.

Similar questions should be grouped together. Experience in editing schedules indicates that where the order of the questions compels the reporters to shift their line of thought from numbers to prices and back again to numbers, they become confused and in some cases report prices where numbers were asked, and vice versa.

Each question should be complete in itself, with no qualifications on the question implied from a previously stated group heading. If a repetition of the implied heading results in an involved question, the question should be simplified. This is one of the commonest faults in schedule preparation, against which even experienced schedule makers have constantly to guard themselves.
An example of such a schedule is shown below:

    1. Acres of all land in farm.
    WINTER WHEAT
    2. Acres planted on your farm last fall.
    3. Acres harvested on your farm this year.
    4. Acres planted on your farm this fall.

To the schedule maker it is obvious that questions 2, 3, and 4 all relate to winter wheat, but the results of the inquiry invariably show a considerable percentage of farmers who report question 2 accurately as relating to winter wheat, and questions 3 and 4 inaccurately by reporting acreage in all crops. In this particular inquiry it is entirely possible, because of the nature of the questions, to edit and discard those schedules whereon questions 3 and 4 are incorrectly answered, but many schedules similarly box-headed do not permit of such editing and the results correspondingly contain an unknown amount of inaccuracy.

While it is essential that as few questions be asked as possible, in some cases where there is a possibility of a misinterpretation it is necessary to include an inquiry on subject matter concerning which no information is desired, in order that the question concerning which information is desired be answered properly. An example of such a question is found in the pig surveys, where it is desired to secure a comparison of the number of sows farrowing on June 1 this year and on June 1 last year, and also the average number of pigs per litter this year, the number of pigs per litter last year having been already determined by this same inquiry a year ago. The logical order for the three questions is as follows:

    1. Number of sows which farrowed on your farm between December 1, 1927, and June 1, 1928.
    2. Total number of pigs saved on your farm between December 1, 1927, and June 1, 1928.
    3. Number of sows which farrowed on your farm between December 1, 1926, and June 1, 1927.
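The use to which the three pig survey questions are put can be made concrete with a little arithmetic. The Python sketch below is only an illustration: the function name `pig_survey_ratios` and the sample figures are hypothetical, and it assumes one litter per sow farrowed, which the text does not state.

```python
def pig_survey_ratios(sows_this_year, pigs_saved_this_year, sows_last_year):
    """Return (pigs saved per litter this year, per cent change in sows farrowed).

    Assumption (not from the source): each sow farrowed counts as one litter.
    """
    if sows_this_year <= 0 or sows_last_year <= 0:
        raise ValueError("sow counts must be positive to form the ratios")
    pigs_per_litter = pigs_saved_this_year / sows_this_year
    sows_change_pct = 100.0 * (sows_this_year - sows_last_year) / sows_last_year
    return pigs_per_litter, sows_change_pct


# Hypothetical answers to questions 1, 2 and 3 on a single schedule.
per_litter, change = pig_survey_ratios(
    sows_this_year=10, pigs_saved_this_year=60, sows_last_year=8
)
print(f"Pigs saved per litter: {per_litter:.1f}")
print(f"Change in sows farrowed: {change:+.1f} per cent")
```

If question 3 is answered with pigs saved instead of sows farrowed, the second ratio becomes meaningless, which is why the wording and order of the questions matter.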
Because of the juxtaposition of the inquiries, it was found that a considerable inaccuracy was involved because some farmers in answering question 3 gave not the number of sows which farrowed but the number of pigs saved. Accordingly, it has been found necessary to include a fourth question on the number of pigs saved: "Number of pigs saved from all litters farrowed on this farm between December 1, 1926, and June 1, 1927."

Because of the selectivity of sampling, which is discussed elsewhere in this report, it has been found expedient to request certain judgment replies to inquiries which themselves throw some light upon the skewness of the sample. For example, a farmer reporting the production of apples on his farm this season and last season may show some increase in production this year over last year. A judgment inquiry as to production in his locality compared to last year might show a material decrease. The apparent disagreement between the two series might lead the farmer to volunteer the information that his orchard is made up largely of Baldwins, which had an off-season last year and an on-season this year.

Many of the inquiries of the Division are made regularly each month during the growing season and year after year. Experience in editing returns to inquiries has proved the value of using identical phraseology in successive inquiries in order that comparability may not be destroyed. Even the use of bold type for setting out the important part of the question is carefully guarded against change. Experience has shown that the answers to the question "Number of Pigs Saved" are not exactly comparable to the answers to the question "Pigs Saved."

A judgment inquiry concerning the proportion of the cotton crop ginned to a specified date has been carried in this order on the schedules sent to one list: "What percentage of the total cotton crop in your locality has been ginned to December 1?" "What percentage of the total cotton crop in your locality has been picked by December 1?"; and on another schedule in reverse order. The results of the two inquiries vary by a number of per cent, and the variation tends to be constant.

In its work of following up the schedules sent to a reporter, the Division has found the most effective follow-up method to consist in sending a duplicate schedule with "Second Request" stamped directly across the face of the schedule in the upper left-hand corner. In some cases the returns to such a follow-up have actually exceeded the original return. The use of a special letter to accompany the second request has not proved nearly as effective. The most effective method of securing faithful performance on the part of regular crop reporters has been a mere slip of paper on which is printed in large type "WE MISSED YOUR REPORT LAST MONTH." The appeal of this brief phrase has exceeded in effectiveness all circulars, form letters, or personally signed letters so far devised by the Division in maintaining the cooperation of its regular reporters.

Timeliness of the receipt of the report by the reporters has been found to be extremely important in securing returns. The most effective results are secured when the report is placed in the farmer's hands about two days before the date to which the report is presumed to relate. A report concerning the condition of crops on July 1 is more largely returned if received by the farmer about the 27th or 28th of June than if actually received on the first of June. There appears to be a disposition on the part of farmers, and this undoubtedly is a trait of human beings in general, to put off making a reply immediately upon receipt of an inquiry. If, however, the reply is put off for more than two or three days, the tendency is to forget it entirely. A questionnaire which bears no date on its face has also been found to give indifferent returns.
It is, therefore, apparent that all the reports calling for information as to current conditions should be dated, and a statement included as to when the reports should be filled in and returned.

The general practice followed in the Division in formulating a new schedule is to put down all the questions which seem essential to the inquiry and then begin a process of elimination to reduce the number of inquiries and the number of words used in each specific inquiry to a minimum.

If there is some doubt that the questions asked will be understood and answered accurately by farmer reporters, it has been found very fruitful of good results to send out a trial inquiry several months before the contemplated date of the full inquiry to various typical sections of the United States. A careful analysis and close study of the individual reports is then made of these trial returns. Repeated trials of this nature have indicated that misinterpretations vary between sections of the country, due to variation in farm practice in different parts of the country and also to slight variation in the local use of certain phraseology. For example, "pigs saved" in Iowa apparently means the number of pigs which were raised to weaning age, but "pigs saved" along the fringe of the Corn Belt means pigs which the farmer raised and did not sell to his neighbors for fattening.

Sampling

It is generally recognized that questionnaires obtain returns from a select list of correspondents. The frequency distribution of sizes of farms given in Table 1 in the section on the survey method probably gives as good an illustration of this as one could want. The questionnaire farms averaged 157 acres as compared with 128 for the survey and 65 for the census. The two groups of smallest-sized farms were almost entirely left out of the questionnaire returns.
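The degree of selection in the acreage figures just quoted can be put as a simple ratio of each group's average farm size to the census average. The short Python sketch below does only that arithmetic with the three averages given in the text; the function name `size_ratio` and the dictionary labels are illustrative and are not part of the original study.

```python
# Average farm size in acres, as quoted in the preceding paragraph.
AVERAGE_ACRES = {"questionnaire": 157, "survey": 128, "census": 65}


def size_ratio(group, base="census", averages=AVERAGE_ACRES):
    """Ratio of one group's average farm size to the base group's average."""
    return averages[group] / averages[base]


for name in ("questionnaire", "survey"):
    print(f"{name}: {size_ratio(name):.2f} times the census average")
```

A ratio well above 1 is a warning that the returns come mostly from the larger farms, not a measure of how far any other item in the returns is biased.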
It is usually assumed that because of this the data obtained from questionnaires have little authenticity. Such a statement presumes that a study can have only one objective, namely, to furnish an accurate description of conditions. It has been made clear in an earlier discussion that mere description is a low order of research. If the questionnaire returns could be depended upon as accurate, they undoubtedly would lend themselves to a great deal of research in which the objective is the analyzing of certain sorts of relationships, which may be nearly as authentically revealed in a group of the larger and better farms as in a well distributed sample. Another objective of research is to measure change, and it is probable that this is represented more nearly correctly by a comparison of the same sample at two periods, even though it may be biased, than are the absolute conditions at any given time. However, one cannot be sure that the bias does not extend in some degree to changes and relationships; hence one is always more or less on uncertain ground with a biased sample.

In the list of uses of the questionnaire method outlined by Professor Pope are a number which do not require a representative sample even to the extent that the two just mentioned do. An inquiry directed to discovering what leasing systems obtain in a certain territory, or what land clearing practices are followed in a timbered region, or what practices are followed by the trade in handling certain commodities, does not require a highly representative sample.

It is now becoming more and more realized that various sorts of data must be kept up-to-date. Increasing use is therefore likely to be made of questionnaires to check up on changes of various kinds in the details of farm practices, in varieties of crops planted, in dairy cow rations, in crop rotations, and the like.

Mr. S. W.
Mendum of the Bureau of Agricultural Economics has made a comparison of the summaries of 101 farm account books for 1926 in the state of Ohio with 493 questionnaire returns from Ohio farmers that year. Unfortunately, the samples are so much different, the book records being mostly confined to a few counties, that no significant comparison can be made. The questionnaire returns showed a wider range than did the book records, 8 per cent of them being below zero, and 4 per cent of them above $5,000.

Mr. Mendum speaks of the difficulty over the inventory returns in the mailed questionnaires. He says: "Few of the corresponding inventories for the same farms, which should be identical, are even approximately the same, though the average works out somewhat closer than the average of all farms in the state samples. We have accepted increases each year on the theory that farmers have really been adding to inventory yearly since the depression years through better crops, improvement in prices and expenditures for equipment; but this cannot go on indefinitely without a more substantial verification." The book records showed an increase in inventory from 1926 amounting to $253 compared with $162 for the mailed schedules; hence, the Ohio book records threw no light on this problem.

Another item in doubt in the mail inquiry is "miscellaneous expenses not otherwise reported above," which is too often zero or nominal. For the mailed questionnaires the miscellaneous expenses averaged $186, or 16.6 per cent of the total expenses, compared with $424, or 27 per cent, for the book records. Mr. Mendum's general conclusion was that this comparison did not change his general opinion "that the showings of the mail schedules are reasonable and conservative, and an undetermined degree higher than a census average."

Table 1 compares the returns on the survey and questionnaire farms in North Carolina.
Although the questionnaire farms are two and a half times as large in acres, both their receipts and expenditures were a little less. Evidence later presented indicates that the tax figures are nearly accurate in the questionnaire returns. Obviously, therefore, the questionnaire farms are nowhere nearly as much larger than the survey farms as their acres would indicate. Even allowing for this, however, the questionnaire averages seem to be too low. The discrepancy is most notable in the two miscellaneous items.

Table 1.- Comparison of Questionnaire and Survey Returns, North Carolina, 1927.

                                           Survey    Questionnaire
    Receipts
      Crops                                $1,339        $1,105
      Livestock & livestock products          234           378
      Miscellaneous                           101            26
          Total                            $1,674        $1,509
    Expenses
      Livestock bought                         33            78
      Feed                                     71            82
      Fertilizer                              208           188
      Machinery                                20            47
      Improvements                             83           102
      Taxes                                    97           115
      Other expenses                          194           136
      Labor                                   376           296
          Total                            $1,077        $1,048
    Value of farmer's and family labor     $  912        $  384
    Size of farms, acres                       65           157

After the survey had been made, it was discovered that seven of the 1115 farmers had sent in a return on the mailed questionnaire. Table 2 puts six of these returns in deadly parallels. The seventh one is omitted because the farmer reported only 60 of his 260 acres in his questionnaire return.
The sixth one listed was discarded in the editing of the questionnaire returns because of the low valuations on the real estate. The survey disclosed that all that happened was that a cipher was omitted. The tax figures are exactly the same in only one case, but nearly the same in two others. The estimates as to the value of the food furnished by the farm and consumed by the farm family are widely apart. Total receipts and total expenditures are quite far apart in most cases and the individual items are even farther apart. The real estate valuations are widely apart in some cases. It is difficult to believe that this is not an exceptional set of parallel figures. They are included because they are suggestive rather than conclusive.

Table 2.- Comparison of Survey and Questionnaire Return on Six North Carolina Farms, 1927.
(Sur. = survey; Qu. = questionnaire; ? = figure not legible)

                          A             B             C             D             E             F
Item                   Sur.    Qu.   Sur.    Qu.   Sur.    Qu.   Sur.    Qu.   Sur.    Qu.   Sur.    Qu.
Receipts
  Total                   ?  9,700  2,034  2,120    154    250  5,419  2,867    258    700      ?    500
  Crops                   ?  9,400  1,978  2,010     10     50  5,311  2,715    258    600      ?      ?
  Livestock              50    200     30     40     56    100     45     62     --     --      ?      ?
  Livestock products    113    100     26     40     88    100     63     90     --    100     --      ?
  Miscellaneous          --     --     --     --     --     --     --     --     --     --     --     --
Farm Expenses
  Total               6,154  5,234  2,981  1,863    348    207  3,392  1,287    276    410    560    500
  Labor               2,937  2,150    925    896     15     25  1,485    150     64    150     80     --
  Livestock bought      100    225     20     20     --     --     10     --     --     --      6     --
  Feed bought           205    115     96     75     17     10     87     10     40     20     31     --
  Fertilizer bought     863    925    569    362     33     30  1,050    650     60     60    130     --
  Seed bought            39      5     28      8     10     20     48     --      2     --     21     --
  Taxes                 421    545     72     72     56     72    187    190     77     75    165     --
  New machinery          20    120    111     20      ?     15     55     55     --     --     --     --
  New buildings         225    600    425    200     60     25    150    200     --     --     --     --
  Interest              582    425    315    360     --     --     30     --     --    105     --      ?
  Miscellaneous         762    125    420     50     57     10    292     32     33     --    127    400

[The remaining rows of Table 2 (inventory at beginning and end of year for crops, livestock, machinery, supplies and real estate; food from farm; farmer's and family labor; acres) are not legible.]

[An untitled frequency table follows in the original, giving per-acre cost, price, yield and rent items with the number of reports at each value; its entries are not legible.]

(5) DATA FROM REPORTERS.

One of the important pending developments in research in agricultural economics is the expansion of our machinery for collecting data of current changes and analysing them. The only agency that has had extensive experience with collecting such data is the Division of Crop and Livestock Estimates of the U. S. Department of Agriculture. Its wealth of experience in this field would require a year's work to exploit and a book to present. All that can be done here is to indicate the high points.

The following excerpts from an article by Mr. W. F.
Callander, Chairman of the Crop Reporting Board, under the title "The Crop Reporting Service of the United States Department of Agriculture," outline the principal features of the methods of obtaining the data:

(1) The township and county reporters, comprising one farmer reporter for each agricultural township in the United States, number slightly more than 40,000. Each township reporter is asked to report monthly for the crops and livestock in his neighborhood, using printed schedules, which, after being filled out, are mailed directly to Washington and are there tabulated and summarized. In arriving at state averages, the township reports are weighted by districts, each state being divided into from 7 to 9 crop reporting districts. Each district is given a weight in the state total proportionate to its relative importance.

(2) Field Aids, of whom there are more than 40,000, are asked to report the agricultural conditions within their observations and knowledge monthly, forwarding their schedules to the State Agricultural Statisticians in the various States. They are distributed similarly to the township list.

Theoretically each township is represented each month by a report from the township crop reporter, and presumably another report from the field aid in the same township, the first sending his report to Washington and the second to the State Statistician. Actually many townships do not have a reporter on one list or the other, or perhaps on both, and only about a 50 per cent return can be expected any one month from either list.

(3) Special price reporters, numbering about 15,000, report monthly to the Washington office the prices received by farmers for their produce. Another list of retail price reporters, numbering about 10,000, are asked to report on prices being paid by farmers for seed, feed, machinery, fertilizers, and other equipment and supplies. Both lists are made up mostly of merchants and dealers.
(4) Mill and elevator reporters are asked to report periodically to the field offices on stocks of grain on hand. There are about 28,000 of these reporters.

(5) Individual farm reporters, numbering about 100,000, are sent schedules on which they are asked to report only for their own farms. Their reports constitute an actual farm census for their individual farms.

(6) The field offices have a similar list of 100,000 farmers which they use intermittently.

(7) Three times a year, once on acreage (September) and twice on swine and other live stock (June and December), about 200,000 reports on individual farms are gathered through the rural carriers of the Post Office Department. The semi-annual pig surveys made as of June 1st and December 1st were inaugurated in the current hog production and intentions as to future production by the individual sample method. Cards are sent to each postoffice in the United States, to be distributed by the rural carriers to the farms along their routes, and direct to star route contractors in sections where most of the rural mail is distributed by them.

Each carrier is supplied with ten cards, and is instructed to distribute these among his patrons so that he will get returns from all kinds of farms; in other words, a fair sample of the farms he serves. He is instructed to get the information from the farmer himself and fill out the card, or else to leave the card with the request that it be filled out and returned to him promptly. These cards when returned are sent by the postmasters direct to the Department of Agriculture for tabulation.
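The district weighting described in item (1) amounts to a weighted average of district figures. The following Python sketch is one way to picture it, assuming that a district's "relative importance" is measured by its acreage of the crop; the excerpt does not say how the weights are fixed, so that choice, the function name `state_average`, and every figure below are hypothetical.

```python
def state_average(district_means, district_weights):
    """Weighted state average of district-level report means.

    Both arguments are dicts keyed by district name. Weights need not sum
    to one; they are normalized here.
    """
    total_weight = sum(district_weights[d] for d in district_means)
    if total_weight <= 0:
        raise ValueError("district weights must sum to a positive number")
    weighted_sum = sum(district_means[d] * district_weights[d] for d in district_means)
    return weighted_sum / total_weight


# Hypothetical mean reported condition (per cent) in three districts,
# and hypothetical crop acreage (thousand acres) used as weights.
condition = {"northwest": 80.0, "central": 90.0, "southeast": 70.0}
acreage = {"northwest": 100.0, "central": 300.0, "southeast": 100.0}

weighted = state_average(condition, acreage)
simple = sum(condition.values()) / len(condition)
print(f"Weighted state average: {weighted:.1f}")
print(f"Unweighted average of districts: {simple:.1f}")
```

The two results differ whenever the heavily weighted districts report differently from the rest, which is the reason for weighting in the first place.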
In addition to the foregoing, a number of special lists, numbering many thousands, are maintained of reporters on special crops such as cotton, tobacco, rice, potatoes, peanuts, truck and fruit crops, who report either directly to Washington or to the field offices for the particular crop in which they are interested and concerning which they are regarded as the best source of local information.

The three major problems with which the Bureau of Crop Estimates has to deal are bias in the reporter's estimates or judgments, selectivity in the sampling, and fluctuations in the sampling. Let us consider them in the same order. Mr. C. F. Sarle's* discussion of bias in estimates follows:

* Division of Crop and Livestock Estimates.

Bias in Estimates

Where the farmer's chief source of income is from one product, he is likely to be ultra conservative in his estimates of yield per acre or condition. This type of bias occurs in a varying degree from year to year and it is not safe to assume that it is constant. To some extent it is offset by the fact that our reporters are usually better farmers than the average.

This ultra conservative type of bias with yield-per-acre samples may be much more important in a year of heavy production, when prices are very low and there is a general dissatisfaction among farmers because of the low price. This kind of bias might be called price conscious bias. In the fall of 1926 the price of corn was low and there was considerable newspaper publicity concerning large crops and low prices. The corn yield reports from Iowa were fully two bushels below the yield of corn reported to the assessor in the late winter and early spring.

The last few years it has been our practice to send out a cotton yield inquiry on March 1st asking for the yield of cotton for the last crop.
There seems to be much less conservatism bias in this inquiry than in those made earlier in the season, as invariably the yield is reported higher than on December 1st or earlier in the fall. Apparently, as the crop leaves the farmer's hands, there is a tendency for the reported yields to be higher. With grain crops in Iowa, the opposite tendency is usually evident, due perhaps to shrinkage, or perhaps to failure to remember, which may be called memory bias.

Mr. John B. Shepard has summarized the experience of the Division with memory bias as follows:

1. When there has been but little actual change from one date to the next, there is ordinarily relatively little memory bias in reports on most items, such as horses, cows, acreage of corn, wheat, etc. While there are many individuals who do not report the correct figures for the preceding date, the number who report too many is largely offset by the number who report too few. Ordinarily the memory bias on wheat, hay, horses, or cows is less than 1 per cent.

2. When a considerable period of time has elapsed, the errors of those remembering too many are much less exactly offset by the errors of those remembering too few. Thus, during 1927 the number of cows in New York State increased slightly, and instead of the usual memory bias, half of one per cent too many, reporters remembered 1.8 per cent too many. During the same year the number of horses declined, and in reporting for the previous year, crop correspondents remembered 2.1 per cent too few.

3. Memory bias is much greater in items of minor importance, or which change materially from month to month. For example, in reporting poultry on hand a year previous, crop correspondents apparently remember too few by 6 or 8 per cent. In the case of catch-crops, or others which are grown only occasionally, the memory bias is often heavy.

4. Memory bias is much heavier in the case of items for which there is no exact definition.
For example, the memory bias on all corn as a whole is probably between 1 and 2 percent in New York, but the memory bias on corn harvested for green feed or fodder only may exceed 20 percent. The bias on oats is usually about one-half of 1 percent, but the memory bias on grains cut green for hay is probably around 15 percent.

5: On certain items there are certain twists of memory which are so frequently encountered that little reliance can be placed on the memory of crop reporters. Among these are such items as dairy heifers and other young stock, including colts, lambs and pigs.

If there has been a freeze or a hail storm that has caused serious injury to a crop, the reports from growers frequently over-estimate the damage. If weather conditions are particularly discouraging, the returns are likely to reflect this temporary discouragement. Field travel and personal contacts in producing areas are essential if weather bias is to be allowed for currently.

The most significant aspect of this discussion is the tendency of the bias to vary from year to year, which might be discussed and illustrated at greater length if space permitted.

Selectivity of the Sampling.*

One of the most important uses of sample data is in connection with the determination of acreage. A brief examination of the returns from almost any inquiry sent to the crop reporters of the Division at once convinces us that the farmers who report to us are not representative of the entire group. The individual farms reporting are larger in all states: Iowa, 288 acres as compared with 157 for the census; Alabama, 322 acres as compared with 76 for the census; Wisconsin, 155 acres as compared with 117 for the census. Along with differences in size are other differences which may be summarized as follows:

1: The sample generally shows a higher percentage of farm land in crops.

2: The sample generally shows more head of livestock per farm.
3: Acreage samples generally show a greater percentage of feed and hay crops than the census. In other words, there is distortion within the sample.

4: Rural carrier surveys show less of this lack of representativeness than mail returns.

5: Samples often fail of geographical representativeness: a disproportion of returns on irrigated farms; too few larger plantations from the delta cotton region; too few from important tenant areas; too few from marginal farming areas.

6: It can be argued a priori that condition reports are more representative than almost any other data with which we deal. The factors causing deterioration in crops to a considerable extent prey alike upon the good and the poor farmer. The better farmers making up the crop reporters of the Division therefore represent all farmers insofar as condition reports are concerned. It must be remembered further that our condition reports are interpreted into yield per acre on the basis of past relations of condition to yield. If there be a constant bias, it is cancelled out on both sides of the equation.

7: The census gives a variable check on yields, but generally our reported judgment yields exceed census yields somewhat.

8: Yields on the individual farms reporting usually exceed those based on judgment reports.

* Mr. J. A. Becker, Division of Crop and Livestock Estimates.

Mr. Sarle mentions some other interesting aspects of this same selectivity:

Any schedule that relates to a particular crop tends to bring out replies from men who are particularly interested in this crop. It is frequently desirable to have questions of general interest as a "lead" on the schedule in order to secure a more representative sample.

The acreage surveys are highly selective because of the length and difficulty involved in filling out one of the schedules, especially when acreage for both this year and last year are requested.
Only a high type of farmer is capable of filling out some of the complicated inquiries sent out by some divisions of this Bureau.

If there is wide-spread propaganda to reduce cotton acreage, the replies to an acreage inquiry are likely to come from the public-spirited farmers who are actually reducing their acreage, while the others fail to report. This is frequently a very serious source of bias.

"It is fortunate," says Mr. Becker, "that our first attempts to utilize sample data in the determination of acreage in crops and numbers of livestock were in connection with the change from year to year. Checking of the percentage of change as shown on the two-year report against such annual enumerations as are available indicates that while the individual farm reports are not usable as absolute ratios, they are usable as showing the change from year to year. In general, there is strong evidence that the reporter's change is rather representative of the change of the entire group. Certain measurable biases have been noted in the data, e.g., a general plus bias in the Northern states, and minus in the Southern states, particularly with reference to cotton. There is a possibility that the change shown by certain size groupings may better represent the change of all farms than does the entire sample. A study of this nature is now being made in the Washington research section."

Fluctuations in Sampling.*

The question frequently arises in the minds of research workers as to whether the general theory of sampling as developed by Karl Pearson, Yule and Bowley, is really applicable to the sampling work as done by the Division of Crop Estimates.

In any given sample we have dispersion by various factors, such as errors of observation, geographic differences, climatic differences, etc.
The dispersion in the farm price sample of cotton in a given state is caused by errors of observation, differences in price due to differences in quality and grade, differences in price caused by distance to market, as well as differences in local competition. When the whole United States is being sampled, the range in price for any one of these factors may be very large.

The distinction between inquiries made periodically and surveys made only once or twice, and then at no regular intervals, is also important. With periodic inquiries it is necessary to build up and maintain a list of correspondents or reporters to whom the inquiries are sent. With periodic inquiries we are frequently fully as interested in the comparative results of the two samples as in the results of the inquiry itself. The very fact that we have the same list of correspondents reporting for successive inquiries tends to strengthen the validity of the comparison between the samples. With both the monthly farm prices from price reporters and monthly crop reports from farmers, it is not at all unusual to have 50 to 60 percent of the returns in a given month from reporters who had replied for the month previous.

The type of frequency distribution obtained with a given sample affects the dependability of the average to some extent when the distribution departs materially from a reasonably normal or symmetrical distribution. Fortunately a U-shaped distribution seldom occurs, but a J-shaped distribution is characteristic of the "grain stocks on farms" inquiry at the end of the crop year, and of acreage abandoned. Condition of a crop in percent of normal is seldom reported much above 100 percent; and as a result an average condition of 90 or higher is likely to be obtained from a sample that is badly skewed, with the "tail" toward the lower values and only a few reports above 100. With such
badly skewed distributions, it would probably be better if we could use some other measure of the central tendency of the frequency distribution than the arithmetic mean, but the time allowed for getting out a report prohibits the use of any other. This tendency of the average to depart from the mode is a serious matter with data that include a large proportion of zeros, and should be taken into consideration in connection with the use of such averages for making estimates.

The sampling method adopted by the Division of Crop and Livestock Estimates is not that described in the textbooks as simple sampling, but instead the stratification of the universe suggested by Yule and Bowley. We stratify our universe into counties and crop reporting districts and draw reports (observations) from each county. As a result, the fluctuations of sampling are less than if we trusted to pure random sampling with our present facilities. In building up a list of reporters on crop yields, we endeavor to have one reporter in each township in the state, at least several in each county. This is much better than if we took the names of a few farmers out of a hat. It avoids the "bunching of reports" that accidentally occurs with small samples and insures a better geographic representativeness in our sample, and in addition reduces the probable error of the mean.

* The rest of this section is abridged from a report by Mr. Sarle.

Not only do we stratify our universe on a county, and frequently on a township basis, but we group several counties into a crop-reporting district. The crop-reporting district average is then weighted by the relative importance of the particular crop in that district. In some states, such as California, it is necessary to weight by counties because the conditions vary greatly between counties.
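
The weighting of district averages described above can be sketched in a few lines. This is only an illustration under assumed figures: the district names, the reported yields, and the acreage weights are hypothetical, and the function name `weighted_state_average` is ours rather than anything used by the Division.

```python
def weighted_state_average(district_means, district_weights):
    """Combine district averages into one state figure.

    Each district average is weighted by the relative importance of the
    crop in that district (here, a hypothetical acreage figure).
    """
    total_weight = sum(district_weights[d] for d in district_means)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(district_means[d] * district_weights[d] for d in district_means)
    return weighted_sum / total_weight


# Hypothetical yield-per-acre reports (bushels), grouped by crop-reporting district.
yield_reports = {
    "north": [38.0, 42.0, 40.0],
    "central": [30.0, 34.0],
    "south": [20.0, 24.0, 22.0, 22.0],
}

# Hypothetical crop acreage in each district (thousands of acres), used as weights.
acreage_weights = {"north": 500.0, "central": 300.0, "south": 200.0}

# Step 1: a simple average of the reports within each district.
district_means = {d: sum(r) / len(r) for d, r in yield_reports.items()}

# Step 2: weight the district averages by the importance of the crop there.
state_average = weighted_state_average(district_means, acreage_weights)

# For comparison: the figure obtained by throwing all reports together unweighted.
all_reports = [y for r in yield_reports.values() for y in r]
unweighted_average = sum(all_reports) / len(all_reports)

print(f"weighted state average:   {state_average:.2f}")
print(f"unweighted average:       {unweighted_average:.2f}")
```

In this made-up case the southern district supplies the most reports but the least acreage, so the unweighted figure is pulled toward its low yields, while the weighted figure is not.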
To the extent that our crop-reporting districts are more homogeneous, that is, show less dispersion, we are able to reduce somewhat the influence of the fluctuation of sampling by this weighting process. The practical application of this would lead to a special districting for each important inquiry in a given state. These districts might easily resemble the "gerrymander" districts of politics. We are already initiating a program of re-districting the various states for the farm price work. This work has already been done in Colorado and is under way in North Dakota.

With the highly variable acreage samples, it is desirable to group our farms on the basis of the size of the farm, using the census classification in order that farm area acreage weights may be available for weighting our ratios. The same is applicable with our inquiries concerning the numbers of livestock on individual farms. In the state of Washington it was possible to re-district the state into three reasonably homogeneous areas.

The acreage schedules are further grouped on the basis of the size of farm. As a result of this size-grouping, or stratification, the dispersion of the size-group as compared with the district group was decreased almost enough to offset the fact that there were only about one-fourth as many farms in the size group as in the district. The problem of districting a state for a special inquiry is of vital importance in the future development of this sampling work. Wherever possible, homogeneous areas should be made into crop-reporting districts. It will be necessary to increase greatly the size of the sample in the more heterogeneous areas of the state.

A further method of stratification is to divide our question into several subdivisions, the averages of which can later be combined into a figure for the whole by the use of special weights. Apple prices by varieties and grades would illustrate this kind of stratification.
Data of quantity of sales by varieties and grades would be needed as weights with the prices by varieties and grades if an average price for all apples sold was to be obtained.

A test of how well either geographic stratification or question-stratification really reduces the dispersion is to compute the standard deviation of the data by districts to see if it is less than when all of the data are thrown together in the usual way or when present crop reporting districts are used. Question districting has worked splendidly with the farm prices of beans and peanuts.

Since a great deal of our sample data is used in a relative sense and the difference between two averages is really more important than the averages themselves, the probable error of the difference or of the ratios becomes of prime importance. The actual computed figure for probable error of a difference or ratio can be materially reduced by the use of paired reports or identical reporters. The more identical the sample and the larger the positive correlation between these reports, the less the computed figure for probable error of the difference. The actual significance of this, of course, is one thing when change is being sampled and another when absolute conditions are being sampled. Where averages are being used for comparative purposes, the thing to do is to increase the number of paired reports from identical correspondents for any two consecutive inquiries.

We can always decrease the probable error of our sample by increasing the number of reports. This is very important if we are attempting to make averages for small states or for areas smaller than a state.

Dispersion.

To what extent the sampling methods outlined meet the conditions of simple sampling and validate the use of the measures of sampling based on the assumptions of simple sampling, it is not possible to say.
It seems reasonable to assume that they justify one in believing that dispersion and the size of the sample, the two components of Probable Error of the Mean, affect the representativeness of the sample. There is a great difference in the dispersion of sample data depending on the kind of inquiry, whether it be a sample of farm prices, yield per acre or acreage. It is important that the statistician know within what limits he can expect the dispersion of his sample data to fall. The greater the dispersion, the larger the number of observations or reports needed to produce a given probable error of the mean. To illustrate, a sample of farm prices of wheat in Kansas might easily have a coefficient of variation as low as 5 percent. With this dispersion, assuming the probable error formula as entirely appropriate, a sample of 46 reports would result in a relative P. E. of the mean of 0.5 percent. This is a highly stable average and not materially affected by the fluctuations of sampling. If our sample had a dispersion of 30 percent, to give us the same relative P. E. of the mean of 0.5 percent, it would require about 1,638 reports. With a dispersion of 50 percent, about 4,550 reports would be needed; with 100 percent dispersion, over 18,000 reports would be required, and with 300 percent dispersion, over 163,000 reports would be needed. Unfortunately, however, there is a very definite limit to the number of reports that it is practical to obtain by anything short of a complete enumeration.

The farm price samples by states have the least dispersion of any of the inquiries made by the Division, ranging between 5 percent or less and 10 percent with farm products that are sold from surplus-producing states and on well organized markets, such as cotton, wheat, corn, flax, and hogs in surplus-producing states. Apple prices are the most variable, with a dispersion seldom less than 30 percent and running as high as 40 percent.
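
The report counts quoted above for a relative probable error of 0.5 percent can be checked with a short calculation. The text does not write out the formula it used; the sketch below assumes the customary one, in which the probable error of the mean is 0.6745 times the standard deviation divided by the square root of the number of reports, so that the relative P. E. is 0.6745 times the coefficient of variation over the square root of n. The function names are ours and purely illustrative.

```python
import math

# Assumption: probable error = 0.6745 x standard error (normal-curve constant).
PE_FACTOR = 0.6745


def relative_pe_of_mean(cv_percent, n_reports):
    """Relative probable error of the mean, in percent of the mean."""
    return PE_FACTOR * cv_percent / math.sqrt(n_reports)


def reports_needed(cv_percent, target_pe_percent):
    """Smallest whole number of reports giving a relative P. E. at or below the target."""
    return math.ceil((PE_FACTOR * cv_percent / target_pe_percent) ** 2)


# Coefficients of variation mentioned in the text, with a target relative P. E. of 0.5 percent.
for cv in (5, 30, 50, 100, 300):
    n = reports_needed(cv, 0.5)
    print(f"dispersion {cv:>3} percent -> about {n:,} reports")
```

Under that assumed formula the calculation gives 46, 1,638 and 4,550 reports for dispersions of 5, 30 and 50 percent, and somewhat more than 18,000 and 163,000 for 100 and 300 percent, which agrees with the figures in the text. Because the required number grows with the square of the dispersion, a sixfold increase in dispersion calls for roughly thirty-six times as many reports.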
The dispersion of the retail price samples collected by the Department is seldom less than 8 percent, even with staple products such as sugar and flour, and only with commodities such as socks and limestone does the coefficient of variation exceed 30 percent. Farm wage samples show a dispersion ranging from about 25 to 40 percent in most states.

The monthly condition figures may have a dispersion as low as 10 or 15 percent, and on the other hand as high as 30 or 40 percent in some states. Even in the state of Washington, where there is a high degree of geographic variation, the coefficient of variation for winter wheat in both 1926 and 1927 held at approximately 15 percent. Since condition is expressed in percent of "normal" condition or normal probable yield per acre, and since each locality can have a different "normal" concept, it is to be expected that the condition samples would show much less dispersion than yield per acre samples; and that is the case. Yield per acre samples seldom have a coefficient of variation less than 25 percent, and frequently the dispersion reaches 90 or 100 percent, or even more, as has happened in Oklahoma, where the cotton yields may vary from a few pounds to several hundred pounds per acre.

Land value figures show about the same range of dispersion as yield per acre samples, as they may be as low as 25 percent in Iowa and as high as 100 percent in California and some of the other western states.

Acreage samples show the greatest dispersion of all sample data unless it be numbers of livestock on farms. The dispersion of the acreage per farm of the different crops varies from as low as 60 percent in Maryland to over 300 percent in the state of Washington.

Size of Sample.

About 7,000 farm price schedules are now tabulated each month from special price reporters, a list made up primarily of small town merchants, elevator managers, and other locally well informed business men.
In the average sized state, the number of reports tabulated is seldom less than 100 to 200, and is even more in the larger states. The number of reports is adequate for the more important farm products and reasonably satisfactory even with the price samples having the greatest dispersion. Where the product is produced in surplus quantities, there are usually enough reports to render the average reasonably stable and satisfactory. The areas of surplus production have the heavy weighting in computing both the state and United States averages of farm prices.

The regular monthly schedule to crop reporters is sent out practically in duplicate to the two different lists of correspondents. The number of schedules tabulated from each list is seldom less than 100, even in the small states such as South Carolina, and for the larger states, such as Iowa or Illinois, more than 1000 or 1200 schedules are tabulated each month from the two lists taken together. It is only with the really small states, such as Delaware, some of the New England States, and the less populous Western states, such as New Mexico, Arizona, and Wyoming, that the returns become so small as to seriously affect the reliability of condition estimates, farm wages or the December 1st farm prices of crops, and the January 1st prices of livestock. In the larger states, the returns are sufficient for reliable state averages of yield per acre.

It is not until we take up the consideration of individual farm acreage and livestock inquiries that the fluctuation of sampling becomes an extremely difficult problem. It is physically impossible in the far western states, where conditions are extremely varied, to secure a sample of adequate size for acreage or live stock. In fact, except in the Corn Belt states and in some of the Southern states, the rural carrier survey does not bring in enough returns to render the ratios dependable as an indication of change.
The fluctuations of sampling overshadow the bias that may be inherent in the returns because of the selectivity of the sample. The suggested remedy for this is an annual census of an identical sample of townships. This sample census program is already under way in Alabama and it is hoped that it will be possible within a few years to continue with this program in all the states. It will probably be necessary to supplement the sample census with complete enumerations of those particular crops which are grown intensively in limited areas within a state. This will apply especially to the fruit and truck crops. Our present methods of handling fruit and truck crops are inadequate from a sampling standpoint.

To summarize, the sampling experience of this Division indicates that with most of the price inquiries of the Department, the fluctuation of sampling is not a serious matter. With yield per acre data, it is a matter that must be given consideration. With individual farm inquiries of acreage and numbers of livestock, the fluctuation of sampling is a definite limitation to our present methods, and we are suggesting as a proposed remedy the sample census of identical farm areas. A sample of from 5 to 10 percent of the townships in a state would probably be necessary with such extremely variable data as those of acreage and numbers of livestock.

While county estimates are highly desirable from the standpoint of workers within a given state, they are not practicable from a statistical standpoint without a very considerable increase of expenditure and time. The more variable the sample data, the less feasible are county or district estimates. With a substantial increase in the size of the farm price sample, district prices within the state may be feasible for the more important farm products. Grouping several counties into reasonably homogeneous economic districts would help in this direction.
There should not be more than ten districts in even our larger states because of the difficulty of securing a sample of adequate size with yield per acre, acreage and livestock samples. It would even then be necessary to limit the district estimates to the more important farm products.

Checks.

Even assuming that the usual measures of fluctuation in sampling are adequate, no amount of analysis of the sample itself will enable one to measure the amount of bias in that sample. It is only when check data are available that any measurement of bias is possible. Some indication of selectivity of sampling can be had from a study of the distribution of the reports geographically, but it is very difficult to go much farther than this without having some reliable check data such as carlot shipments or an enumeration of the particular universe to check against the average of the sample. Cotton ginnings, carlot shipments and the census and assessors' enumerations are examples of check data.

The data supplied by crop reporters on schedules are supplemented by other data based on personal travel and inspection of growing crops by the Agricultural Statistician in each state, on special field counts, and interviews with well-informed men throughout the state, and by telephone and telegraph when necessary. The Agricultural Statisticians also accompany their regular reports to the Division with comments on each crop in their states, explaining conditions at date of reporting and stating just what changes in the weather, soil conditions, plant diseases and insect pests, and conditions of growing crops, have occurred since the last report.

Weather Bureau reports, as well as crop reports issued by various state and private agencies, are used to supplement or confirm the information obtained by the Division of Crop and Livestock Estimates directly through its own field service.
For use in making the January 1 inventory estimates, and checking these and the pig survey and lamb crop indications, as well as giving a basis for estimating livestock production by states, a large amount of information is now being obtained as to livestock movements from and into each state. This information is obtained from two sources. One of these is from records of livestock markets, packing houses and similar sources giving the number of head of each species received from each state and the number of stocker and feeder animals shipped into each state. The other is from records of railroads giving the number of cars of different species of livestock forwarded from and received at each station in the state, from which county, district and state totals are obtained.

One of the most useful checks which have been developed in connection with wheat is that of records of receipts at the mills and elevators, and railroad shipments of grain. Because of the fact that, with the exception of the rather constant amount used for seed and feed, most of the wheat crop moves into channels of trade, the Division is able to check its estimates on this crop in a number of states.

Watching how the checks work out each year provides tests of the various measures of sampling. In practice, the effects of the bias of the reports, the selectivity of the sampling, and the fluctuations of sampling never can be clearly segregated. Empirical tests will therefore continue to be relied upon in the final decisions. Experiments with the actual data of samples, such as that of Mr. Bradford B. Smith in drawing 60 random samples out of 8000 pig survey returns, and of Mr. Sarle with different crops in Iowa, will always be the safest guide to practice. This paragraph is the most significant in this whole discussion. It warrants amplification that space does not permit.

(e) SECONDARY DATA.
The examination of secondary data should include tracing it through the same steps that one would go through collecting the data oneself. This means that one should find out what units and measures were used and how they were defined; and how the sampling was done if samples were used; also study the schedule question by question and the instructions issued in connection with it. Methods of tabulation and summarization should be looked into if possible. Tables published by public officials often reveal glaring inconsistencies if studied closely. The totals sometimes represent different numbers of farms with nothing to indicate this. If data from the same source are to be used over a period, special attention should be given to changes in units or methods of collecting or handling data.

It often proves possible to test the adequacy of such data by checking it against data from other sources. In some cases trial studies are warranted. We need greatly to have a considerable amount of census and other official data checked by re-surveys. Only thus will we discover its weaknesses. When access can be had to the original data of such sources, it can be tested by making frequency distributions of a number of random samples from it.

The best discussions available of the using of secondary data are in Crum and Patton, 15-28; Secrist, 22-46; Jerome, 506-326; Day, 589-393. Crum and Patton have a chapter on the "Compilation of Secondary Statistics."

Whole chapters could be written on the nature, limitations and methods of using the data of a single source, such as those of the Bureau of Labor Statistics. An excellent study of this type is that of Charles F. Sarle on the "Reliability and Adequacy of Farm Price Data" (U.S.D.A. 1480). C. R. Chambers discusses the accuracy of the land value data of the federal census in U.S.D.A. Bulletin No. 1224, "The Relation of Land Income to Land Values."
We need similar studies of the data of yields and production, and of prices in the wholesale markets.

The most important, and the most treacherous, source of important secondary data available to agricultural economists is the federal census. Most of the evidence points, as one would expect, to under-statement rather than over-statement. The reader should review carefully Dr. Joseph S. Davis' paper in the March 1925 "Proceedings of the American Statistical Association" and the discussion following it. Dr. Holbrook Working's revision of wheat acreage and production since 1866, in Wheat Studies of the Food Research Institute, June 1926, should be read by all, first, as evidence of the sort of errors that the census and crop estimate data contain, and second, as an example of methodology in handling such material. Some of the principal types of errors in the census data are the following:

1. Omissions, due in part to haste in taking the schedules.

2. Underestimates of total figures, such as sales, acreages, etc. Most persons err in giving totals on the side of not appreciating the size of the aggregate, just as most people are unable to see where their salary goes to each month unless they stop to itemize and add.

3. Conservatism in reporting anything that relates to values and income.

4. Failure to remember very exactly the events of the past year. When the census was taken in April, the winter wheat crop reported had been planted over a year and a half. There is usually a memory bias. For example, the cotton yields reported on their own farms by 6000 farmer crop reporters to the Bureau of Crop and Livestock Estimates in 1926 and 1927 averaged over 10 percent higher in March than in the preceding December.

5. Data from crop reporters indicate that 14 percent of the farmers move each year, and other data from the 1925 census that two-thirds of these move between December and March.
This means that the data given for those farms, especially if the census is taken in the spring, is often given by a person who was not on this farm during the year previous.

In using wholesale price data, it is highly important to know exactly how they are obtained, and the dangers inherent in the methods. Price reporting is still done on such an unstandardized basis that data reported by different markets may not be comparable. The system of quoting may give a price which is not at all suited for the purpose in hand.

Most available secondary data purports to be a complete count or record; but when it does not, it must be examined with especial care from a sampling standpoint.

(f) PREPARING DATA FOR ANALYSIS.

(1) Editing.

The principal purposes of editing work are as follows:

1. To reduce the schedules to uniformity in the matter of arrangement so that they will be tabulated without loss of time or error.

2. To complete the schedules by seeing that all totals and other entries are made, and in some cases that the missing items are supplied.

3. To check the data for accuracy and consistency of the various parts.

4. To discard some of the data in many cases.

The proper procedure in the matter of editing depends in considerable part upon the objectives of the study, whether that of making a complete count or record, as in a census, or of presenting an accurate description, or discovering and measuring relationships.

Let us consider one at a time these four objectives of editing. The best procedure in the matter of reducing schedules to necessary uniformity is for the editor to work over a considerable number of schedules to discover the changes which are needed, and to prepare a definite set of instructions to be followed. The following example of such a set of instructions illustrates the nature of the problem.

(a) The apple trees reported as removed during 1925 and 1926 should not appear under "Record of apple trees" on opposite page.
In case it is evident that they do, they should be stricken out.

(b) In case trees listed under (b) (trees to be removed) are not on opposite page they should be added.

(c) In case the variety of trees removed or to be removed is omitted, tabulate "Variety not reported" (Code 1%).

(d) If report states no trees removed, list as such. But if report has no remark or note concerning the removal of trees it should be tabulated as not reporting.

(e) Under "times sprayed" edit to whole numbers for machine tabulation; thus, if a grower reports part of orchard sprayed four times and part six times, tabulate as five times.

(f) For machine tabulation, dusting will ordinarily be counted the same as spraying. With hand tabulation, dusting may be tabulated separately if desired.

There may need to be instructions about interpretation of age reports, names of varieties, market classes, units of measurement.

Whether or not it is proper to fill in gaps in the data depends largely upon the objectives of the study. If a complete count or record is desired, there is justification at times for supplying a missing figure on the basis of the average of the group or an average adjusted to the special conditions prevailing. Occasionally only some procedure of this kind will make possible the use of the rest of an important schedule. The federal census formerly made use of this practice extensively. All recent developments, however, have been in the direction of filling in fewer and fewer gaps. Mr. Leon Truesdell in his discussion of Dr. Davis' paper on Federal Census (see 1928 Proceedings of the American Statistical Association) stated that he believed that the practice should be still further abandoned.
The great difficulty with such a procedure is that the cases in which an item is omitted are very likely to be abnormal; thus when cows are reported with no dairy products sold, the cows cannot safely be assumed to be producing an average amount of milk per cow.

If variations between farms or other units, or relationships, are what is particularly to be studied, the filling in of gaps by applying averages has the effect of obscuring the very thing wanted and reducing the amount of variation. In a study of costs of producing milk based on cow testing association figures, the costs of labor and several other elements were included on the basis of average inputs. The authors were surprised to find, when they got through, that the variations in cost between farms were much less than ordinarily supposed, and drew from this the suggestion that probably much of the talk about wide ranges in costs is not warranted.

If an accurate picture of an area or region is wanted, as in a survey study, there can be no purpose in filling in gaps, because doing so adds absolutely nothing to the picture, since, at the best, the completing is done on the basis of the averages for the rest.

Editing the schedules for accuracy is largely a matter of checking totals and seeing whether the various parts of the schedule are consistent with each other. It may be a matter of checking the opening inventory plus purchases against the closing inventory plus sales. It may involve matching valuations and acreages, or division of real estate values between land and buildings, or crop acres against sales receipts, or live stock against feed. In feeding studies, an important kind of checking for accuracy is to compare the feed consumed with feeding standards. Oftentimes it is well to go outside the data themselves and obtain checks at local creameries or elevators. Frequently the editing instructions should include points covering checking for accuracy as well as uniformity and missing items.
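The earlier point, that filling gaps with group averages reduces the amount of variation, can be illustrated numerically. The following is a minimal sketch in Python; the cost figures and the variable names are hypothetical and are not taken from any study cited here.

```python
from statistics import mean, pstdev

# Hypothetical labor cost per cow on eight farms; None marks a missing item.
reported = [38.0, 52.0, None, 61.0, 45.0, None, 70.0, 34.0]

# Average of the farms that actually reported the item.
known = [x for x in reported if x is not None]
group_average = mean(known)

# Fill each gap with that group average, as an editor might do.
filled = [group_average if x is None else x for x in reported]

# Standard deviation before and after filling the gaps.
spread_known = pstdev(known)
spread_filled = pstdev(filled)

print(f"average of reporting farms: {group_average:.2f}")
print(f"standard deviation, reporting farms only: {spread_known:.2f}")
print(f"standard deviation, gaps filled by average: {spread_filled:.2f}")
```

The average is unchanged by the filling, but the measured dispersion falls, because every filled item sits exactly at the mean. This is the effect that surprised the authors of the milk cost study.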
The most obvious reason for discarding a schedule is its inaccuracy. The question is always pertinent as to how inaccurate a schedule must be before it is discarded. If only totals and averages are to be used, it is frequently reasonable to assume that a rather considerable degree of inaccuracy will prove to be compensatory. It may be better in such cases to include all records, however inaccurate, than to exercise judgment in excluding part of them, because this very exercise of judgment opens the way for bias in the editing. A better procedure is to establish some reasonable objective tests of degrees of accuracy, and follow those consistently.

The more difficult cases are those involving discarding records for reasons other than accuracy. Oftentimes some of them are clearly outside of the universe selected for study and are properly excluded. Even this should be done, however, only on the basis of a clearly defined statement as to what the universe is supposed to include. Thus the field men making a survey of dairy farming may bring in records from a number of farms in which the cattle are of the dual purpose type and the sales of fat cattle figure very largely in the livestock receipts. Should, or should not, these be included in the analysis?

It frequently happens that a number of cases are affected by abnormal circumstances. For example, a study of factors affecting the price of grain, making use of a large number of sales accounts from records of commission firms, may include a small number of cars of grain which have been noted on the sales record as "musty" or "garlicky." Unless there are enough cars of those kinds to make possible a special analysis of these factors, the proper procedure is to strike out these cars. Suppose, however, a number of the records are widely out of line with the rest and no notations of any kind are made on them.
There is good reason for believing that special circumstances were involved in these cases too, and no doubt it may be permissible to discard them. Before this is done, however, a careful study should be made of the circumstances which may have produced these exceptional results.

A case of somewhat different nature is that of the single farm, in a survey including perhaps a hundred farms, on which the farmer realizes large losses from having lightning strike his buildings just at the time when they are not insured. It may be argued that this is one of the hazards which should be included in an analysis of the farm business and that therefore the record must not be excluded. The first answer to this is that if a sufficiently large sample had been taken so as to give an unusual case of this kind its normal frequency, it should be included, but not otherwise. A more pertinent answer is that inasmuch as it is relationships more than averages and totals that are wanted in the study, the inclusion of these single cases furnishes no basis for isolating the effect and the cause which it represents. Reasoning on a similar basis, a half dozen farms on a lake shore might be excluded from a study of land values in an area.

Then there is the question of discarding some of the cases merely because they are extremely high or low. In an arithmetical mean, these would figure very largely, and still more largely in a geometric mean. In a mode or median they have no more effect than the usual items. If it were possible to say that the arithmetical or the geometric mean was the scientifically accurate summary expression, then it could reasonably be argued that extreme cases should not be excluded; but since such is not the case, there is basis for the position that since they would not be included if the mode or median were used, they can be dropped, whatever form of summary expression is chosen.
In correlation analysis, this question is extremely important because of its use of standard deviation, which accentuates the extreme.

The difficulty of proceeding according to this reasoning is that too much is left to the judgment of the editor. He is left in a position where he may be accused of proving a case by selecting the necessary data. The only protection against the exercise of bias, consciously or unconsciously, in such a case is to lay down definite rules to be followed.

It should be obvious from the foregoing that the person in charge of a job of editing should have good judgment as well as a sense of scientific method. He must in addition be well informed as to the subject of the study. In one of the divisions of the Bureau of Agricultural Economics, the practice was followed for some time of excluding reports on small ranches showing abnormally high incomes. Later it was realized that many of these small ranches had access to a large area of public domain which they grazed without any charge. In survey studies it is highly desirable to have the man who had charge of the field work cooperate closely in the editing; perhaps handling it alone, although there is something to be said for having a colleague or superior give some attention to it also.

Oftentimes a wise procedure is to have some preliminary analysis made of the data after they are edited, and on the basis of this do a certain amount of re-editing. A simple correlation analysis may furnish the basis for computing residuals for individual cases, which may reveal the fact that some of them are seriously out of line and lead to a closer examination and checking of them.

Questionnaire data are the most difficult to edit. The tendency of careful workers is to exclude a very large proportion of such returns because of their apparent inaccuracy and the wildness of their estimates.
This very process, however, may at times introduce a more serious bias of one sort than it eliminates of another sort.

Any large job of editing should be carefully organized, and the more obvious and glaring of the mistakes, omissions and inconsistencies may be detected by clerical assistants. The steps can be as follows:

1. Have the clerks bracket all doubtful answers.
2. Have the head clerk red-pencil those which in his opinion are valid.
3. Have the editor check the work of the head clerk, perhaps using a blue pencil.

With the work organized in this way, one editor can pass final judgment on many thousands of schedules in a short time.

(2) Checking for Accuracy.

The following are some rules or procedures to be followed in checking for accuracy:

1. Have the clerks always initial their work.
2. Have the second clerk check the work of the first, both initialing the sheet.
3. Have all tables arranged so that they can be cross totaled; also have two persons independently calculate all totals and averages.
4. Keep all related sheets properly fastened together.
5. Do all checking for errors before making the summary of the totals.
6. Use different colors of ink for various classes of items; for example, red ink for totals, green ink for grand totals.
7. Watch cyphers and decimals very carefully.
8. Have all transfers or entries checked by some other person than the one who made them in the first instance.

(3) Tabulation.

The whole subject of economy and efficiency in tabulation work is one which will well repay careful study. All that can be done here is to make a few suggestions which have grown out of the experience of workers in the Bureau of Agricultural Economics and a few other places.

In many types of studies, time would be saved if more of the tabulating work were done either from the original sheets or from office sheets compiled from them.
The plan of reducing field data to a compact, well arranged office sheet on a good quality of paper, such as described by Prof. Misner for handling farm management surveys, has stood the test of experience and has a great deal to commend it. If no correlations of any kind are involved, it is frequently good economy to work up each item independently directly from the office sheets. If correlation between two series is wanted, then these two can be worked up directly from the office sheets. If a considerable number of relationships are involved, it is advisable to get the data for these on properly arranged cards so that they can be sub-sorted. It is easy to encumber such cards with too much data and try to analyze too many relationships at one time. If desired, regression results and residuals can be entered on these same cards.

The plan of transferring all the data to long strips has not grown in favor. The strips have proved to be too cumbersome and difficult to work with, especially by those not entirely familiar with them.

There is an increasing tendency out in the states at the present time to use sorting and tabulating machines. It will be interesting to research agencies generally to know that the workers in the Bureau of Agricultural Economics are using such mechanical aids much less than they were, and that the Division of Crop and Livestock Estimates has ceased their use almost altogether. The consensus of opinion in the Bureau of Agricultural Economics on this point is as follows: It is advisable to use mechanical tabulation when the number of cases to be tabulated is large and when a considerable number of sub-sortings or cross tabulations is wanted. It is especially economical to use a machine for tabulating when data of the same type are brought in at frequent intervals so that a definite system of coding can be standardized and learned by the clerks.
An important disadvantage of machine tabulation is the large number of errors made in the punching of the cards, especially when this punching is done by clerks who are not familiar with the project. Many who have used machines have come to the conclusion that they can be safely used only where the schedules have been very carefully edited, so that they are fool-proof, and that even then all punching must be supervised very carefully. There is the further difficulty that errors made in the punching are very difficult to locate.

In studies which lend themselves to machine methods because of their size or the large number of sortings, the cost can frequently be reduced as much as one-third. The saving in time is really in the sorting. Mr. Becker says that only in studies involving considerable cross-tabulation does machine work pay. One unit consisting of one tabulator, one sorter, and punchers and verifiers costs, at present rates, $175 to $200 per month (the manufacturers rent their machines), and there must be enough work to keep the machines fairly busy if the total expense is to be less than if the work were all done by hand.

(g) ANALYSIS OF DATA.

(1) Summarizing.*

It is the purpose of the handbook in handling such statistical devices as averages, frequency distribution, and measures of dispersion merely to show how they may be significantly used in agricultural economic analysis. To the young research worker, these devices too often seem largely academic, reminding of laboratory exercises in histograms, polygons and ogives of statures, leaf measurements, etc.

For frequency analysis, three sorts of uses will be considered: (1) Description of a group; (2) Comparison of groups at different periods or in different places; (3) Showing relation between variables.

Group Variations.

Two of the striking uses of frequency analysis for purpose of description of a group are the following.

1.
Skew or "bulk-line" phenomena in variations in some items of cost or physical input or profits or income.
2. Behavior of receipts, sales and prices in the market place.

Suppose we wish to examine the quotations of the prices of a product in a particular central market as a representation of the actual sales of the product in the market on that day. The quotations of the Bureau of Agricultural Economics for July 20, 1922 were as follows:

Canner cows               $2.00 - $2.50
Cutter cows                2.50 -  3.25
Common to choice cows      3.25 -  7.50

Table I shows that there were no actual sales between $5.50 and $7.50, which indicates that the quotation on "common to choice" cows must have been nominal.

* Professor Warren C. Waite, University of Minnesota.

TABLE I.
Distribution of the Prices of Slaughter Cows Bought by Swift and Company and Armour and Company at South St. Paul, on July 20, 1922.*

Price per           Per cent of
hundredweight       animals purchased
$1.88 - 2.12          0.8
 2.13 - 2.37          7.3
 2.38 - 2.62         19.8
 2.63 - 2.87         11.2
 2.88 - 3.12         15.5
 3.13 - 3.37         11.2
 3.38 - 3.62          9.1
 3.63 - 3.87          8.1
 3.88 - 4.12          4.9
 4.13 - 4.37          0.5
 4.38 - 4.62          5.2
 4.63 - 4.87          1.0
 4.88 - 5.12          4.2
 5.13 - 5.37          0.2
 5.38 - 5.62          1.0

* Gaumnitz, E. W., Central Market Price Quoting, Ph. D. Thesis at Minnesota, 1925, p. 104.

TABLE II.
A Random Sample of 5,934 Hogs Sold at South St. Paul on June 1, 1926.*

Price per           Per cent of
hundredweight       animals purchased
$7.38 -  7.62         3.88
 7.63 -  7.87         3.62
 7.88 -  8.12         4.16
 8.13 -  8.37        11.85
 8.38 -  8.62         9.43
 8.63 -  8.87        11.19
 8.88 -  9.12        48.70
 9.13 -  9.37         7.15
 9.38 -  9.62         0.60
 9.63 -  9.87         0.50
 9.88 - 10.12         3.10

* Unpublished data in files of Division of Agricultural Economics at the University of Minnesota.

Table II gives a similar distribution for the prices of hogs on the same market on June 1, 1927. The difference in the nature of the two distributions is striking. The quotation "average cost to packers" was $8.60 on this day.
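Returning to Table I, the comparison between the quoted range and the actual sales can be sketched in a few lines of Python. The class limits and percentages are those of Table I; the variable names are illustrative only, and the sketch is not a reconstruction of any procedure used by the price reporters.

```python
# Table I as (lower limit, upper limit, per cent of animals purchased).
table_i = [
    (1.88, 2.12, 0.8), (2.13, 2.37, 7.3), (2.38, 2.62, 19.8),
    (2.63, 2.87, 11.2), (2.88, 3.12, 15.5), (3.13, 3.37, 11.2),
    (3.38, 3.62, 9.1), (3.63, 3.87, 8.1), (3.88, 4.12, 4.9),
    (4.13, 4.37, 0.5), (4.38, 4.62, 5.2), (4.63, 4.87, 1.0),
    (4.88, 5.12, 4.2), (5.13, 5.37, 0.2), (5.38, 5.62, 1.0),
]

# The percentages should account for all animals purchased.
total_percent = sum(pct for _, _, pct in table_i)

# Modal class: the price class containing the largest share of sales.
modal_class = max(table_i, key=lambda row: row[2])

# Upper limit of the highest class in which any sale was recorded.
highest_sale = max(upper for _, upper, pct in table_i if pct > 0)

# Top of the "common to choice" quotation for the same day.
quoted_top = 7.50

print(f"total per cent: {total_percent:.1f}")
print(f"modal class: ${modal_class[0]:.2f} - {modal_class[1]:.2f}")
print(f"quoted range with no sales behind it: ${highest_sale:.2f} - {quoted_top:.2f}")
```

The gap between the highest recorded class and the top of the quotation is the nominal part of the "common to choice" quotation referred to in the text.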
One wishing to know what actually happened on the market on a given day in the matter of prices can learn a good deal by making a frequency distribution of the sales, and still more if he has some sort of a normal for that time of year and day of the week to serve as a criterion. If the buyers have acted abnormally, it is pretty certain to show in the distribution of prices.

The art of quoting prices is largely based on the science of frequency distribution. The job of the price quoter is to find a figure which represents what a normal array of the commodity, or of a given class or grade of the commodity, would have sold for on the market that day. He has to determine this in many cases out of an imperfect distribution in the first place, and almost always out of an insignificant sample even of what sales are made. Those who use market quotations need to know the procedure that was followed in making them. Some day this procedure will be reduced to a more nearly scientific basis. Scientific grading awaits such developments.

The shape of an individual frequency distribution in itself often means little. But as time goes on, we shall build up an understanding of the "normal" for various phenomena. Stability of frequency distribution of samples is a safe guide in any case.

Such use of frequency analysis is especially liable to the errors inherent in choice of class intervals. The essential principles determining choice of class interval are covered in the textbooks.*

An interesting example of a cumulative frequency distribution is given in Figure 5 of the U. S. Tariff Commission's final summary of the cost of producing sugar beets.

The frequency analysis may also be used to indicate cases in which a further classification of our data may be useful in an analysis. According to the Laplace-Charlier hypothesis, any frequency curve may be thought of as a sum of independent frequency curves.** A study of sales of hogs in the St. Paul
market on June 30, 1927 gave an example of a distinctly bi-modal distribution. When broken up into its proper parts, this distribution gave a classification by weights corresponding to that used by the Bureau of Agricultural Economics in quoting hog prices.

* See especially: Yule, G. U., Theory of Statistics, p. 79-83. Day, E. E., Statistical Analysis, p. 65-71. Mills, F. C., Statistical Methods, p. 68-73.
** Fisher, A., Mathematical Theory of Probability, p. 176.

Comparison of Frequencies.

There are many economic problems which are primarily concerned with differences in the distribution of a particular variable. These differences are sometimes related to geographical differences in sizes of farms in different sections of a state, or the amount of net rent of farms absorbed by taxes in various states, or wages of farm labor, or farmers' incomes, or intensity of culture. Another kind of study deals chiefly with differences from the standpoint of class or kind. Thus we may wish to compare the variation of incomes of tenants with those of owner-farmers, or the expenditures of different nationalities on farm living. Finally, there will be many cases where the significant thing to be analyzed is the difference in the variation at different periods of time. We may, for example, wish to analyze the changes in the distribution of income over a period of time, or sizes of farms, or certain aspects of market behavior, progress of settlers, changes in marketing costs, and so on.

Frequency distributions offer a method of attacking such problems. A separate frequency can be prepared for each geographical area, class or type, or period of time, and these compared as a series of frequency distributions. The comparison is facilitated if the class intervals are the same in all frequencies, and wherever possible these intervals should be chosen at the start of the problem with such comparison in view.
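The advice to fix the class intervals in advance, so that every group is tabulated on the same scale, can be sketched as follows. This is a minimal illustration in Python; the farm sizes, the class limits and the function name `percent_distribution` are all hypothetical.

```python
from bisect import bisect_right

def percent_distribution(values, edges):
    """Per cent of cases in each class [edges[i], edges[i+1]).

    Values below the first limit, or at or above the last, are left out.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    n = sum(counts)
    return [100.0 * c / n for c in counts]

# One set of class limits (crop acres), chosen before tabulating either group.
edges = [0, 50, 100, 150, 200, 250]

# Hypothetical farm sizes for two classes of operators.
tenants = [40, 55, 80, 95, 120, 60, 75, 110]
owners = [90, 130, 160, 85, 140, 200, 175, 120, 150, 95]

tenants_pct = percent_distribution(tenants, edges)
owners_pct = percent_distribution(owners, edges)

for lo, hi, t, o in zip(edges, edges[1:], tenants_pct, owners_pct):
    print(f"{lo:>3} to {hi:<3}  tenants {t:5.1f}  owners {o:5.1f}")
```

Because both groups share the same class limits and are expressed in percentages, the two columns can be read against each other directly, even though the groups differ in number of cases.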
Whenever the frequency distributions differ materially in size and shape, and there are a great number to be compared, it will be advisable to compute certain coefficients descriptive of them and compare these, such as those of central tendency, dispersion, skewness or kurtosis.

Another useful method is to express the data in percentage of cases falling in each class, as in Table III.

TABLE III.
Percentage of Farm Operators Falling in Certain Age Groups by Geographic Divisions.

                        Under 25   25-34    35-44    45-54    55-64    65 years
Division                years      years    years    years    years    and over
New England               1.69     11.36    20.72    25.81    22.19    18.22
Middle Atlantic           2.52     14.89    23.05    25.61    20.42    13.51
East North Central        3.55     19.73    24.90    24.32    18.01     9.49
West North Central        4.87     24.29    26.23    22.59    14.87     7.14
South Atlantic            7.26     19.90    24.69    23.26    15.09     9.80
East South Central        8.93     21.54    24.09    22.51    13.77     9.16
West South Central        8.98     23.47    25.40    22.22    12.81     7.12
Mountain                  3.95     24.16    28.00    22.83    14.61     6.45
Pacific                   2.67     17.01    26.16    25.52    19.05     9.59

Table IV is a comparison where difference in frequency and overlapping of frequencies is important. The effect of dressing percentage upon the grade of the animal is of course evident. The extent of overlapping indicates the existence of other factors than dressing percentage in determining the grade of the animal, or that there was a shift in the kind of animals classified in the particular grades during the period. There is also a difference in the shape of the frequency distribution of each grade, which is important for certain aspects of the problem.

TABLE IV.
Cows Bought by Swift and Company and Armour and Company at South St.
Paul during the Period July 13 to September 15, 1922, Classified According to Grades and Dressing Percentages.*

[Table IV. Rows: grades (Canner, Cutter, Common, Medium, Good, Choice). Columns: dressing percentages in one-point classes from 39 to 40 through 54 to 55, then 55 up, followed by the number of animals in each grade. The cell entries are not legible in this copy.]

* Data obtained from records of Swift and Company and Armour and Company.

Comparison of the position of individual farms in a frequency distribution from year to year throws light on the theory of "representative firms" of Alfred Marshall. Professor Secrist of Northwestern University has developed some interesting conclusions out of such an analysis of retail stores.

Correlation between Variables.

Problems which involve an attempt to relate the variation in a particular variable to the variation in other factors are usually handled by a correlation analysis. A frequency analysis, however, is often advisable as a preliminary step to this, or may serve as a substitute in problems which are relatively simple or do not contain enough observations to warrant the use of correlation methods.

The method of approach to such problems is by the preparation of a double frequency or contingency table such as in Table V. This table indicates that in general the smaller farms have more cows relative to their crop area than the larger farms.

TABLE V.
Dairy farms classified according to the number of crop acres and the number of cows kept.*
                        Number of farms, by crop acres
Number      16 to  31 to  46 to  61 to  76 to  91 to  106 to  121 to  136 to  151 to  166 to  All
of cows     30     45     60     75     90     105    120     135     150     165     180     sizes
5 or less     1      2     --     --     --     --     --      --      --      --      --       3
6 to 10      12     32     20      7      1     --      1      --      --      --      --      73
11 to 15      3     19     25     21     10      4     --      --      --      --      --      82
16 to 20     --      6     23     16     16      3      1      --      --      --      --      71
21 to 25     --      2      5      3      7      6      1       1       1      --      --      26
26 to 30     --     --      2      1      3      4      2      --       1      --      --      10
31 to 35     --     --     --     --     --     --      1      --      --       2      --       3
36 to 40     --     --     --     --     --     --     --      --      --      --       1       1
41 to 45     --     --     --     --     --     --     --      --      --      --      --      --
46 to 50     --     --      1     --     --     --     --      --      --      --      --       1
Total        16     61     76     48     38     18      6       3       1       2       1     270

* U. S. D. A. Bulletin 1400, Factors Affecting Farmers' Earnings in Southeastern Pennsylvania, p. 10.

The analysis may sometimes be concerned not alone with the association of the two attributes of the variable, but with some result of this association, as in Table VI. Here we are interested primarily in the profits of stores, which are a result of the relation between total expenses and gross margin. The class where these are approximately the same is boxed by heavy lines. Sections to the right and above indicate losses; sections below and to the left indicate profits.

TABLE VI.
Total Expenses and Gross Margins of One-man Meat Stores in Chicago, Cleveland and New York.*

[Table VI. Rows: gross margin, per cent of sales. Columns: total expense, per cent of sales. Both are classed as under 14, then in two-point classes from 14 to 16 through 30 to 32, then 32 and over. The individual cell entries cannot be placed in their columns with certainty from this copy; the marginal totals are as follows.]

Class (per cent      Stores by         Stores by
of sales)            gross margin      total expense
Under 14                 20                 2
14 to 16                  8                 2
16 to 18                  4                 5
18 to 20                 10                13
20 to 22                 11                35
22 to 24                 14                24
24 to 26                 20                18
26 to 28                 19                22
28 to 30                 16                15
30 to 32                 14                12
32 and over              27                15
Total                   163               163

* Bureau of Business Research, Northwestern University, Series III, No. 9, "Expenses, Profits and Losses in Retail Meat Stores," p. 65.

The analysis may be carried further without the use of correlation by the calculation of certain coefficients of association and contingency. The analysis with respect to these methods has received treatment in the section on "Association."

Averages.

Since an average is a summary or characterization of data by a single figure, it is not useful where the significant thing is the variation of the data rather than a single type characterization. Thus an average cost of the production of wheat is quite meaningless if we have in mind the use of that average to guarantee the farmer a certain profit above cost, or propose to set a tariff upon the basis of it. The significant thing here is the difference in costs of the various producers, which precludes the establishment by tariffs or otherwise of a single price that will guarantee a profit to all. There are many similar situations.

Averages fall into two groups: those which depend for their value on the magnitude of each item of the series, or calculated averages,
such as the arithmetic mean; and those which depend on the number or position of certain items, such as the mode or median. The former, since they depend upon the value of all the items of the series, are subject to algebraical manipulation. They can be combined with similar averages from other groups to give the same average for the combined group as would be obtained from a similar average of the original items. Algebraic manipulation is an important attribute of an average; in fact, Yule considers it the most important single attribute.*

The arithmetic mean is the most widely used of all the averages. It is rigidly defined, based upon all the observations, reasonably easy of calculation, lends itself readily to algebraical treatment, and is intelligible to the layman. All things considered, it is probably the most useful average, but as is pointed out later there are many circumstances in which its use is not justified. Examples of the arithmetic mean are so numerous that no specific cases need be given. It is used almost exclusively in the census, to summarize surveys, and in the yearbook.**

The geometric mean is the nth root of the product of the numbers comprising a series, the n signifying the number of items comprising the series. It is also the natural number corresponding to the arithmetic mean of the logarithms of the items comprising the series. It is thus a form of an arithmetic mean. Similarly to the arithmetic mean, it can be determined if we know the total product of the items and their number without knowing any of the individual items which comprise the series. Just as the sum of the deviations from the arithmetic mean is zero, so the sum of the logarithmic deviations from the logarithm of the geometric mean is zero. This means, in terms of natural numbers, that the sum of the ratios to the geometric mean of the values under it is equal to the sum of the ratios to it of the values above the geometric mean.
In other words, we have a mean in which an item fifty per cent of the size of the mean receives equal importance in the determination of that mean with an item twice the size of the mean.

These properties of the geometric mean make it an important average for many sorts of summarization. One type of these problems is where we are averaging rates of change and wish to give equal weight to equal ratios of change. The simplest example of such a problem is the determination of the average rate of growth of a population. Suppose that the agricultural population of a country has doubled during a period of ten years, probably at irregular rates. The geometric mean, the tenth root of 2, or 1.0718, gives us the rate at which the population would have had to increase regularly to produce the result which we find in the tenth year. Thus 7.18 per cent is the average annual rate of increase. We are able from this result to estimate the population for any period.*

* Yule, G. U., Introduction to the Theory of Statistics, p. 108.
** A comprehensive discussion of the adequacy of the arithmetic mean, particularly weighted, for farm-price summarization will be found in U. S. D. A. Bulletin 1480, Reliability and Adequacy of Farm-Price Data.

Many problems of this sort are complicated. For example, Sewall Wright in his study of the relations of corn and hogs found it necessary to correct the estimates of corn acreage of the Bureau of Crop Estimates from 1870 to 1889, because of the breaks in these estimates when adjustments were made to the census figures. The acreages in the intervening years were estimated by swinging the fluctuations shown by the Bureau of Crop Estimates between the census figures, by multiplying the ratio of estimates for successive years by a constant factor to make the tenth year agree with the next census.
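The population example above can be checked directly. The sketch below, in Python, computes the tenth root of 2 and then applies the compound formula P1 = P (1 + r)^n to a starting population; the starting figure of 1,000 and the function name `population_after` are assumptions made only for illustration.

```python
# Population doubles over ten years; find the regular annual rate
# that would produce the same result.
ratio_over_period = 2.0
years = 10

# Geometric mean of the annual growth factors: the tenth root of 2.
annual_factor = ratio_over_period ** (1 / years)
annual_rate = annual_factor - 1

def population_after(p0, rate, n):
    """Population after n years of regular growth: P1 = P (1 + r)^n."""
    return p0 * (1 + rate) ** n

print(f"annual factor: {annual_factor:.4f}")
print(f"annual rate: {100 * annual_rate:.2f} per cent")

# Estimated population at the end of each year, from a hypothetical 1,000.
for n in (1, 5, 10):
    print(n, round(population_after(1000, annual_rate, n), 1))
```

An arithmetic average of the growth (one hundred per cent in ten years, or ten per cent a year) would overstate the regular rate, since compounding at ten per cent would carry the population well beyond double by the tenth year.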
**

The second general use of the geometric mean is in averaging ratios or relatives when these ratios or relatives express variations, that is, increases and decreases. This is the problem with which we deal in index numbers. Here the property of the geometric mean of giving equal weight to observations, say, of seventy-five per cent of the mean and one hundred thirty-three per cent of it, becomes of great significance. This use of geometric means will be developed in the section on index numbers.

The mode represents the most probable single value of the variable, and there is usually a clustering of a considerable proportion of the other values around it.*** It represents in consequence a very important characterization of data. Its value is unaffected by unusual observations. The mode is therefore the proper average to use when we are interested in the typical size of the individual observation.

* The population at the end of any particular year in the period may be determined by the following formula: P1 = P (1 + r)^n, where P = population at the beginning of the period, r = the geometric mean of the rates of increase, n = the year under consideration, and P1 = population at the end of the year for which the estimate is made.
** The formula for any given year [not legible in this copy] was stated in terms of the following: x = the year for which the value was to be calculated; C1, C2 = the two census figures between which the crop estimates were to be swung; E1, E2 = the original estimates for the same years.
*** In some cases, two or more points of concentration may appear. These may be due either to the small number of cases, to inclusion of data properly belonging to separate frequencies in a single frequency, or in rare cases to underlying causes even where the data properly fall in the same frequency analysis.

In a study of price quotations we might, for example, compare the bulk-of-sales quotation with the modal sale.
If this sale lay outside the bulk-of-sales quotation, as is sometimes the case, we would conclude that the latter was not a satisfactory representation of the true sales. We would also be interested in the typical or modal in comparing the incomes of classes of farmers and city folk. In the same way, comparison of the size of farms in different states when we are thinking of typical farmers should be made by modes. The mode is the average to use when we have in mind individual cases, the arithmetic average when totals or aggregates are the significant thing.

The harmonic mean of a series of quantities is the reciprocal of the arithmetic mean of their reciprocals.* The harmonic mean finds use only in a restricted field, but in these uses it has a distinct advantage. It is used in averaging prices in cases where they are expressed as "so many per dollar" and the average number purchased per dollar is required. The average number of a group of commodities purchased per dollar obviously depends upon the quantity of each purchased at these different rates. Hence harmonic means may need to be weighted.**

The median is the middlemost or central value of a series when the values are ranged in order of magnitude. One of its principal uses is in cases where the items of a series are not capable of quantitative measurement, but can nevertheless be ranked. Thus in a study of the attitudes of farmers toward a particular thought or object, we might be able to rank those attitudes without definite measurement and discern a median attitude. The second principal use of the median is in cases where we wish definitely to avoid influences of unusually large or small items.
Persons has recommended the use of the median in his method of link relatives for determining seasonal variation.*** The Harvard Business School group has also used the median extensively in studies of the costs of doing business in various types of establishments and lines of business.****

* $\dfrac{1}{H} = \dfrac{1}{N} \sum \dfrac{1}{X}$
** An example of the use of the harmonic mean is found in Minnesota Technical Bulletin No. 10, "Factors Determining the Price of Potatoes."
*** Cf. Rietz, Handbook of Mathematical Statistics, pp. 151-158.
**** A similar use in agriculture may be found in U. S. D. A. Dept. Technical Bulletin 13, Cotton-Gin Operation in North-Central Texas, p. 13.

When the distribution becomes asymmetrical, the arithmetic mean, the median and the mode do not coincide. The mode lies farthest in the direction of asymmetry and the median between it and the arithmetic mean. If the distribution is fairly normal, the arithmetic average appears to be the proper average for summary. It has approximately the same value as the median and mode and approximates their characterization of the series, and has the advantage over them that it can be built upon. If the distribution is not normal, as is usually the case with the economic and social data, then some other average may prove more useful. If the analysis is to stop with the particular distribution and not be built upon this, the choice will usually be the mode. When the distribution is skewed toward the small values, and it is desired to proceed beyond the distribution in the analysis, the geometric mean will probably be chosen. This choice of course will not be made when the mode lies to the right of the arithmetic mean. Greater care is necessary in the choice of an average to characterize an asymmetrical distribution than one which is symmetrical.
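The behavior of the several averages described in this section can be checked numerically. What follows is only a minimal sketch in Python, using the standard `statistics` module. The farm sizes and the per-dollar rates are hypothetical figures chosen for illustration and are not drawn from any study cited here, and the name `population_after` is introduced only for this example.

```python
import statistics

# Average annual rate of growth when a population doubles in ten years:
# the geometric mean of the ten yearly growth factors is 2 ** (1/10).
growth_factor = 2.0 ** (1.0 / 10.0)
annual_rate = growth_factor - 1.0  # about 0.0718, i.e. 7.18 per cent


def population_after(p_start, rate, years):
    """Population after `years` of regular growth: P1 = P * (1 + r) ** n."""
    return p_start * (1.0 + rate) ** years


# The geometric mean gives equal weight to equal ratios: an item half the
# size of the mean balances an item twice the size of the mean.
geo = statistics.geometric_mean([50.0, 200.0])

# Harmonic mean of rates quoted as "so many per dollar" (hypothetical rates).
# Unweighted, it is the right average when equal quantities are bought at
# each rate; otherwise it would need to be weighted.
per_dollar = statistics.harmonic_mean([10.0, 20.0])

# A hypothetical skewed series of farm sizes in acres: the mode, the median
# and the arithmetic mean no longer coincide.
farm_sizes = [40, 80, 80, 80, 120, 120, 160, 200, 320, 800]
mode = statistics.mode(farm_sizes)
median = statistics.median(farm_sizes)
mean = statistics.mean(farm_sizes)

print(f"annual rate of increase: {annual_rate:.4f}")
print(f"geometric mean of 50 and 200: {geo:.1f}")
print(f"average number per dollar: {per_dollar:.2f}")
print(f"mode {mode}, median {median}, mean {mean}")
```

In this hypothetical series the few very large farms pull the arithmetic mean well above the median, and the median above the mode, so that the mode remains the better description of the typical farm.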
Dispersion

It is advantageous in many cases to have a more complete description of the data of an analysis than is provided by averages of series. This description may be provided by certain measures of variability. These may be used either to give a more complete description of a particular series than is provided by its averages, or to provide a means of comparing the variation in two or more distinct series. The latter problem, of course, demands the more generalized measures.

Certain types of problems depend principally upon the question of whether one series shows a greater degree of variation than another. For example, we may wish to know whether there is a greater variation in the size of farms or in the income from those farms. Again, we may be interested in the relative amount of fluctuation in two series as determiners of a result which we find in a third series. For example, we may wish to know whether acreage or yields fluctuate more in order to judge their comparative effect upon crop production which is a product of the two. Variations are also significant in a wide range of problems involving sampling.

The problem involved in the determination of the variability of a time series is quite different from that involved in the comparison of the variability of a series of geographical data or physical measurements. In the latter, we are ordinarily concerned with a comparison of the absolute values of the variable. In the former, our interest usually lies in the variation of one observation from the one immediately preceding it, or between groups of related observations. In a time series analysis we deal with the variation of data adjusted for changes in the price level. This adjustment may take the form of departures from some trend line for the data.

There are two principal ways in common use of measuring the dispersion of a series.* The first type consists of some measure of the difference of certain counted items.
Here we have the range, the interquartile range, interpercentile ranges, and so on. The second type consists of some measure of the difference between a certain average and each of the actual items comprising the series. The average deviation and the standard deviation are examples of this type. Similar methods are followed in determining measures for the comparison between series, except that the latter must be reduced to a comparable basis.

* Fitted frequency curves would give a complete description. The mathematical technique necessary for their fitting, however, prevents any wide use.

The simplest measure of the first type is the range of the series. Yule says of the range:* "The simplest possible measure of the dispersion of a series of values of a variable is the actual range, i.e., the difference between the greatest and least values observed. While this is frequently quoted, it is as a rule the worst of all possible measures for any serious purpose." It gives us no idea of the form of the distribution between these extreme values and is very erratic, depending as it does on infrequent values at the ends of series. The range gives an idea of how wide the variation may be, but no indication of what may be considered a typical variation. It can be used to advantage principally for such things as the range in costs of marketing operations,** the range of inputs in farm operations,*** and the extent of fluctuations in prices.****

The more refined measures of this type, the inter-quartile and inter-percentile ranges, reflect the same weakness except in lesser degree. For example, the value of the quartile deviation is the same regardless of whether the items are uniformly distributed between their quartiles or are concentrated principally at one end of that range.
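The weakness of the range and of the quartile measures noted above may be seen in a small numerical case. The sketch below is in Python and uses hypothetical cost figures. The rule for locating the quartiles (the "inclusive" method of `statistics.quantiles`) is an assumption made for the example, since the text does not prescribe one, and the function names are introduced here only for illustration.

```python
import statistics


def value_range(values):
    """Difference between the greatest and least values observed."""
    return max(values) - min(values)


def interquartile_range(values):
    """Distance between the first and third quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Two hypothetical series of marketing costs (cents per bushel).
# Both have the same extremes and the same quartiles, but in the second
# the items lying between the quartiles are bunched at the lower end.
evenly_spread = [2, 4, 6, 8, 10, 12, 14, 16, 18]
bunched_low = [2, 4, 6, 6, 6, 7, 14, 16, 18]

# One unusual item changes the range greatly, which is why it is erratic.
with_one_extreme = evenly_spread + [60]

for name, series in [("evenly spread", evenly_spread),
                     ("bunched low", bunched_low),
                     ("one extreme item", with_one_extreme)]:
    print(name, value_range(series), interquartile_range(series))
```

The first two series give identical ranges and identical interquartile ranges although their form between the quartiles is quite different, which is the limitation the text describes.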
The principal measures of dispersion which depend upon the relation of each individual item to some mean are the average or mean deviation and the standard deviation of the series. The mean deviation of a series is the arithmetic mean of the deviations in the values of the individual items of the series from the mean of that series. The deviations are summed without regard to sign. Mills, in his recent book "The Behavior of Prices," has used the average deviation as a measure of geographic variation in prices,***** and modifications of it as measures of the monthly****** and year-to-year variability of prices.******* Bowley has also used a modification of the mean deviation to show variations in prices.********

* Yule, p. 153.
** Minnesota Bulletin 224, Management Problems of Farmers' Elevators, p. 11.
*** U. S. D. A. Bulletin 1271, Farm Organization in Southwestern Minnesota, p. 16.
**** Boyle, J. E., Speculation and the Chicago Board of Trade, p. 122.
***** Page 162.
****** Page 39. The mean deviation as a per cent of the mean.
******* Page 49. The mean deviation of the link relatives of the annual average prices.
******** Average of the percentage deviations of individual relatives from their geometric mean. "London and Cambridge Economic Service," Special Memorandum No. 5, February 1924.

The other principal measure computed in terms of the variation of each item from a central value is the standard deviation. The standard deviation is the square root of the arithmetic mean of the squares of all deviations, the deviations being measured from the arithmetic mean of the observations. The arithmetic mean is chosen since the sum of the squared deviations from it is less than for any other value of the variable. The standard deviation is probably the most widely used measure of variation. Yule says it "should always be used as the measure of dispersion, unless there is some very definite reason for preferring another measure."* In this respect it is much like the arithmetic mean. The great value of the standard deviation lies in its subsequent importance in the statistical analysis, for example, in the problem of sampling and of correlation. The standard deviation possesses the advantages of rigid definition, ready algebraical treatment, dependence on all the values of the series, and, in general, of being less affected by the fluctuations of sampling than the other measures of dispersion which we have described.

The common measures for comparing the variation of two series are ordinarily constructed on the assumption that there is a relation between the size of the items comprising the series and the absolute variation among the items of the series. This might be accomplished quite directly by computing the geometric means of the series and measuring the deviations by their ratios to the geometric means. This, however, is not often done. The simpler method is usually to correct the measures for size by dividing the measures of absolute variation by some mean. The standard deviation divided by the arithmetic mean gives what is called "the coefficient of variation," the most widely used comparative measure of variation. It must, however, be interpreted carefully in many economic series, especially when the series whose variations it measures contain positive and negative items and the means of the series are in consequence smaller. This might, for example, occur in considering the labor incomes of two groups of farms.

* Yule, p. 144.

(2) The Measurement of Time Series Movements.*

The distinguishing characteristic of time series is that they consist of observations with regard to time. Each item in such a series is the outcome of the continuing operation of the same forces which caused the preceding ones. It may be partly the outcome of influences set in operation by earlier items in the same series.
At each moment these forces, operating in their economic and physical environment, result in a temporary equilibrium of economic situation. But this is only a transitory phase or a cross-section of the continuous sequence of events. It contains within itself elements of growth or development which may be expected to bring about its own modification, or for that matter successive modifications of the future flow of events.

Our analysis of economic time series is aimed at the comprehension of this dynamic manner in which economic forces act in the unrolling of our ordinary sequences of events. Our recorded items represent simple measurements at regular intervals of a series of cross-sections of the flow of phenomena. We seem unable to grasp or comprehend this dynamic flow of the forces themselves. We are able to perceive them only by their manifestations in events. This limits our analysis to a careful examination of the recorded phenomena, from which we may be able to draw some inferences as to the nature of the underlying forces and the regularity and strength with which they act.

Types of movements**

The fluctuations to which any series is subject are the result of a multitude of influences, having effects differing in direction, intensity and duration in relation to a given series, and varying in their relative effects between one series and another. Moreover, it is impossible to express quantitatively all the factors known to affect a given series. Fortunately, it is sufficient for most purposes if the effects of only those factors which exert a uniform, regular, or regularly recurrent influence on the magnitude of the items in the series can be expressed.

* The material in this section was prepared by the following: John A. Hopkins, Jr., Iowa State College; C. F. Clayton, Michigan State College; C. M. Purves, U. S. Bureau of Agricultural Economics; V. R. Wertz, University of Ohio; H. B. Killough, Brown University; B. A. Holt, University of Minnesota. The committee on the handbook fitted the contributions of these six into one article, thus eliminating overlappings. Specific credit will be given for all the major divisions, but it will not be possible to mention the author of each separate paragraph.
** The classification here presented is that of Professor Day. This section was mostly contributed by Hopkins and Clayton.

Time series are subject to three main types of influences whose effects are relatively regular and continuous:

(1) The magnitude of the items of a time series is affected by certain underlying, long-time influences, tending to produce in the series a persistent upward or downward movement extending over a relatively long period of time. Professor Wesley C. Mitchell suggests the following classification of those long-time influences: "(1) Causes related to changes in the number of population, (2) causes related to the economic efficiency of the population: its age, health, education, technical knowledge and equipment, methods of cooperation, methods of settling conflicts of interest, and many other matters, (3) causes related to the quantity and quality of the natural resources exploited by the population."*

The second sort of regular movements are the cyclical or recurrent. These occur with a sort of rhythm and more or less periodicity. The term cyclical has been applied to a large number of different sorts of variations, but more careful examination finds it difficult to demonstrate the true cyclical nature of any very large proportion of business fluctuations. Attempts have been made, but as yet with but little success, to demonstrate the regularly rhythmic influences capable of causing economic cycles in the strict sense. Correlations have been sought between economic fluctuations and cycles of solar radiation or sun spots.
Cyclical fluctuations in the mass psychology of the business man have been suggested but not demonstrated. The length of time required to make adjustments in the production of crops or livestock would appear to furnish a sound reason for a period of rhythmic fluctuations in the production and prices of some farm products. But we would hardly expect such fluctuations to continue to show as great an amplitude from cycle to cycle. That is, we might expect the resulting adjustments in production to approximate the normal more closely each time, so that the cycles arising from this cause would become of less and less amplitude, unless stimulated by new forces from outside the industry concerned.

This does not mean to deny the existence of more or less regularly recurrent fluctuations in price or production data, but to raise the question whether most fluctuations that have been called cyclical may not be of the unique or episodic type mentioned later. Of those which do seem regularly recurrent, only a few types, such as the seasonal, have been satisfactorily explained.

The third type of fluctuation is the seasonal. This is in fact a type of the regularly cyclical sort. In the study of agricultural series, the seasonal fluctuations are of great importance. The recurrence of the growing season is the outstanding characteristic of agricultural business, and the supplies of farm produce by their conformation to it give us a typical year-to-year cycle in prices also. Here we have a clear case of a recurrent or rhythmic cause of fluctuation which results in oscillations from maxima to minima with regular periodicity.

* Mitchell, Wesley C. [remainder of citation illegible]

The distinction between "cycle" and "periodicity" becomes important here. This distinction was stated by Dr. F. E. Clements at "A Conference on Cycles" held in 1922, as follows: "It seems desirable to use cycles as the inclusive term for all recurrences that lend themselves to measurements, and period or periodicity for those with a definite time interval, recognizing however, that there is no fixed line between the two." (Cited by Mitchell, Wesley C., in Business Cycles, The Problem and Its Setting, p. 377.)

It is well to recognize that the distinction between cyclical and secular is not absolute. What may seem like a secular trend may prove in fact to be one movement in a long cycle.

Of the non-cyclical fluctuations, some may be called episodic.(1) The disturbances in prices or other data resulting from a war are of this type. Under the same heading may be grouped the aberrations from the introduction of new methods of production in an industry, or the opening up of a new producing territory, or the uncovering of a new and different sort of a demand for a product. Episodic movements are characterized by a deviation from the otherwise normal trends which is likely to grow more pronounced as the new influence gathers force. If the new force persists, the old trends may reappear, perhaps somewhat modified, or perhaps much the same as before, but at a different level from what might have been expected from a projection of the original lines of trend. This is illustrated by the disturbances in the prices of cattle with the opening up of the new range country between 1875 and 1885, and again by the change in trend shortly after 1902 as improved methods of feeding reduced the costs of beef production and caused a lowering of level of the seasonal and secular trends.

If the disturbing influence is of a temporary nature, the old trends may again appear at essentially the plane of the projected trend lines.
Thus an outbreak of hog cholera would be expected to cause a decided deviation from the usual levels of price and production, but after it was past the old trends would be expected to reappear essentially as before. The outbreak of war is of the same general nature, except that a disturbance of such magnitude as a major war would be expected to stimulate the adoption of new methods of production resulting in a permanent displacement of the trends as mentioned in the last paragraph.

The chief characteristics of the episodic fluctuations are that they are generally unpredicted and unrecurrent. Each case is to be regarded as unique and requires a different analytical approach. They vary widely both in duration and in amplitude and range, from the disturbance occasioned by a railroad strike lasting a few weeks, to that caused by a major war lasting several years.

Nevertheless episodic fluctuations may sometimes occur with an appearance of a rough sort of regularity. This may be illustrated by the fluctuations in the prices of cattle since 1870 in which there have been at least four outstanding episodic disturbances which occurred at intervals of from 7 to 17 years if measured from peak to peak, or from 13 to 18 years if measured from trough to trough.* In each episode a different and nonrecurrent influence was to be found as the basic cause. Unless a careful study of each episode is made, such a series of fluctuations will be mistaken for a genuinely cyclical or regularly recurrent type of fluctuation.

(1) See Day, E. E., Statistical Analysis, New York, 1925, pp. 302-306.

There is another type of non-cyclical fluctuation which may be called the irregular or fortuitous.
This comprises the fluctuations that cannot be explained by continuous operation of underlying forces as in the case of the secular trend, nor by regularly recurrent rhythms as in the case of cyclical or seasonal trends, nor by the intrusion of powerful new and unusual forces as in the case of the episodic movements. Even when all these movements are explained, the ordinary series of economic data will show some residual fluctuations for which no regular causes are apparent. Many of the so-called irregular fluctuations are usually discovered, on more thorough analysis, to constitute minor episodic movements. That is, the more careful and thorough the analysis, the fewer unexplained variations are likely to remain. However, there are always some for which no reasons can be assigned. These may represent erratic market information, or erratic reaction on the part of sellers or buyers of produce to information which may be essentially correct. Or they may represent chance combinations of forces which give the market a temporary and unexpected turn.

Secular Trend**

Secular trend has been defined as "the long-time tendency of the items of the series to grow or decline regardless of the temporary seasonal and cyclical and irregular movements."*** This definition emphasizes the fundamental or organic character of the changes represented by the trend and the independence of these changes from those of a seasonal, cyclical, and sporadic character.

* See Iowa Agr. Exp. Sta. Research Bulletin 101, A Statistical Study of the Prices and Production of Beef Cattle, pp. 349-357.
** Contributed mostly by Hopkins and Clayton.
*** Falkner, Helen D., "The Measurement of Seasonal Variation," Vol. XIX, June, 1924, p. 167. Where not otherwise stated, the references in this section are to the Journal of the American Statistical Association.

If, however, the seasonal movement is causally related to the cyclical movement and the latter to the trend, then the empirically determined magnitudes representing the trend cannot be said to represent the changes which the series would undergo regardless of the other types of fluctuation.* The trend, in fact, is merely an empirical representation of a particular type of change characteristic of the series over a period of time. Neither the concept of trend nor the methods of its determination warrants the assumption that the magnitudes representing the trend are independent of seasonal and cyclical influences. The statistical procedure involved in the isolation of the trend and the seasonal and cyclical fluctuations of a series serves to exhibit the nature or character of the changes, but in itself tells nothing as to the relationship between the changes themselves or between or among their underlying causes. The search for causal relationships lies beyond the procedure for determining and isolating the characteristic fluctuations of the series. Perhaps by analogy it might be said that the problems are primarily those of anatomy rather than those of physiology, that the relationships established are structural rather than functional in character.

Logically, the problem of the secular trend is the problem of determining a value which most nearly represents the "normal" condition for the data at each unit of time throughout the series. This normal may not be the average trend, as normal means that condition which tends to prevail over a long period of time, while an average condition is the result obtained from combining a number of actual values.

The methods used in determining trend may be described under three heads: (1) free-hand curves, (2) moving averages or medians, and (3) mathematical lines or curves. A few of the more common of the mathematical methods of fitting lines or curves are the method of least squares, applied either to the natural data or to the logarithms or reciprocals of the data, the method of moments, parabolic equations, and hyperbolic equations.
Each of the above methods of measuring trend, as well as others less commonly used, is applicable to certain kinds of data. Experience in curve fitting, which familiarizes one with the shape of each different type of curve, is essential before the best curve for any given time series can be readily decided upon. One must also have a thorough understanding of the data being worked with. Oftentimes a trend may be fitted to data which appears entirely reasonable, but a knowledge of the data will reveal that certain factors were at work during one or more periods which caused the data to be abnormal. Taking these abnormalities into account when fitting the trend may result in entirely different results. The first step in the fitting of a curve to data is to plot the data in order to obtain a general view of the fluctuations from period to period and the general trend throughout the entire series. This examination alone will often tell the type of trend that is present. But not infrequently it will be necessary to verify this conclusion before proceeding further.

* Cf. Frank, Lawrence K., "Long Time Price Trends," Vol. XVIII, Sept., 1923, pp. 904-908, in which the interesting hypothesis is advanced that "the long term or secular trend in economic activities is generated by cyclical fluctuations." And in a footnote to the same article, Mr. Frank observes that "it is an interesting question whether the cyclical fluctuations which generate the secular trends are not themselves generated by seasonal fluctuations." See, also, Vol. XX, Dec., 1925, pp. 543-545, Smith, Bradford B., "Error in Eliminating Secular Trend and Seasonal Variation before Correlating Time Series."
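As an illustration of the first of the mathematical methods named above, the following sketch fits a straight-line trend by least squares, first to the natural data and then to their logarithms. It is written in Python. The price series is hypothetical, and the function name `fit_line` is introduced here only for the example; it is not a method prescribed by the contributors.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical yearly prices, for illustration only.
years = list(range(1900, 1910))
prices = [5.0, 5.4, 5.1, 5.9, 6.2, 6.0, 6.8, 7.1, 6.9, 7.6]

# Trend fitted to the natural data: a constant yearly amount of change.
a, b = fit_line(years, prices)
trend = [a + b * x for x in years]
deviations = [y - t for y, t in zip(prices, trend)]

# Trend fitted to the logarithms: a constant yearly rate of change.
log_a, log_b = fit_line(years, [math.log(p) for p in prices])
annual_rate = math.exp(log_b) - 1.0

print(f"yearly change of the straight-line trend: {b:.3f}")
print(f"yearly rate of change from the logarithmic fit: {annual_rate:.4f}")
print(f"sum of deviations from trend: {sum(deviations):.6f}")
```

The deviations from a least-squares line sum to zero and the line passes through the mean of the series at its middle year, which gives a simple check on any straight line drawn by other means.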
Free-hand fitting

In free-hand fitting care should be taken to see first that the line drawn describes a straight line or a smooth curve, and second, that it leaves equal areas above and below the trend line; that is, that it passes through the center of the points plotted and conforms to the direction of drift of the whole series. If a straight line is fitted in this way, it should pass through the mean of the whole series at its center and the mean of the two end-sections of the series. The use of free-hand curves is objectionable on the ground that the method lacks precision. The results obtained depend on the individual, not on the method. However, as used with certain checks described later such trend-lines have their value also. Moreover, as will be pointed out presently, the "appropriateness" of any curve obtained by the rigid application of mathematical procedure must in the last analysis rest upon the judgment of the investigator. Dr. W. I. King even suggests that "the eye is the ultimate court of appeal as to the correctness of the location of the trend."* The advantages to be gained by the employment of objective rather than free-hand methods are therefore contingent upon a judicious choice of formula. Such a choice must rest upon an understanding of the characteristics of the line or curve employed and the assumptions underlying the method of fitting the line or curve to the data.

Moving averages

A moving average may be used either to eliminate regular cyclical fluctuations or to smooth out random variations. If a cyclical fluctuation is evident, the moving average should be of the length of the cycle.** Such an average will not extend the whole length of the series, but will be short by one item at each end in case of a three-year average, or two at each end in case of a five-year average. Its advantage is in its ease and simplicity of computation and its flexibility.
In case of changes in the rate or the direction of the trend, the moving average will follow the new trend whereas a mathematically fitted line would fail to conform to the change in direction, and a new trend would need to be computed from the point of inflection. This is a somewhat questionable advantage, however, since one never knows whether the change in direction shown by the moving average in such case is merely temporary or has come to stay for a while. A moving average in itself is no evidence on this point. Episodic disturbances upset moving averages severely.

* King, Wilford I., "Principles Underlying the Isolation of Cycles and Trends," Vol. XIX, December, 1924, p. 471.
** For an excellent discussion of the characteristics of the moving average as a device for representing trend, see Mills, Frederick C., "Statistical Methods Applied to Economics and Business," pp. 260-271.

TABLE I. MOVING AVERAGES OF HOG PRICES AT CHICAGO ACCORDING TO FOUR DIFFERENT PERIODS.

Year : Original : 3-year      : 4-year      : 5-year      : 7-year
     : Data     : Moving Avg. : Moving Avg. : Moving Avg. : Moving Avg.
1927 :  10.10   :             :             :             :
1926 :  11.95   :   11.25     :             :             :
1925 :  11.70   :   10.63     :    7.42     :    9.89     :
1924 :   8.25   :    9.13     :    9.48     :    9.68     :    9.55
1923 :   7.45   :    8.25     :    8.70     :    8.96     :   10.09
1922 :   9.05   :    8.28     :    8.98     :    9.59     :   10.91
1921 :   8.35   :   10.42     :   10.96     :   11.28     :   11.74
1920 :  13.85   :   13.30     :   13.30     :   13.29     :   12.73
1919 :  17.70   :   16.35     :   15.21     :   14.52     :   13.04
1918 :  17.50   :   16.80     :   15.54     :   14.78     :   12.75
1917 :  15.20   :   14.12     :   13.68     :   13.41     :   12.73
1916 :   9.65   :   10.62     :   11.18     :   11.51     :   11.92
1915 :   7.00   :    8.28     :    9.14     :    9.65     :   10.47
1914 :   8.20   :    7.80     :    8.00     :    8.12     :    8.92
1913 :   8.20   :    7.98     :    7.70     :    7.52     :    8.02
1912 :   7.55   :    7.47     :    7.74     :    7.90     :    7.71
1911 :   6.65   :    7.70     :    7.74     :    7.75     :    7.53
1910 :   8.90   :    7.67     :    7.42     :    7.26     :    7.22
1909 :   7.45   :    7.37     :    6.37     :    6.96     :    6.94
1908 :   5.75   :    6.42     :    6.02     :    6.88     :    6.61
1907 :   6.05   :    6.02     :    6.17     :    6.15     :    6.40
1906 :   6.25   :    5.85     :    5.76     :    5.69     :    5.99
1905 :   5.25   :    5.55     :    5.67     :    5.74     :    5.91
1904 :   5.15   :    5.47     :    5.75     :    5.92     :    5.94
1903 :   6.00   :    6.03     :    5.92     :    5.85     :    5.79
1902 :   6.95   :    6.28     :    5.99     :    5.81     :    5.48
1901 :   5.90   :    5.97     :    5.74     :    5.59     :    5.28
1900 :   5.05   :    5.00     :    5.10     :    5.16     :    5.06
1899 :   4.05   :    4.32     :    4.43     :    4.50     :    4.69
1898 :   3.85   :    3.85     :    3.95     :    4.00     :
1897 :   3.65   :    3.63     :             :             :
1896 :   3.40   :             :             :             :

Table I, showing moving averages of yearly hog prices at Chicago computed according to four different periods, reveals some of the characteristics of this type of trend measurement as applied to a typical economic series. It is apparent that the cyclical movements in this case are of an irregular length and amplitude, as is typical of most economic series. If measured from trough to trough, the cycle before 1914 varies from 3 to 5 years. If measured from peak to peak, each is four years in length. After 1914, the outstanding episodic movement caused by the European war reaches its peak in 1919 and the trough of the subsequent reaction was reached in 1921.
Had the cycles been of uniform length and amplitude, we would expect a straight line to result from a moving average with a period equal to the length of the cycle. Had the length been uniform and the amplitude uneven, we would expect a series of cycles of smaller amplitude to appear. What we get instead is a moving average in which the cycles are still evident, but less pronounced than in the raw data. The five-year moving average differs from the three-year one only in being somewhat smoother. Since four years seems to be the nearest the usual length of the cycles, a four-year moving average was computed, as shown in the four-year column of Table I, and this was centered by averaging each adjacent pair of averages. The seven-year average, which is practically equal to the length of two cycles, most nearly removes the cyclical influences. Figure 1 shows the effect of the different moving averages.

In the period after 1914, with two dominant movements of an episodic nature and of unequal length, no one of the moving averages computed results in a line of trend which seems particularly significant. For these periods, the irregular fluctuations were simply smoothed out more or less, depending on the length of the period covered by the average in question. When the price turned up sharply after 1914, it will be observed that the moving averages, and particularly the one with the longer period, gave a line each point of which is above the actual data for three years and then below for three years. When the price turned down sharply after 1919, the moving average gave a line which was again above the actual prices. This means that in both of these cases a change appeared in the trend line long before it really happened. The sharp break upward of prices in 1916 began to show in the moving average 3 years earlier in the 7-year moving average, and 2 years earlier in the 5-year moving average.
When the break downward came in 1920, the high prices of the war years remained in the moving average and kept it from reaching bottom until equally too late. All moving-average series of prices should stop with 1915 and resume about 1925. To continue them during the disturbed period of 1915-1925 adds only confusion. An excellent example of this confusion appears in Figure 2, reproduced from Chart B of the report of the Business Men's Commission, in which the use of an 11-year moving average of prices of farm products makes agriculture start downward in 1914, 5 years before it happened. But it does not take an episodic disturbance such as that of the World War to make moving averages confuse the picture. Any time that a series is concave upward, the moving average will run ahead of the data, and it will run behind when the series is convex downward.

It will be obvious that deviations measured from moving averages are not very dependable whenever disturbances occur or convexity or concavity appear in the data. In particular, the exact point in time at which the actual series crosses the trend line will have little significance. A chart such as Figure 3, reproduced from page 344 of the July 1927 Journal of Farm Economics, in which a comparison of deviations from the above-described 11-year moving average with another 11-year moving average is made the basis of conclusions as to the relation between agriculture and business, the essence of which is the time when the deviations occur, must therefore be scrutinized very carefully for falsification produced in connection with concavity, convexity and episodic movements. Using a long period spreads these errors more evenly, but over more years.

[Figure 1. Moving averages of hog prices.]
[Figure 2. Reproduced from Chart B of the report of the Business Men's Commission.]

[Figure 3. Reproduced from page 344 of the July 1927 Journal of Farm Economics.]

Moving median

Since the aim is to fit a line which shows the dominant tendency of the series, the moving average has the obvious disadvantage that it is subject to the effect of widely-deviating items in the series. This suggests* the use of a moving median. If used in a series having [illegible] fluctuations, however, the course of the moving median is likely to be erratic. Moreover, if the seasonal variation of the series has a marked upward or downward tendency, this will give a corresponding bias to the median values obtained. Seasonal variation should be eliminated in such cases before computing the moving median.**

Fitting by equations

It has already been observed that the use of a mathematical equation involves an assumption as to the fundamental tendency of the series. A choice must be made at the outset as to the type of curve appropriate for the given series. This choice should be made in view of what seems to be the nature of the phenomena. This does not mean that the analysis should be carried through with a presupposition as to the conclusions. But it should be recognized that the choice of one out of two or more possible methods of analysis itself assumes the nature of the phenomena to be the type to which the chosen method is most suitable. In studying the secular trend of a series of prices, it is possible to compute a straight-line trend, or to use any of a large number of curvilinear formulae.
Exactly the same body of data is capable of yielding any one of a considerable number of lines of trend, depending upon the formula used.

The formulas are useful in determining the rates of change and in describing the series after the system according to which it is varying has been at least tentatively decided. The computation of several forms of trend will help to discover what that system actually is. But a more rationalistic study of the data will be the starting point. The formulas finally decided on will be used in describing the data in quantitative terms to show what it is and how it fluctuates. The explanation of why it behaves as it does must depend to a large degree on the qualitative considerations added to these quantitative ones.

* loc. cit. p. 267.
** Cf. King, Willford I., loc. cit. p. 474.

Also, as Dr. W. I. King observes: "In most instances there is no way of knowing in advance whether trends are segments of straight lines, of curves representing waves of great length, or of mathematical curves of some other type".* It follows that trends determined for short periods can only be tentative and subject to correction as data for longer periods serve to exhibit the underlying long-time tendency of the series.

A further consideration is that the fundamental tendency of a time series is itself subject to change, so that the mathematical law which properly expresses the fundamental tendency of a series for one period may have little or no application to subsequent periods. There is no secure basis for assuming, for example, a straight-line trend for a time series in the future, simply because such a trend has been characteristic of the past. If such an assumption is made, it must rest on considerations of an economic sort, not on the general validity of the mathematical law used to characterize the "growth" of the series.

Straight-line trends
If the series is changing upward or downward in a straight line or moving horizontally, an approximation of the trend can be obtained by passing a thread or ruler through the series as plotted to a natural scale. But such a trend will of course be only roughly approximate. If any further analysis of the data is to be made, a mathematically determined straight line is needed.

If a straight line is to be determined mathematically, some care should be exercised in the selection of the period to which it is fitted. In the first place, the line should be applied to one homogeneous body of data only. Thus in determining a significant trend of hog prices we could not compute a straight line through the entire period from 1896 to 1927 because it would be distorted by the widely divergent data of the war and post-war period. If it were desired to compute a trend for the post-war period, it should be based only on data after the more temporary and violent influences of the war had subsided.

Another consideration is that the direction of the trend line is more or less dependent on the phases of any cyclical or episodic movements with which the included data begin and end. Thus if the period studied were to begin with the low phase of a cycle and end with the peak of another cycle, the trend line of an ascending series would be more steeply pitched than it should be. The trend line should begin and end in corresponding phases in order to be truly representative of the period included.

* loc. cit. p. 469.

The formula for the straight line is y = a + bx, where (y) is an ordinate or point on the line of trend, (a) is the point of origin of the line, (b) is the amount by which the trend line rises in each period of time, and (x) is the number of intervals between the point of origin and the point of time in question.
The problem then is to find the values of (a) and (b) which will determine the straight line most closely approximating the trend of the data given.

Method of Least Squares

The most usual method of fitting a straight line to a curve is that of least squares.* For certain types of data it is the best measure of trend; but it is probably more often misused than any other method because of its ease of calculation, precision, and simplicity. It should be used only for data with fairly uniform variations and preferably only when analyzing a large number of observations. Should the variations from trend be very irregular in a small number of observations, and one or two unusual variations in the same direction occur near the extreme end of the series, these variations have an extreme influence in determining trend, because the square of the deviation from the trend is so large that it unduly weights that item, and the trend line thus fitted does not represent normal conditions. It is a matter of common knowledge that a shift in the interval for which the calculation is made may give significantly different trends when the lines are fitted to time series by the method of least squares. It may easily be argued, therefore, that the system of weighting involved in the use of the least-squares method is devoid of economic significance when applied to time series.

Dr. Ingraham points out that slight changes in the group of years included produce marked differences in the increments shown by the least-squares trends.** And he suggests that the discrepancies are due, for the most part, to variations in the amplitude of the fluctuations of the original series, but also in part to special characteristics of the least squares formula. In the light of these considerations, Dr. Ingraham suggests that "a line of coordinated means ...
calculated from a three-year moving average"*** avoids the instability of the least squares line, and hence is more satisfactory for determining the secular trend.

* For various styles of equations and their straight-line forms, see Huntington, E. V., "Curve-Fitting by the Method of Least Squares and the Method of Moments", in Handbook of Mathematical Statistics, by H. L. Rietz, Editor-in-chief, p. 63; also any of the newer textbooks in statistics.
** Cf. e.g., Ingraham, Olin, "The Refinement of Time Series", June 1925, Vol. XX, pp. 231-233.
*** loc. cit., p. 233.

Mr. O. Gressens* has pointed out, however, that the proper employment of the least squares method does not admit of an indiscriminate choice of the period to be included. In particular, as Mr. Gressens suggests, attention should be given to the following considerations:

(a) The concept of a secular trend implies that lines fitted for short periods are necessarily tentative and to be regarded only as approximations, subject to correction when subsequent data more clearly record the fundamental tendency of the series.

(b) A line correctly representing the trend of a series for a given period is not necessarily valid for other periods, since the fundamental tendency of the series is subject to change.

(c) The least squares method gives heavier weight to those items close to the terminal portions of the series. Hence an especial bias is introduced in fitting the line unless care is taken to see that each end period is in the same phase of the cycle.

A consideration of the statistical properties of the least squares method leads Professor W. L.
Crum "to suggest devices by which more complete knowledge may be gained of the actual significance of the least squares operation in any particular case."** After calling attention to the fact that the annual increment obtained by the method of least squares is in effect a weighted average of ratios, Professor Crum proceeds to test the precision of the least squares method by studying the form of the frequency distribution of the ratios about their mean. He concludes "that the method is not automatically safeguarded. In a number of the specific instances studied, the process seems to yield results which are satisfactorily precise. In other cases it falls far short. No general rule stands forth: It seems necessary to examine each case with care."*** As a test of the precision of the results to be expected from the use of the least squares method in a specific case, Professor Crum suggests the construction of a frequency diagram of the ratios as a step in the computation of the trend.

An application of the different methods of measuring straight-line trends to actual data may bring out more clearly some of the problems involved in measuring trend. The data used in Fig. 4 are an average of the monthly average price of No. 3 yellow corn at Chicago, from November to May, from Nov. 1899 to May 1927, deflated by the Bureau of Labor Statistics all-commodities index number (1913 = 100). The plotting of the data shows that there is a sharp break in prices between 1919 and 1920, and that the lower price level has continued since 1920, which makes it necessary to divide the data into two series in order to determine the normal price trend during the entire period.

* loc. cit. pp. 552-553. See also, in the same issue, pp. 565-568, the report of a "Meeting on Ways of Using Recorded Data to Estimate Future Trends."
** Crum, W. L., "The Least Squares Criterion of Trend Lines," June, 1925, Vol. XX, p. 213.
*** loc. cit. p. 225.
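The dependence of a least-squares line on the period chosen, which Ingraham and Gressens emphasize, may be illustrated with the yearly hog prices of Table 1. The following is a minimal sketch in Python, not a part of the original study; `fit_line` is a hypothetical helper, and it assumes, as a convenience, that (x) is measured from the middle of the period so that the sum of the (x) items is zero. It fits one line to 1896-1914 and another to 1896-1919, the latter including the war years.

```python
# Yearly hog prices at Chicago, 1896-1919, as printed in Table 1.
YEARS = list(range(1896, 1920))
PRICES = [3.40, 3.65, 3.85, 4.05, 5.05, 5.90, 6.95, 6.00, 5.15, 5.25,
          6.25, 6.05, 5.75, 7.45, 8.90, 6.65, 7.55, 8.20, 8.20,
          7.00, 9.65, 15.20, 17.50, 17.70]


def fit_line(years, values):
    """Least-squares line y = a + b*x, with x counted from the mid-period.

    Because the x items then sum to zero, a is simply the mean of the
    values and b is sum(x*y) / sum(x*x).
    """
    n = len(values)
    origin = (years[0] + years[-1]) / 2
    xs = [year - origin for year in years]
    a = sum(values) / n
    b = sum(x * y for x, y in zip(xs, values)) / sum(x * x for x in xs)
    return origin, a, b


# Pre-war period only (first 19 years), then the period including the war.
origin_short, a_short, b_short = fit_line(YEARS[:19], PRICES[:19])
origin_long, a_long, b_long = fit_line(YEARS, PRICES)

print("1896-1914: origin", origin_short, "a", round(a_short, 2),
      "b", round(b_short, 3))
print("1896-1919: origin", origin_long, "a", round(a_long, 2),
      "b", round(b_long, 3))
```

Adding the five war years to the end of the series raises the annual increment markedly, which is the kind of terminal influence that considerations (a) to (c) above warn against.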
[Figure 4. Deflated prices of No. 3 yellow corn at Chicago, with trend lines fitted by the method of least squares, by the method of averages, and by least squares with 1901, 1916, 1917 and 1918 omitted.]

[Figure 5. Corn production plus carryover, in millions of bushels, with fitted curves.]

First a straight-line trend was fitted to each series by the method of least squares, assuming no knowledge of the causes for the unusual variations which occurred from year to year. The results are represented by the solid lines shown in Fig. 4. It will be noted that the unusual year 1901 is counterbalanced by the low prices of 1899, 1905 and 1906. But the unusually high year 1917 has no low years to offset it, which gives the line too steep a slope. In the short series from 1920 to 1926, the unusually low prices of 1921 and 1922 and the very high price in 1924 make the line of least squares indicate that the normal trend of prices was upward at the rate of three cents a year, which makes the normal price of corn to be expected in 1930, when adjusted to the present price level, 106 cents per bushel.

A knowledge of the facts, however, shows that the price level from 1916 to 1918 was advancing at a very rapid rate, and, as is usual in a period of rapidly advancing prices, the price of raw products such as corn advanced first. Even if deflated by the all-commodities index, corn prices were higher than they actually would have been in a normal period.
It was therefore illogical to compare the deflated corn prices in 1916 to 1918 with earlier periods, because they were not sufficiently deflated.

Regarding the period since 1920, it must be stated that where data fluctuate as widely as in this period, a longer series is needed to fit a trend by the method of least squares. One of the reasons for the wide fluctuation in prices during this period is the great variation in the production of corn. The crops of 1920 and 1921 were the largest on record, while the crop of 1924 was the smallest crop produced since the unusually small crop of 1901.

With this knowledge of the data available, it was certain that the line of least squares when fitted to all observations in either series is not an accurate measure of the normal trend of prices. It therefore seemed logical to omit the years 1916 to 1918 from the series of observations, and since 1901 was a very abnormal year near the end of the series, and would tend to raise the lower end of the curve to abnormally high values, it too was omitted; and a line of least squares was then fitted to the remaining observations. The results of this line are shown by the dotted line in Fig. 4. It is readily apparent that this line of least squares is a better measure of trend for the years 1899-1919 than the first line used. For the period since 1920, the best that could be done was to assume that the normal annual increase in corn prices from 1920 to 1926 was the same as in the period from 1899 to 1919, and use the average of the seven years as determining its general level. The ratio of price to trend was then correlated with the ratio of supply to trend. This gave a very marked correlation, but the downward trend in residuals indicated that the trend line was still too steep for the data.

The next method tried was to average the first nine years, excluding 1901, and the last nine years, excluding 1917, and center the results on the middle year.
This method gave the broken line shown in Fig. 4. Correlating the ratio of price to this new trend with the ratio of supply to it resulted in a materially higher degree of correlation and a more even distribution of residuals. Hence the broken line was considered the best measure of the normal trend of price.

Method of moments

A simpler method of fitting a straight-line trend is the method of moments, illustrated by Table II. The equation for the straight line of best fit for these data is y = 6.01 + .253x. This line may be plotted by taking the point of origin in 1905 as one point and computing the value of any other point to establish the straight line. Values for other years than 1905 may be found by adding the value of (b), which is .253, successively, or subtracting it successively, from the value of the point of origin. The trend values so obtained are shown in column 6.

Table II. Computation of Straight Line Trend by Method of Moments - Hog Prices - 1896-1914.

  1   :    2     :     3      :      4       :     5     :     6
      : Original : Deviation  : Deviation    : Deviation : Ordinates
      : items    : from origin: times orig.  : squared   : of trend
 Year :   (y)    :    (x)     : item (yx)    :   (x²)    :
 1896 :  $3.40   :    -9      :   -30.60     :    81     :  $3.73
 1897 :   3.65   :    -8      :   -29.20     :    64     :   3.99
 1898 :   3.85   :    -7      :   -26.95     :    49     :   4.24
 1899 :   4.05   :    -6      :   -24.30     :    36     :   4.49
 1900 :   5.05   :    -5      :   -25.25     :    25     :   4.74
 1901 :   5.90   :    -4      :   -23.60     :    16     :   5.00
 1902 :   6.95   :    -3      :   -20.85     :     9     :   5.25
 1903 :   6.00   :    -2      :   -12.00     :     4     :   5.50
 1904 :   5.15   :    -1      :    -5.15     :     1     :   5.76
      :          :            : (-197.90)    :           :
 1905 :   5.25   :     0      :     0        :     0     :   6.01
 1906 :   6.25   :     1      :     6.25     :     1     :   6.26
 1907 :   6.05   :     2      :    12.10     :     4     :   6.52
 1908 :   5.75   :     3      :    17.25     :     9     :   6.77
 1909 :   7.45   :     4      :    29.80     :    16     :   7.02
 1910 :   8.90   :     5      :    44.50     :    25     :   7.28
 1911 :   6.65   :     6      :    39.90     :    36     :   7.53
 1912 :   7.55   :     7      :    52.85     :    49     :   7.78
 1913 :   8.20   :     8      :    65.60     :    64     :   8.03
 1914 :   8.20   :     9      :    73.80     :    81     :   8.29
      :          :            :  (342.05)    :           :
 Sum  : Σy = 114.25 :         : Σxy = 144.15 : Σx² = 570 :

 Σxy = b Σx²          Σy = na
 b = Σxy / Σx² = 144.15 / 570 = .253
 a = Σy / n = 114.25 / 19 = 6.01

Curves*

With absence of known influences indicating curvilinear trend, it is usually safer to assume a straight-line projection than a curvilinear one. But at other times the use of mathematical curves is justified.

Curves fitted by mathematics are legitimate only when the constants used in fitting these curves have some relationship to the data. When a gun is fired under uniform conditions, the trend of the bullet can be traced with considerable accuracy by a mathematical formula, because the effect of a certain slope of the gun and the pull of gravity are constant and the results of these factors can be measured. But in the ever-changing economic world it is impossible to determine constant factors which can be used in a mathematical formula to measure what the trend of data has been and what it will be in the future. To be sure, constants can be obtained by applying a series of observations to a mathematical formula, but these constants are not what caused the trend of the data to be what it was. In fitting a mathematical curve to economic data, one is not trying to determine the laws underlying the observed quantities, but rather to obtain a formula by which the trend of the data may be approximated.

While mathematical curves are somewhat elastic according to the size of the constants in the formula, they have a definite general shape for each type of formula, which cannot be overcome by the data used, with the result that the curve often departs at some point or points throughout the data from the most probable normal trend of the data.**

Series of economic data are often found to conform to the exponential or compound interest type of curve. That is, the curve when plotted on a natural scale becomes steeper but at a constant rate (or falls with a slope that becomes constantly less and less). This type may be identified by plotting the series on semi-logarithmic paper.
If the series is of this form, it will assume a straight line on the semi-log scale. Plotting the logarithms of the original data to a natural scale will give the same result. The formula for the straight line now becomes log y = a + bx, and the curve may be computed by the usual straight-line method merely by using the logs of the (y) ordinates instead of the (y) values in their natural form. This is illustrated in Table III, in which an exponential curve is fitted to the prices of land in Story County, Iowa, for the period from 1904 to 1914. The sum of the logs of (y) is found to be 22.15220. The number of years is 11. Therefore the log of the ordinate of trend at the point of origin is 2.01384, or in actual numbers 103.38. The sum of the (x) items multiplied by the logs of the (y)'s is 3.86677, and the sum of (x²) is 110. Therefore the log of (b) is .03515. The formula thus becomes log y = 2.01384 + .03515x. Reducing it to natural numbers, y = (103.38)(1.0841)^x.

* Contributed mostly by Purves and Hefkins.
** See Mills, pp. 280-306, for discussion of methods of fitting the various mathematical curves. Also particularly see pp. 300-302 for an excellent statement of tests for the use of the different curves.

It is obvious that a straight line would not fit these data, and that a distinctly curvilinear influence is at work in them. The compound interest or exponential trend gives a reasonably good fit.

Table III. Log Curve Fitted to Price of Land in Story County, Iowa, 1904-1914.

                                                       Computed
                                                       ordinates
 Year      Y       Log Y      X     X.Log Y      X²    of trend
 1904   $72.27    1.85896    -5    -9.29480      25     $68.88
 1905    74.18    1.87029    -4    -7.48116      16      74.69
 1906    81.75    1.91249    -3    -5.73747       9      80.98
 1907    86.10    1.93500    -2    -3.87000       4      87.81
 1908    90.95    1.95880    -1    -1.95880       1      95.21
                                  -28.34223
 1909   104.15    2.01766     0     0             0     103.38
 1910   109.13    2.03800    +1     2.03800       1     111.94
 1911   122.55    2.08831    +2     4.17662       4     121.38
 1912   128.13    2.10766    +3     6.32298       9     131.61
 1913   142.44    2.15376    +4     8.61500      16     142.70
 1914   162.64    2.21128    +5    11.05640      25     154.74
                                  +32.20900
 Sum             22.15220           3.86677     110

 Log y = a + bx
 a = Σ(Log y) / n = 22.15220 / 11 = 2.01384 (in logs)
 b = Σ(x.Log y) / Σx² = 3.86677 / 110 = .03515
 Log y = 2.01384 + .03515x

One would naturally be cautious about projecting a trend line rising in a curvilinear manner very far into the future. But experience has justified it within limits repeatedly. A genuine exponential trend may be present as a result of some influence causing a growth at a constant rate. If this is the case, the series will continue to follow the compound interest or log curve until something happens to change the rate of growth.

Other types of mathematical curves, and the difficulties of fitting them, are shown by the following illustrations, based on the unrevised final estimate of corn production from 1897-1927, plus the carryover of old corn on farms November 1. (See Figure 5.) An examination of the data after plotting shows that corn production increased quite rapidly during the first part of the period, continued at a decreasing rate until 1921, and then decreased materially. Acreage, however, has not increased since 1912 and has remained fairly constant since that time, with a slight decline from 1924 to 1927. As this decline has been largely due to increasing cotton acreage, decreasing cattle and hog production, low prices of corn, and unfavorable weather conditions at planting time, it is likely to be more or less temporary and therefore does not justify a downward turn in the long-time trend of production.
The unusually large supplies from 1920-1923 were due to high yields and the larger carryover of old corn in those years, and were also temporary. These circumstances suggested that a curve increasing at a decreasing rate until 1912 and then remaining at about a constant level would be the best measure of the normal trend of corn production from 1897 to 1927.

First, two curves of the hyperbolic type were fitted. The solid line in Figure 5 is the result of applying the formula

 Y = ax^b

to the data, and the broken line the result of applying the equation

 Y = 1 / (a + bx).

The first curve advances too rapidly for the first few years and continues too strongly in the later years. It nowhere fits the data. The second curve is a much better measure of trend during the last twenty years, but has too steep a slope during the first five years and is too high during the second five years. When these curves are extended into the future, the difference between them continues to increase, producing wider and wider differences in the conclusions which would be drawn from them.

The parabolic type of curve, Y = a + bx - cx², is shown by the dotted line in Figure 5. This type of curve tends to be U-shaped. After the value of X increases to the point where the ratio of b to c equals the ratio of X to X², a change in direction occurs which, in data having a normal trend similar to these, makes the curve leave the data. Increasing the constants or raising them to higher powers would probably give a curve that would still more accurately measure the normal trend of corn supplies. It should be clear by this time, however, that the constants used in the curves are not measures of factors which determine the trend of the data, but rather the results of fitting several observations to the formula.
A generally satisfactory test of the fit of the two curves may be obtained by comparing the standard deviations of the differences between the points on the curves and the original data. The closeness of the line of trend to the points which were to be fitted cannot, however, be taken as the sole consideration. Conceivably it may be possible to compute a curve which will pass directly through each of the points concerned. Yet such a curve would be of little or no use. What is desired is a smooth curve which will sweep through the points plotted and will indicate as simply as possible the general trend of the series. The greater the simplicity of the formula and the easier the computations involved the better. The curve should be explainable in terms of the forces which seem to be causing the fluctuations in the series concerned. It may well be more or less a coincidence that the normal trend of the data and a mathematical curve follow the same path, not due to established relationships, but to accident. It does not follow that this coincidence is going to continue.

When a second-degree parabola was fitted to the data in Table III on land values in Story County, Iowa, it gave a decidedly better fit than the exponential curve. The standard deviation of the differences from the parabolic curve was $2.47; from the exponential curve $3.27. It is quite possible that a third-degree parabola or some other form of curve would have fitted the data in question more closely than the second-degree parabola. But, especially if any attempt is to be made at extrapolation or the projection of the trend into the future, the smaller the number of constants involved in the formula the better. The more complex curves are more likely to fit the data within the period covered, but generally are harder to explain and less dependable in projection.
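The comparison of the two curves for the Story County land values may be sketched as follows. This is an illustration in Python under stated assumptions, not the original computation: the function names are hypothetical, the exponential is fitted by the straight-line method applied to the logs of (y) as in Table III, the parabola is fitted by least squares with (x) measured from 1909, and the "standard deviation of the differences" is taken here as the root mean square of the differences. The figures it yields need not agree exactly with the $2.47 and $3.27 reported, since the rounding and the divisor used in the original are not stated.

```python
import math

# Price of land in Story County, Iowa, 1904-1914, as printed in Table III.
YEARS = list(range(1904, 1915))
PRICES = [72.27, 74.18, 81.75, 86.10, 90.95, 104.15,
          109.13, 122.55, 128.13, 142.44, 162.64]
XS = [year - 1909 for year in YEARS]  # deviations from the origin, 1909


def fit_exponential(xs, ys):
    """Fit log10(y) = a + b*x; valid as written because sum(xs) == 0."""
    logs = [math.log10(y) for y in ys]
    a = sum(logs) / len(logs)
    b = sum(x * lg for x, lg in zip(xs, logs)) / sum(x * x for x in xs)
    return a, b


def fit_parabola(xs, ys):
    """Fit y = a + b*x + c*x**2 by least squares, xs symmetric about zero."""
    n = len(ys)
    sx2 = sum(x ** 2 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    b = sxy / sx2
    c = (n * sx2y - sx2 * sy) / (n * sx4 - sx2 ** 2)
    a = (sy - c * sx2) / n
    return a, b, c


def root_mean_square(differences):
    return math.sqrt(sum(d * d for d in differences) / len(differences))


A_LOG, B_LOG = fit_exponential(XS, PRICES)
A, B, C = fit_parabola(XS, PRICES)

RMS_EXP = root_mean_square(
    [y - 10 ** (A_LOG + B_LOG * x) for x, y in zip(XS, PRICES)])
RMS_PAR = root_mean_square(
    [y - (A + B * x + C * x * x) for x, y in zip(XS, PRICES)])

print("log y =", round(A_LOG, 5), "+", round(B_LOG, 5), "x")
print("differences, exponential:", round(RMS_EXP, 2))
print("differences, parabola:   ", round(RMS_PAR, 2))
```

The parabola carries one more constant than the exponential, so a closer fit within the period is to be expected and does not by itself settle which curve is the better line of trend.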
Indeed, it would seem not unreasonable in the case at hand that the exponential curve might really be the preferable line of trend, because more in keeping with the facts which formed the underlying reason for the movement in land values in the region.

The fact that the trend in this case may not unlikely have been distorted more or less by temporary conditions suggests that a line of trend should be interpreted with caution. As new and more recent data become available, the trend lines should be reconsidered from time to time. Frequently it will happen that a trend line will include more than one homogeneous period. After a change in direction of trend has occurred, it is likely to be some time before the student can be sure either as to the fact of a permanent change in direction or as to the amount of the change. Thus, as explained in discussing the straight line, it may be necessary to divide a period of time into shorter ones and to fit to each one a different trend, and probably to use different formulae in the process.

Locating trends by correlation analysis

In their search for more rational methods of fitting trends, economists have turned to the device of including a variable to measure secular trend in multiple correlation analysis. Secular trend is usually taken as a function of a series which increases one unit increment in magnitude with the passage of each unit of time in the series. The residuals of the analysis are plotted around the regression line of trend determined in the analysis, and a curve fitted to the residuals. Where a large percentage of the variation can be accounted for in an analysis of this sort, it is assumed that the effects of the other variables are held constant by the method of multiple correlation, leaving only the relationship between the variable and time.
The deviations from a curve fitted to the regression equation measuring the relationship between time and the variable being studied are due to other factors than those included in the study.*

Combination methods**

The objections to least squares can often be remedied by taking a three or five-year moving average of the data and applying the method of least squares to the results. In this way the wide variations are eliminated and the effect of any extreme observation is spread through three or five years, which tends to offset any undue influence it would have. Another advantage of applying the moving average to data of this kind is that its flexibility will reveal any tendency the data may have toward a curvilinear trend, which might not be noticeable in studying a graph of data with wide variations. Of course all of the limitations of moving averages above described are pertinent here.

Another common combination of methods is fitting a curve to a moving average by free-hand method. In data having wide variations, one or two extreme observations may throw the moving average out of line. Where sharp breaks or changes in direction occur, the moving average is also thrown out of line; hence the necessity for smoothing these departures from normal. Irregular length of periods causes similar difficulties. With these limitations of a moving average in mind, and a thorough knowledge of the data being worked with, it is often possible to fit a smooth curve to the moving average more accurately and with more consistency than by mathematical curves. A forecast made from such a curve will be on a safer basis than if based upon a mathematical curve alone.

Free-hand curves can be checked for accuracy in the same manner as a straight line is fitted by the method of least squares.
The sum of the squares of the deviations cannot be made equal to the smallest amount that is possible for any curve, because a curve could be made flexible enough to pass through every point, making the sum of the squares equal to zero. However, with the general shape of curve best adapted to the data in mind and the degree of flexibility desired, it is possible to fit a curve of known shape and flexibility so that the sum of the squares of the deviations will be the smallest possible for that type of curve.

* Illustrations of this method of measuring trend are to be found in U.S.D.A. Bulletin 1440, Factors Affecting the Price of Hogs, G. C. Haas and M. J. B. Ezekiel, and U.S.D.A. Technical Bulletin 50, Factors Affecting the Price of Cotton, B. B. Smith, pp. 22-23 and 51-53.
** Contributed by Purves.

Seasonal variation*

The problem of seasonal variation, like that of secular trend, is to devise a method of measuring the normal seasonal and then to "eliminate" the effect of this type of fluctuation from the series. Certain methods are limited to measuring the average seasonal influence, while others are designed to obtain indexes which will reflect changes in the seasonal factor over a period of time. Examples of both methods of measuring the seasonal factor are included in the summaries given below.

The method of monthly means. This method of obtaining an index of seasonal variation, although long in use, has received renewed attention.** The method has the advantages of brevity of computation and simplicity of procedure. The procedure may be summarized as follows: Obtain a total for all the January items in the series and corresponding totals for each of the remaining eleven months. Sum the totals obtained and divide by 12 to obtain the average monthly total. Compute relatives for each month by dividing each monthly total by the average monthly total.
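The computation of the monthly relatives, and their use as divisors, may be sketched as follows. The sketch is in Python and is illustrative only: the function names are hypothetical, and the data are invented for the purpose, being two years of a series held at a constant level of 50 and multiplied by an assumed seasonal pattern. With such data the relatives should recover the assumed pattern, and the adjusted items should return to the constant level.

```python
def monthly_relatives(items):
    """Method of monthly means.

    items is a list of (month, value) pairs, months numbered 1 to 12.
    Each monthly total is divided by the average monthly total.
    """
    totals = {month: 0.0 for month in range(1, 13)}
    for month, value in items:
        totals[month] += value
    average_total = sum(totals.values()) / 12
    return {month: total / average_total for month, total in totals.items()}


def remove_seasonal(items, relatives):
    """Divide each item by the relative for its month."""
    return [(month, value / relatives[month]) for month, value in items]


# Hypothetical data: an assumed seasonal pattern averaging 1.0, applied
# to a constant level for two successive years (no trend, no cycle).
FACTORS = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 0.8, 1.0, 1.1, 1.2, 1.0]
LEVEL = 50.0
ITEMS = [(m + 1, LEVEL * FACTORS[m]) for _ in range(2) for m in range(12)]

RELATIVES = monthly_relatives(ITEMS)
ADJUSTED = remove_seasonal(ITEMS, RELATIVES)

for month in range(1, 13):
    print(month, round(RELATIVES[month], 3))
```

The invented series has neither trend nor cycle; with actual data both would enter the monthly totals and so distort the relatives.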
To eliminate the seasonal factor, divide each item in the original series by the corresponding monthly relative.

This method of course takes no account of changes in the seasonal factor. Moreover, as a method of representing the average seasonal, it has three important limitations. (1) The use of the arithmetic average allows an exceptionally high or low monthly item to exert an undue influence on the determination of the typical value for that month. Influences producing irregular fluctuations will therefore be reflected in the seasonal. (2) If the cyclical movement is large in comparison with the seasonal movement, the influence of the former will tend to dominate and distort the seasonal index. This tendency will be more pronounced in short series, although there is no assurance that the cyclical will average out over a longer period. The seasonal index obtained is therefore subject to the influence of cyclical fluctuations in the series. (3) If the series has a pronounced trend upward or downward, this also will affect the seasonal index. An upward trend of the series would make the January relative too low and the December relative too high, whereas a downward trend would have the opposite effect.

* Contributed by Clayton.
** Cf., e.g., Hart, William L., "The Method of Monthly Means for Determination of a Seasonal Variation", Sept., 1922, Vol. XVIII, pp. 341-349.

The considerations indicated in (1) and (2) may be regarded as placing definite limitations on the use of this method. These considerations suggest that the method is inapplicable to series in which the monthly items are widely dispersed about the monthly mean and also to series having pronounced cyclical fluctuations. As to the considerations in (3), a further step may be added to the four previously given to approximate the influence of secular trend.
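The four steps of the method of monthly means, and the effect of trend noted in limitation (3), can be illustrated with a short Python sketch. The seasonal pattern and both series are invented for the illustration, and the relatives are expressed as ratios (1.0 standing for the average month) rather than percentages.

```python
def monthly_means_index(data):
    """Seasonal relatives by the method of monthly means.

    data: list of years, each a list of twelve monthly items (January first).
    """
    monthly_totals = [sum(year[m] for year in data) for m in range(12)]
    average_total = sum(monthly_totals) / 12
    return [total / average_total for total in monthly_totals]


def eliminate_seasonal(data, relatives):
    """Divide each item by the relative for its month."""
    return [[item / relatives[m] for m, item in enumerate(year)] for year in data]


# Hypothetical seasonal pattern averaging 1.0 over the year.
PATTERN = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 0.8, 1.0, 1.1, 1.2, 1.0]

# Three years with no trend, and three years rising by one unit a month.
flat = [[100.0 * p for p in PATTERN] for _ in range(3)]
rising = [[(100.0 + 12 * y + m) * PATTERN[m] for m in range(12)] for y in range(3)]

flat_index = monthly_means_index(flat)
rising_index = monthly_means_index(rising)
adjusted = eliminate_seasonal(flat, flat_index)
```

With the trendless series the relatives reproduce the assumed pattern and the adjusted items are all 100. With the rising series the January relative falls below 0.8 and the December relative rises above 1.0, which is the distortion described in (3).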
A method* of doing this is as follows: Assuming the period covered by the series is 1910-1919, subtract the item for January, 1910 from the item for January, 1920 and divide the remainder obtained by twelve. Add (or subtract) one-twelfth to the February total, two-twelfths to the March total and so on for the remaining months.

* Cf. King, W. I., "An Improved Method for Measuring the Seasonal Factor", Sept., 1924, Vol. XIX, p. 301. For the use of the least-squares trend for the same purpose, see Davies, George R., Economic Statistics, p. 119.

Median-link-relative method (Persons). This method for determining the average seasonal factor was devised by Professor Warren M. Persons and** is widely used. The essential steps in the computation of an index of seasonal variation by this method are as follows:

(1) Calculate month-to-month link relatives for the entire series.

(2) Arrange the link relatives obtained in a multiple frequency table. The closeness of the clustering of the items in each column gives an indication of the precision with which the seasonal can be measured by the use of an average. The displacement of groups between adjacent columns indicates the direction or rate of the seasonal change.

(3) Determine the median link relative for each month. For short series especially, the three middle links should be averaged to obtain the median value.

(4) Express the twelve median link relatives as percentages of a constant base by computing chain relatives with the January link as a base.

(5) Correct the chain relatives for trend. If no trend is present, then the product of the January link by the December chain relative should be 100. When trend is present, the chaining of the links has the effect of introducing a cumulative error, compounded over the twelve months at a rate equal to the rate of increase of the secular trend. The rate may be determined by employing the compound interest formula,*** specifically, as follows:
$$\text{January link} \times \text{December chain} = 100\,(1 + r)^{12}$$

The chain relatives may then be corrected for trend by dividing the February chain relative by $(1 + r)$, March by $(1 + r)^2$, and so on.

(6) The corrected relatives give an index of seasonal variation on January as a base. To convert this to an index showing the monthly seasonal as a percentage of the average for the year, divide each monthly relative in (5) by their average.

** Articles by Professor Persons describing the method will be found in the Review of Economic Statistics, January, 1919, pp. 18-31, Journal of American Statistical Association, June, 1923, Vol. XVIII, pp. 713-725, and in the Handbook of Mathematical Statistics, edited by H. L. Rietz, pp. 150-165.
*** Cf. Mills, F. C., Statistical Methods, p. 320.

The application of this method involves extensive calculations, and its use, in lieu of less laborious procedure, should be justified by compensating advantages with respect to the validity of the results obtained. It has certain advantages over the method of monthly means. (1) The median, since it is unaffected by large or small chance variations, is less subject than the mean to the influence of irregular fluctuations. Except for series having special characteristics,* the median value is a more typical average than the arithmetic mean. (2) As to the influence of the cyclical factor, it is suggested that "there is an indirect tendency to minimize the influence of cycles because of the comparison, in the link-relative operation, of each month with that immediately preceding. This fact tends in general to free the link relative from cyclical disturbance."** (3) The correction introduced for secular trend has been indicated above.

Ratio-to-ordinate method (Faulkner). This method for computing an index of seasonal variation was devised by Dr. Helen D. Faulkner.*** The essential steps in the ratio-to-ordinate method are as follows:

1. Fit an appropriate line to represent the secular trend of the original data.

2. Compute the ratios of the original values to the ordinates of the line of trend.

3. Form a multiple frequency table of the ratios.

4. Determine a typical value for each month by taking the arithmetic average of a group of (three to seven) ratios which includes the median and an equal number of ratios above and below the median.

5. Adjust the twelve typical ratios obtained in 4 by dividing through by their annual average to obtain a final seasonal index having an annual average of 100 per cent.

* Cf. Hart, William L., loc. cit., pp. 343-345; Persons, Warren M., loc. cit., p. 718; Crum, W. L., "The Use of the Median in Determining Seasonal Variation", March, 1922, Vol. XVIII, pp. 607-614.
** Faulkner, Helen D., loc. cit., p. 169.
*** loc. cit., pp. 167-179. A similar method, in which the residuals "obtained by subtracting the ordinates of secular trend from the original data (are) expressed as relatives of the secular trend and then used in the calculation of seasonal variation", devised by Dr. Lincoln W. Hall, appears on pp. 156-166 of the same issue.

Dr. Faulkner points out (loc. cit., p. 174) that this method "makes adequate allowance for secular trend, for the trend is actually eliminated from the original data before the computation of the seasonal variation. Moreover, it provides against the influence of the irregular deviations by using a group of middle items, in any one month, from which the irregular extremes are omitted. The provision against the effects of cyclical fluctuation consists chiefly in the insistence upon a series covering a long time interval, but the proper selection of the grouping used in getting the crude indexes tends to minimize the disturbing effects of non-uniformity in the cycles".
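The five steps of the ratio-to-ordinate method can be sketched in Python as below. The sketch assumes a straight-line trend fitted by least squares and a middle group of three ratios; the series is hypothetical, and grouping the ratios by calendar month stands in for the multiple frequency table of step 3.

```python
from statistics import mean


def fit_trend_line(values):
    """Least-squares straight line through values at t = 0, 1, 2, ..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = mean(values)
    stt = sum((t - mean_t) ** 2 for t in range(n))
    stv = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    slope = stv / stt
    return slope, mean_v - slope * mean_t


def middle_group_mean(ratios, group_size):
    """Average of the group of ratios centered on the median (step 4)."""
    ordered = sorted(ratios)
    spare = len(ordered) - group_size
    if spare < 0 or spare % 2:
        raise ValueError("group must fit symmetrically about the median")
    start = spare // 2
    return mean(ordered[start:start + group_size])


def ratio_to_ordinate_index(monthly_values, group_size=3):
    """monthly_values: flat list starting in January, whole years only."""
    if len(monthly_values) % 12:
        raise ValueError("series must cover whole years")
    slope, intercept = fit_trend_line(monthly_values)                # step 1
    ratios = [v / (intercept + slope * t)                            # step 2
              for t, v in enumerate(monthly_values)]
    by_month = [ratios[m::12] for m in range(12)]                    # step 3
    typical = [middle_group_mean(r, group_size) for r in by_month]   # step 4
    annual_average = mean(typical)
    return [100 * t / annual_average for t in typical]               # step 5


# Hypothetical five-year monthly series: rising trend times a seasonal pattern.
PATTERN = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 0.8, 1.0, 1.1, 1.2, 1.0]
series = [(100 + 2 * t) * PATTERN[t % 12] for t in range(60)]
index = ratio_to_ordinate_index(series)
```

Because the trend is removed before the monthly ratios are averaged, the resulting index stays close to the assumed pattern (about 80 for January) even though the series rises throughout.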
The method has the further merit of ease of computation.

The three methods just discussed aim at obtaining an average seasonal index applicable alike to all years covered by the series. The fact that the seasonal factor is itself subject to change has led to the development of methods designed to take account of the changing seasonal. Three methods of this type are presented below.

The moving median method (King). In proposing this method, Dr. King's aim was "to derive an index which will show the trend of the changes accruing in the seasonal factor from year to year. . . . It gets away completely from the crude assumption that seasonal fluctuations are unchanging from year to year."*

* King, Willford I., loc. cit., pp. 304, 305.

The essential steps in this method are as follows:

(1) The original data are plotted.

(2) A preliminary cycle-curve is fitted to the plotted points by freehand methods. To assist in locating this preliminary cycle-curve, an approximate seasonal index obtained by the method of monthly means is used.

(3) Figures for each month are read off from the preliminary cycle-curve to obtain a tentative estimate of the cycle.

(4) The original monthly figures in (1) are then divided by the corresponding monthly figures in (3) to obtain an approximate seasonal index for the various months throughout the period.

(5) To smooth out the irregularities, a nine-period moving median is employed. The median January ratio for the first nine years is written as the seasonal figure for the fifth year and so on. The process is repeated for the remaining eleven months.

(6) Monthly averages of the values in (5) are obtained for each year. Each monthly value in (5) is expressed as a percentage of the monthly average for the year. The sum of these percentages should be twelve. The curve of these percentages is smoothed to obtain the required adjustment. The adjusted percentages provide a separate normal seasonal index for each year during the period.
(7) The items for each month may be averaged to obtain an average seasonal index, but this step is not of course contemplated as an object of the method.

(8) To obtain the final estimate of the cycle, each monthly item in (1) is divided by the seasonal index (6) for the corresponding month of the same year.

Obviously, the results obtained by this method are conditioned by the judgment and the technical proficiency of the individual in employing the freehand methods of fitting and smoothing curves which the method requires. It is not probable that two competent persons applying this method to identical data would obtain precisely the same results. It does not follow, however, that the precision of results obtained by strictly mechanical methods would necessarily yield a more, or even equally, accurate measure of the seasonal factor. A method which measures the seasonal factor with approximate accuracy is certainly to be preferred to methods which yield mechanically precise results, but which fail to represent accurately the seasonal fluctuation.

A serious drawback of this method is that it necessarily fails to yield index numbers for the last few years of the period. The index is never "up-to-date". The method, therefore, leans heaviest on approximations where the consequences of error have greatest practical significance, namely, for the period touching the immediate present. As a method, it helps least when help is most desired. It may be said, however, that any method which accurately measures the seasonal for the recent past will serve very well for current years, since the seasonal factor is relatively constant for short periods.

The method of adjusted monthly ratios (Gressens). This method, like the method proposed by Dr. King, is "based upon the consideration that yearly fluctuations take place in the seasonal".* Mr.
Gressens believes, however, that the greater reliance of the present method on a definite mechanical procedure makes it superior in this respect to Dr. King's method.

* Gressens, O., "On the Measurement of Seasonal Variation", June, 1925, Vol. XX, p. 207.

The following steps are used in the computation:

(1) Chart the original data. If the data exhibit no pronounced trend, express each monthly item as a percentage of the average monthly value for the particular year. If, however, the data exhibit a pronounced trend, these ratios do not sufficiently allow for the monthly trend, and must be corrected. An alternative procedure is therefore recommended when an inspection of the chart of the original items reveals a pronounced trend in the data. This consists in expressing each monthly item of the original series as a percentage of the corresponding monthly ordinate of the secular trend of the entire series. The ratios obtained are free of the annual monthly increment due to trend, but are still subject to the influence of irregular and cyclical fluctuations.

(2) The monthly ratios are therefore smoothed by using a moving average (mean or median) of an appropriate period. In the illustrative example a five-period moving median is employed.

(3) Any irregular items are then moderated with respect to adjacent items for the particular month.

(4) The smoothed ratios are extended forward and backward to obtain terminal values for the series and are adjusted so that the sum of the ratios for each year is equal to 1200.

Other methods. The methods summarized will sufficiently illustrate the procedure employed in handling the problem of seasonal variation. For the most part, these methods are directed to the problem of characterizing the seasonal movement of a series with a view to the ultimate isolation of the cyclical fluctuation.
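The moving-median smoothing common to the two changing-seasonal methods summarized above (a nine-period median in Dr. King's method, a five-period median in Mr. Gressens's example), and the adjustment of a year's ratios to a total of 1200, might be sketched as follows. The January ratios are hypothetical and the function names are illustrative only.

```python
from statistics import median


def moving_median(values, period):
    """Moving median of odd period; no value is produced for the end years."""
    if period % 2 == 0:
        raise ValueError("period must be odd")
    half = period // 2
    return [
        median(values[i - half:i + half + 1])
        for i in range(half, len(values) - half)
    ]


def adjust_to_1200(year_ratios):
    """Scale twelve percentage ratios so that they sum to 1200."""
    factor = 1200 / sum(year_ratios)
    return [r * factor for r in year_ratios]


# Hypothetical January ratios for twelve successive years; the fourth is erratic.
january_ratios = [0.90, 0.91, 0.92, 1.40, 0.93, 0.94,
                  0.95, 0.96, 0.97, 0.98, 0.99, 1.00]

nine_period = moving_median(january_ratios, 9)  # values for years 5 through 8
five_period = moving_median(january_ratios, 5)  # values for years 3 through 10
```

The erratic fourth-year ratio leaves no mark on either smoothed series. The nine-period median, however, yields figures for only four of the twelve years, which illustrates why such an index cannot be carried up to the most recent years without extension.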
A method directed more specifically to the study of the changes in the seasonal factor over a period of time, with a view to the development of seasonal standards, has been devised by Professor Crum.* When applied to series in which Professor Crum's modified monthly mean device may appropriately be employed, the method admits of a direct and simple application to isolate the cycles (loc. cit., p. 60). The process does not require separate calculation and elimination of the trend and seasonal.** But for series requiring the use of the modified link relative method, the elimination of the seasonal factor from the series would prove rather laborious, even though the "centering" process proposed by Professor Crum for the determination of the seasonal for a given date is employed.

* Crum, W. L., "Progressive Variation in Seasonality", March, 1925, Vol. XX, pp. 48-64.
** Cf. Brumbaugh, Martin Allen, Direct Method of Determining Cyclical Fluctuations of Economic Data. Dr. Brumbaugh proposes a method, based on certain special assumptions, for isolating the cycle in which, also, no attempt is "made to segregate and measure either the trend or the seasonal variation" (loc. cit., p. 15).

Cyclical movements.* There remains to consider the problem of separation of cyclical, episodic and chance fluctuations. As a matter of practical observation, it is generally impossible to separate these from each other. All that is usually done is to find the residuals after secular and seasonal movements are removed and then to examine these to discover evidences of cycles.

This process is illustrated by Table IV. Here the residuals are expressed as percentage deviations from the corrected secular trends. Vivid swings back and forth across the trend line appear at irregular intervals.
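The arithmetic behind such residuals is simple enough to sketch in Python. The two rows used below are the April and July 1904 entries of the corn-price table in this section (price and secular trend corrected for season); the function name is an illustrative assumption.

```python
def deviation_from_corrected_trend(price, corrected_trend):
    """Return (ratio, deviation): price over trend corrected for season, and ratio - 1."""
    ratio = price / corrected_trend
    return ratio, ratio - 1.0


# (label, price, secular trend corrected for season)
rows = [
    ("Apr. 1904", 0.527, 0.472),
    ("July 1904", 0.486, 0.512),
]

results = {
    label: tuple(round(x, 2) for x in deviation_from_corrected_trend(price, corrected))
    for label, price, corrected in rows
}
```

Rounded to two places, the April item stands 12 per cent above the corrected trend and the July item 5 per cent below it, agreeing with the last two columns of the table for those months.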
An examination of the data on the size of the corn crop in the United States shows that the outstanding reason for these variations is in the fluctuations in the production of corn from dry or wet years. A second influence is found in the number of hogs in the country, which determines to a large degree the demand for corn. Thus in 1905 and 1906, there were large corn crops and the results are seen in the variations of the price below the trends in the periods just following. In 1908 there was a moderate sized corn crop, but a large hog crop to consume it and consequently high prices. In 1909, 1910 and 1912 there were large corn crops and consequently declining prices. In 1911 there was a small corn crop and a large hog crop, and consequently a high price.

It might be said that the variations just pointed out are entirely attributable to episodic movements. Or it might be said that the irregular wave-like motion which runs through the series is cyclical fluctuation, while the more abrupt and short-lived variations are episodic in type. Assuming that the latter is correct, still it will be impossible to separate the two types of variation or to measure either one without making some very arbitrary assumptions.

Making these assumptions, the short-time fluctuations of two or three months in length may be smoothed out and the cyclical and episodic deviations made to stand out more clearly by means of a moving average. Usually a short-time average of 3, 5 or 7 months is preferable as not smoothing out too many of the shorter episodes. The resulting wave-like curve shows the main movements of the prices which may be studied, each separately, in the light of the record of the crop yield and the history of related enterprises. Since these fluctuations seem mostly attributable to the favorable or unfavorable seasons, it may be possible in this case to arrive at some general statement of the relationship between the season and the fluctuations in corn prices.
But in episodic movements of most other types, this is not likely to hold true and each episode may be caused by different influences and therefore call for a separate study.

* Contributed mostly by Hopkins.

Table VIII. Cyclical Variations in the Price of Corn, 1904-1914.

                   1          2           3            4             5
                            Secular   Sec. Trend    Price ÷     % Deviation
                 Price       Trend     Cor. for    Cor. Sec.    from Trend
                                        Season       Trend
1904  Jan.       .451        .43?       .438         1.03          +.05
      Feb.       .501        .499       .455         1.10          +.10
      Mar.       .52?        .430       .480         1.10          +.10
      Apr.       .527        .492       .472         1.12          +.12
      May        .486        .494       .494          .98          -.02
      June       .488        .495       .490         1.00           .00
      July       .486        .497       .512          .95          -.05
      Aug.       .535        .498       .538          .99          -.01
      Sept.      .528        .500       .550          .95          -.04
      Oct.       .536        .502       .522         1.03          +.03
      Nov.       .541        .503       .518         1.04          +.04
      Dec.       .452        .505       .485          .95          -.05
1905  Jan.       .426        .507       .456          .93          -.07
      Feb.       .441        .508       .472          .93          -.07
      Mar.       .470        .510       .500          .94          -.04
      Apr.       .478        .512       .492          .97          -.03
      May        .562        .513       .513         1.10          +.10
      June       .542        .515       .510         1.06          +.06
      July       .534        .516       .531         1.06          +.06
      Aug.       .?50        .518       .559          .98          -.02
      Sept.      .528        .520       .572          .92          -.08
      Oct.       .522        .521       .542          .95          -.04
      Nov.       .435        .523       .539          .90          -.10
      Dec.       .451        .585       .504          .92          -.08
1906  Jan.       .430        .528       .473          .89          -.11

Note: The following years are similar to the two shown above. Since the purpose of this table is illustrative, only the first two years are shown.

Periodicity.* Recent students of periodicity in economic series seem to think that true periodicity, with the possible exception of seasonal variations, is not to be found in economic data.**

* Contributed mostly by Killough and Wertz.
** Day, E. E., Statistical Analysis (1925), Chap. XIX, p. 306; Crum, W. L., and Patton, A. C., Economic Statistics, p. 384.

Professor Crum writes "In economic science, barring the case of seasonal variation which is sometimes for short intervals very closely periodic, true periodicity is unknown."* It is also probably quite questionable as to whether an exception need be made of seasonal variation, for it is very doubtful whether economic values repeat themselves in exactly the same months, or weeks, and certainly very seldom on the same days. The price of eggs is usually higher in December than in January, but this is not true invariably. It certainly would be futile to adhere rigidly to a definition of periodicity in the treatment of economic data.

Though absolute periodicity is not to be found in economic data with the possible exception of seasonal variation, it does not follow that it is not important to study more-or-less periodic movements. Much has been gained by the study of correlation between the series, yet very seldom, if ever, is perfect correlation found. It is important, for example, for the farmer to know in what month or months the price of wheat is usually highest and the price of commercial feeds lowest, even though this average does not represent the true situation for every year.

If it is assumed that the fluctuations are of a cyclical nature, it will probably be desired to find out as much as possible about the length of the cycles and their amplitude. It is very seldom that a series of economic cycles is encountered with anything like a uniform length. Regular periodicity is therefore hardly to be hoped for, and it will be necessary to ascertain the length of each cycle separately. There is no certain and fixed method for such measurement, largely because we are not sure as to just what constitutes a cycle and where cyclical variations leave off and others begin. The degree of regularity of time intervals in a given series of data may be determined by measuring the intervals between successive crests or between successive troughs of the plotted data and by tabulating the results in a frequency table.
Better than either of these, the length may be counted in months or years from the point at which the cyclical curve crosses the trend line until it crosses it again in the same direction. This method applied to the data of Table IV after they are smoothed by a 7-month moving average would give three periods of 57, 21 and 33 months respectively. Such variations in the length of cycles are not unusual. They illustrate very well the difficulty in the way of discovering any typical or normal length of cycle.

Attempts have been made to construct a typical cycle for various series, but in most series it has been without much success. In the case of the hog-price cycle, which is regarded as one of the most regular in agriculture, the average length of six cycles from 1880 to 1913 was about five and a quarter years, but the individual cycles varied from 113 to 37 months. No cycle in the whole period was within six months of the average length. The idea of regular periodicity is therefore quite untenable even for this series.

* Crum, W. L., "Cycles of Rates of Commercial Paper", Review of Economic Statistics, Preliminary Volume V, p. 17.

The study of the cyclical and episodic variations will therefore need to go deeper than to postulate a regular oscillation equal to the average cyclical duration. Any worth while analysis will involve a careful study of events which accompany or precede these variations in related series, or which, perhaps, are not expressed quantitatively in any series.

There will also be a wide difference in the amplitude of different cycles as well as in their length. There will be as great a difference in the amplitude of the variations in different series. For instance, as compared to the variation of 25 to 30 per cent above and below the line of trend in the corn price series from 1904 to 1914, the variation in the index of all commodity prices in the same period was only 8 or 9 per cent.
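The counting of cycle lengths from one upward crossing of the trend line to the next, as described above, can be sketched in a few lines of Python. The deviations are hypothetical monthly figures, assumed to have been smoothed already, and the function names are illustrative.

```python
def upward_crossings(deviations):
    """Positions at which the series crosses the trend line from below."""
    return [
        i for i in range(1, len(deviations))
        if deviations[i - 1] <= 0 < deviations[i]
    ]


def cycle_lengths(deviations):
    """Months from each upward crossing to the next crossing in the same direction."""
    crossings = upward_crossings(deviations)
    return [later - earlier for earlier, later in zip(crossings, crossings[1:])]


# Hypothetical smoothed deviations from trend, one value per month.
deviations = [-1, 1, 2, 1, -1, -2, 1, 1, -1, -1, -1, 2, 1, -1, 1]

lengths = cycle_lengths(deviations)
average_length = sum(lengths) / len(lengths)
```

Tabulating `lengths` alongside `average_length` shows directly how far the individual cycles depart from the average, which is the comparison made for the hog-price cycle above.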
The simplest method for comparing the cyclical fluctuations of different series of data is to plot them for visual comparison of regularity and of duration and amplitude of fluctuations, and for visual estimation of timing or lags. More precise and complicated methods are as follows:

1. For comparing amplitudes of cyclical variations in different series of data, standard deviations may be computed of the relative deviations of each series from lines of secular trend.*

2. Another method for comparing amplitudes of fluctuations of the several cycles of a series of data has been used by F. C. Mills.** Mills expresses the rise of each cycle as a percentage of the ensuing high, the fall as a percentage of the preceding high, and calls the average of the two results an index of cyclical variability. An advantage of this method is the fact that it obviates the necessity of eliminating secular trend.

3. Comparative timing or lags of different series may be expressed as deviations from selected reference dates either in the beginning of revival or in the beginning of recession.*** Another method sometimes employed to measure lag is to compute several coefficients of correlation between two series, pairing the items with different lags. The lag which gives the highest coefficient is taken as the most representative.****

4. The average conformity of variations in time series in direction, duration, amplitude and timing may also be measured by correlation.

* See Mitchell, W. C., "Business Cycles: The Problem and its Setting", 1927, pp. 278-278.
** Mills, F. C., "The Behavior of Prices", Publication No. 11 of the National Bureau of Economic Research, p. 84.
*** Mills, F. C., "The Behavior of Prices", Publication No. 11 of the National Bureau of Economic Research, p. 92.
**** Mitchell, W. C., "Business Cycles: The Problem and its Setting", p. [illegible].
The use of this statistical method by one who is not familiar with its limitations, especially as applied to time series, may lead to erroneous results. If used with judgment, however, correlation is a very effective method for comparing the average conformity between the deviations of different series of economic data. In the correlation of time series for the purpose of comparing cyclical variations, the data to be correlated are commonly reduced to the form of variations from secular trend or of first differences.

If two or more unlike series (crop yields, for example) are found to have similar cycles emanating from common causes it may be desirable to make a synthetic curve by combining the several series into one composite curve. For this purpose it may be necessary to eliminate secular trends of each series by expressing the residual cycles as deviations from trend. These deviations may then be expressed in terms of a common unit (standard deviation) before the series are added. This rather complicated process tends to equalize the weight given to each series of data entering into the synthetic curve.

Numerous studies of the business cycle have centered on the problem of demonstrating periodicity. Professor H. L. Moore, in a series of articles appearing in the Quarterly Journal of Economics,* attempts to establish an eight-year period for the business cycle based upon the theory of periodic generating economic cycles whose ultimate causes are to be traced to meteorological effects due to the periodic motion of the planet Venus. What first challenges the attention in the results obtained by Professor Moore and others is the wide variation in the conclusions reached respecting the typical duration of the cycle.** The conclusions reached by Professor Moore have been challenged*** on the score of the periodogram analysis which is employed.
It is characteristic of most economic series that they are relatively short and that their fluctuations vary in amplitude, timing, and shape. The periodogram analysis is quite unadapted to the study of such series.**** The fundamental limitation of the periodogram analysis appears to be that its use implies the assumption that periodicity exists.*****

* "Generating Cycles of Products and Prices", February, 1921, Vol. 35, pp. 215-239; "Generating Cycles Reflected in a Century of Prices", August, 1921, Vol. 35, pp. 503-528; "The Origin of the Eight-Year Generating Cycle", Nov., 1921, Vol. 36, pp. 1-29.
** Cf. Thorpe and Mitchell, Business Annals, pp. 38-45.
*** "All data could be explained as well by a theory of cycles which merely assumed that in general the length of cycles is close to that of their average length, showing no true periodicity". Ingraham, Mark H., "On Professor H. L. Moore's Mathematical Analysis of the Business Cycle", June 1923, Vol. XVIII, p. 765.
**** Cf. Crum and Patton, loc. cit., pp. 381-383. Also Professor Crum's article in the Handbook of Mathematical Statistics, especially pp. 174-176.
***** Cf. Professor Wesley C. Mitchell's discussion in Business Cycles, pp. 259-261, and the reference there cited.

A further limitation upon the use of periodogram analysis (which, as noted, is applicable only to long series) to establish periodicity of the business cycle is implied in the hypothesis advanced by Dr. Mills "that the duration of business cycles in a given country is a function of the stage of industrial development which that country has attained."* The obvious inference from this hypothesis is that periodicity is, at best, characteristic of the business cycle only for relatively short periods, that is, only for those years in a country's history which represent a relatively homogeneous phase of industrial development. The pertinence of this consideration when applied to the fluctuation of individual time series is obvious.
The character of the cyclical fluctuations in price series for such products as hogs, beef-cattle, and horses is obviously conditioned by changes of a structural sort in the economic order. The hog cycle, for example, was longer in 1870 than at present, and pronounced modification of the horse cycle is now apparently under way. Observed fluctuations in a series are to be regarded as the net result of numerous interacting and mutually conditioning forces. But the interaction of these forces, and hence their net effect, is conditioned by the structural characteristics of the economic order at a given time.

* Mills, Frederick C., "An Hypothesis Concerning the Duration of Business Cycles", Dec. 1926, Vol. XXI, p. 447.

Lag.** One of the pitfalls of research with time-series data is the matter of lag. First of all, its presence must be actually demonstrated. Second is the task of measuring the amount of it.

Lag is defined in the dictionary as the "falling behind or retardation of one phenomenon with respect to another to which it is closely related." While this definition comes from the use of the term in the physical sciences, it is equally applicable to economic terminology.

Lag as thus defined may exist between phenomena whose relationship is either qualitative or quantitative. The mere presence or absence of one phenomenon may be dependent upon the presence or absence of another at some preceding time, or the quantitative measurement of one may be dependent on the quantitative measurement of the other at some preceding time. The economist first of all is concerned with discovering whether the presence of one phenomenon is coincident with or lagging behind the presence or absence of another. In the more recent studies, however, emphasis is being given to the measuring of quantitative relationships, and the following discussion is made primarily with those studies in mind.

** Contributed mostly by Mr. Felt.
To say that one phenomenon lags behind another assumes a causal relationship or path of influence between the phenomena. These paths of influence vary from the simple direct cause-and-effect relationships to complexities beyond explanation. Thus we find that variations in the weather from year to year produce variations in the yield of potatoes and finally variations in potato prices. The path of influence existing between potato yields and price is direct and relatively simple. Furthermore, the relationship presents no difficult problem of lag determination. A fluctuation in potato prices, however, thru its effects on profitableness of potato production may affect subsequent potato production, maybe for only one year, maybe for several years, possibly in one direction the first year and in the opposite the next. The path of influence in this case is less simple and the problem of lag determination is very complex. Again, given fluctuations in the weather may produce changes in the prices of two different crops, on one of them more delayed and prolonged effects, amounting to a lag. The path of influence between the two related prices is obviously indirect in this case. In this case, the nature of the causal relationship is known; but often the common generating factor may be unknown, or else it may appear that the path of influence is direct.

It is not safe to conclude that the changes which appear in one phenomenon are direct effects of the changes in another, even though the former may be predicted with reasonable accuracy from preceding changes in the latter. This does not mean, however, that these relationships are not worthy of study. But while it is sometimes necessary to deal with lagged relationships between series for which the paths of influence are not known, these instances are likely to be dangerous for forecasting purposes.

Lag is inevitable between economic phenomena,
particularly in case the path of influence giving the relationship is one of direct cause and effect. Few if any economic adjustments can be made instantaneously, as time is required for the social organism to act. One line of business may become suddenly profitable; yet before men and capital are attracted to it, its profitableness must become known and decisions must be made. These human processes all require time. Likewise there are physical and biological factors limiting the speed of adjustments. Factories may need to be constructed or livestock raised before the final product will be forthcoming.

While some lag is inevitable between most economic phenomena, the time element is uncertain. Anything which will speed up adjustments will tend to shorten this lag period, and anything retarding such adjustments will lengthen the lag period. Thus, in attempts to measure lag, it is important to keep the causes of the lag in mind and their effect on the rate of readjustment.

In some cases the problem of lag involves determining the time of the response to a single event or a succession of discrete events. The effect of tariff changes on the prices in an importing country may serve as an illustration of discrete relationships. The crop and price relationship also illustrates it. An example of the continuous type of relationship is the dairy-product and price relationship. The rainfall-yield relationship is continuous during the growing season. Relationship during such a period may be in the same or opposite directions. Thus, it may well be that heavy rainfall is favorable to crop condition during part of the period and unfavorable at other times within the period.

It is necessary to recognize an important type of the continuous type of relationship which may be designated as cumulative. Continuous relationships frequently operate in the same direction
and in such a manner that the cumulative effect of one series foreshadows or causes the changes in another. To cite a simple illustration, the daily balance of one's bank account is simply the cumulative total of the net deposits during all preceding time. Other relationships, not so exact, tend to operate in like manner. For instance, Karsten finds that the price-level of securities shows the cumulative effects of the flow of money (into the market out of business, or out of the market into business).

Lag between two related series may exist for any one or more of the numerous types of variations found in the time series: the short-time erratic variations, the seasonal variations, the wave-like cyclical variations, the episodic variations due to wars, panics, etc., and the long-time variations or trends. Comparisons of daily changes in the price of a commodity in different markets or at different steps in the market channel very commonly show tendencies for lag. Thus the price of wheat at country elevators is likely to lag one day or more behind the terminal market price. Much of course depends on the method and frequency of obtaining market information and on the nature of local competition.

There appears to be some misunderstanding as to the use of the term lag where two related series are being compared by plotting or by correlating. This misunderstanding results from the fact that the term is used first to describe the sequence of events between related series and second to describe the process of adjusting the series in comparisons by plotting or correlating. To illustrate, let it be assumed that fluctuations in series A cause fluctuations in series B, the latter following the former by some definite period, say three months. In this instance probably all would agree that series B lagged behind series A.
But in case the two series are plotted or correlated so as to allow for this lag, which series is "lagged" in the plotting or correlating process? The most logical answer to this question, and the one which seems to agree with the terminology of most writers, is that the A series is lagged. It is this series which is moved to a later period of time so as to compare it with the B series. Since B, as to sequence of events, naturally lags behind A by three months, the latter is lagged an equal amount to bring the two series into juxtaposition. Since the B series is the result or the dependent variable, it is the factor which is to be forecasted and must retain its original position as related to time.

The importance of qualitative analysis in problems of lag cannot be overstated. Given a three-year cycle in which two series are opposite each other, so far as the mathematics of it is concerned it can be equally well said that A affects B with a lag of eighteen months, or that B affects A with the same lag. Another excellent example is Figure 4, given earlier. Here agricultural prosperity is made to appear to lag behind business prosperity by the simple procedure of assuming a lag of 3 to 4 years. Suppose we were to make a simpler and more rational assumption that for the most part agriculture and business ought to prosper together. That would mean drawing perpendicular rather than slanting lines. It would then appear that business and agriculture have been good and bad together almost two-thirds of the time. A careful study of the occasions when they have not been, as in 1905-8, 1914, and 1922-4, would then no doubt reveal some significant further interrelationships. In the same two series, the causal relationship might be made to run in the opposite direction by the simple device of slanting the lines the opposite direction. In fact, a conclusion closely related to this was the first one reached in this study. (See the J. F. E.,
July, 1925, p. 361-71, and p. 479-85 in the October number.) Coupled with the foregoing are the inaccuracies already pointed out growing out of the use of moving averages.

Following are some of the methods of measuring lag that are in use:

1. Graphic. See Day, p. 324-7.
2. Simple correlation of the two series to see which lag gives the highest coefficient. See Rietz, pp. 160-65; Crum and Patton, pp. 364-370; Mills, pp. 420-430.
3. Multiple correlation of two series and major other factors affecting the lagging series. See Engberg's "Industrial Prosperity and the Farmer" for an example of this; also review of same in June 1928 Journal Am. Stat. Assn.
4. More mathematical methods are the quadrature method used by Karsten, and the factor system used by Zinn.

It should not be forgotten at any stage of such study that the mathematical or mechanical processes which have been described do not in themselves explain any of the trends or lags which they bring to light. They are simply mechanical methods which facilitate analysis and may be applied in discovering or describing the behavior of the series. In studying each series, the apparent reasons for the lags in question must be kept in mind throughout the process. In each case, somewhat different interpretation will be applied to the lags discovered depending on the nature of the series and the relationships which seem to exist between them and related facts.


Research Method and Procedure in Agricultural Economics

A Publication of the Advisory Committee on Economics and Social Research in Agriculture of The Social Science Research Council

VOLUME TWO
AUGUST 1928

The material in this volume was prepared under the direction of the Advisory Committee on Economic and Social Research in Agriculture of the Social Science Research Council.
The constituent organizations of the Social Science Research Council are as follows:

American Economic Association
American Political Science Association
American Sociological Society
American Statistical Association
American Psychological Association
American Anthropological Association
American Historical Association

The members of the Advisory Committee on Social and Economic Research in Agriculture are:

H. C. Taylor, Northwestern University, Chairman
John D. Black, Harvard University
Joseph S. Davis, Food Research Institute
C. J. Galpin, Bureau of Agricultural Economics
L. C. Gray, Bureau of Agricultural Economics
E. G. Nourse, Institute of Economics
G. F. Warren, Cornell University

TABLE OF CONTENTS
Vol. Two

Part II. Methods (continued)
    g. Analysis of Data (continued)
        (3) Association analysis, p. 197
        (4) Correlation analysis, p. 219
            Elementary analysis - cross section, p. 219
            Elementary analysis - time series, p. 232
            Variables to be included, p. 232
            Statement of variables, p. 233
            Choice of method, p. 23[?]
            Significance of results as description of data, p. 241
            Effect of errors in data, p. 249
            Correlation as description of geographic data, p. 258
            Multiple correlation mechanics, p. 258
    h. Inference from results, p. 271
        (1) Statistical inference, p. 271
        (2) Inductive inference, p. 275
        (3) Inference from correlation in time series, p. 285
        (4) Predicting from correlation results, p. 288
    B. The Method of Analogy, p. 298
    C. The Case Method, p. 300
    D. The Informal Statistical Method, p. 309
    E. The Experimental Method, p. 314

Part III. The Research Approach, p. 323
    A. Cross-section - Non-geographical, p. 324
    B. Cross-section - Geographical, p. 329
    C. Time sequence - Historical, p. 341

Part IV. The Presentation and Utilization of Results, p. 352
    A. The use of tables, p. 352
    B. Graphic presentation, p. 360
    C. Preparing the written report, p. 383
    D. Utilizing the product of research, p. 387

Part V. Organization of Agricultural Economics Research, p. 403
    Viewpoints in economic research, p. 403
    The localization of research problems, p. 405
    The localization of research facilities, p. 406
    Research policy and program, p. 408
    The classification of research problems, p. 414
    Research outlines by fields, p. 415
    Relationships between research in economics and in other subjects, p. 417
    Relation of research to other functions, p. 423
    Administrative arrangements relative to other functions, p. 426
        (a) Research in related departments, p. 426
        (b) Research and related functions, p. 428
        (c) Research in one department, p. 434
    Accrediting results, p. 435
    Relationships to research in other institutions, p. 436
    Relations with non-coordinate agencies, p. 437
    Cooperative procedure, p. 438
    Overhead organization as affecting research in agricultural economics, p. 452
        (1) Scheme of organization, p. 452
        (2) Allocation of funds to economic research, p. 456
    Administration, p. 462
        (1) Selection of research staff, p. 462
        (2) Provision for training of graduate students, p. 465
        (3) Sabbatical leave, p. 465
        (4) Term of service, p. 466
        (5) Division of time between teaching and research, p. 468


(3) ASSOCIATION ANALYSIS*

The analysis of data by methods of association offers a logical approach to an explanation of correlation analysis. Most workers in the field of agricultural economic research are familiar with methods of correlation; but many are unaware that by means of the cross classification of data and an analysis of the percentage of cases which occur in various combinations, it is possible to discover and evaluate associations of either variables or attributes without resorting to correlation methods.
It will be the aim of this section to present the fundamental principles of association analysis, and to follow with illustrations which show the analogy between the results of association and those of correlation.

* Dorothea Kittredge, University of Minnesota.

The fundamental considerations of association analysis may be explained better from a simple illustration of hypothetical data than from actual data. For this purpose, let us assume that records are available for twenty farms in a given district which show whether each is an owner or tenant, and also whether each has or has not an automobile. If these data were tabulated according to both characteristics, a cross classification table such as the following may also be assumed:

                          Automobile Ownership
    Tenure of         Number who do       Number who
    farmers           not have autos      have autos       Total
    Owner             Ab    9             AB    6          A    15
    Tenant            ab    3             aB    2          a     5
    Total             b    12             B     8          N    20

In these illustrations, capital letters A, B, C will be used to denote positive characteristics. The opposite case, where the characteristic is negative or lacking, will be designated by the corresponding small letters a, b, c. In the hypothetical data cited, owners will be given the symbol A, tenants (non-owners) the symbol a. Those who have automobiles will be designated by B, those who do not by b. Two of these symbols used together will designate the combination of characteristics which is accomplished by cross-classification: AB, the number of farmers who are owners and have automobiles; Ab, the number of owners who do not have automobiles; aB, those who are tenants who have automobiles; and ab, those who are tenants who do not have automobiles. Since the total number of owners (A) would be made up of the owners who have automobiles plus those owners who do not have them, AB plus Ab would be equal to A.
Likewise, the numbers represented by ab plus aB would equal a; AB + aB = B; Ab + ab = b. These various symbols have been entered in the upper left-hand corner of each compartment of the cross-classification table, to assist in the analysis. Any number of cases which can be designated by a single letter is known as a first-order frequency, and those designated by a combination of two letters as a second-order frequency.

The test for association in any cross-classification table like the above is made by a comparison of percentages or proportions of the cases found in different combinations. Formulas are unnecessary if one thinks readily in terms of percentages; otherwise, simple formulas for various percentages, expressed in the foregoing symbols, assist materially in carrying through an association analysis.

Let us assume that our interest in the hypothetical case of the twenty farmers is to ascertain whether owners are more likely to have automobiles than are tenants. Of the total of fifteen owners, six or forty per cent had autos; of the total of five tenants, two or forty per cent had autos. This is equivalent to saying that the number of owners having autos was to the total number of owners as the number of tenants having autos was to the total number of tenants. The proportion in terms of symbols would be AB:A::aB:a. Of the total of fifteen owners, nine or sixty per cent did not have autos; of the total of five tenants, three or sixty per cent did not have autos. The number of owners not having cars was to the total number of owners as the number of tenants not having cars was to the total number of tenants (Ab:A::ab:a). In the two tenure groups, there is exact agreement in the percentages (forty per cent had autos, sixty per cent did not). These statements are an indication of total lack of association between the tenure of these farmers and automobile ownership, a condition which Yule has called "complete independence".
Inequality in the percentages according to auto ownership, between tenure groups, would be evidence of association.

Since the degree of association present in all cross-classifications of this general character must be judged by the manner in which they differ from the case of complete independence, the criteria of complete independence must be developed at the outset. The first of these criteria is that which has already been observed from the proportions, that the percentage of automobiles among owners of farms is equal to the percentage of automobiles among tenants, in which case

    AB/A = aB/a        (6/15 = 2/5;  40% = 40%)

In the same way, the percentage of farmers without autos among the group of owners is equal to the percentage without autos among the group of tenants:

    Ab/A = ab/a        (9/15 = 3/5;  60% = 60%)

This criterion of independence might have been illustrated equally well if the problem had been considered from another angle. If the interest had been in ascertaining, from the standpoint of automobile ownership, the proportions which were owners or tenants, the basic groups would have been B and b and the following percentages would have been examined:

    AB/B = Ab/b        (6/8 = 9/12;  75% = 75%)
    aB/B = ab/b        (2/8 = 3/12;  25% = 25%)

These equalities show, respectively, that the percentage of farm owners among those who have autos is equal to that of the owners among those who do not have autos, and that the percentage of tenants among those who have autos is equal to that of the tenants among those who do not have autos. A situation of complete independence is found in an analysis made from this viewpoint as well as from the one first proposed. Although the analysis for association may be made on either the basis of rows or columns, usually the statement of the problem indicates which is the logical form for a given case. Except for purpose of illustration, only the percentages on the A and a universes would have been computed in this case.
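
The first criterion can be checked mechanically. The following is a minimal sketch in Python, not part of the original text; the function and variable names are illustrative assumptions, and the only data used are the four second-order frequencies of the hypothetical table of twenty farmers.

```python
# Second-order frequencies of the hypothetical table in the text.
# Capital letter = attribute present (A owner, B has auto); small = absent.
table = {"AB": 6, "Ab": 9, "aB": 2, "ab": 3}


def first_order(t):
    """Derive the first-order frequencies A, a, B, b and the total N."""
    A = t["AB"] + t["Ab"]
    a = t["aB"] + t["ab"]
    B = t["AB"] + t["aB"]
    b = t["Ab"] + t["ab"]
    return {"A": A, "a": a, "B": B, "b": b, "N": A + a}


def criterion_percentages(t):
    """Percentages compared under the first criterion, on both bases."""
    m = first_order(t)
    return {
        "AB/A": 100 * t["AB"] / m["A"],  # autos among owners
        "aB/a": 100 * t["aB"] / m["a"],  # autos among tenants
        "AB/B": 100 * t["AB"] / m["B"],  # owners among those with autos
        "Ab/b": 100 * t["Ab"] / m["b"],  # owners among those without autos
    }


totals = first_order(table)
pct = criterion_percentages(table)
# Independence requires equality on the row basis and on the column basis.
independent = (
    abs(pct["AB/A"] - pct["aB/a"]) < 1e-9
    and abs(pct["AB/B"] - pct["Ab/b"]) < 1e-9
)
print(totals)
print(pct)
print("complete independence:", independent)
```

For this table the row-basis percentages are both 40 and the column-basis percentages are both 75, so the check reports complete independence, in agreement with the text.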

The first criterion of complete independence between the two main characteristics of a cross-classification is, therefore, established if the following percentages are equal:

    AB/A = aB/a        AB/B = Ab/b
        or
    Ab/A = ab/a        aB/B = ab/b

The second criterion of complete independence may take a variety of forms. Thus far the percentages have been computed on the basis of the A, a, B, or b universes, but the second criterion involves also the greater universe represented by the total of N, or 20 in the hypothetical illustration. If no association existed between ownership of farm and automobile ownership, then we should expect the number of owners with autos to bear the same relationship to the total number of owners as the total number having autos does to the total number of farmers, or expressed in symbols, AB:A::B:N. In the illustration this would be 6:15::8:20, which indicates again that no association is present. This proportion is equally true when expressed in any of the following forms:

    AB/A = B/N;    AB/B = A/N;    AB = A·B/N;    AB/N = (A/N)(B/N)

The above proportion, in whichever of the forms it is expressed, will form an equality only in the event of complete independence in the data. Inequalities exhibited in the proportions are indications of association. Conclusions from the first two of the four statements of the proportion (i.e., AB/A = B/N and AB/B = A/N) are not as reliable when applied to data as are percentages of the type AB/A = aB/a or AB/B = Ab/b. That the inequality between a statement of the form involving AB/B and Ab/b would furnish a better criterion of association than the more general form involving AB/B and A/N is true especially in a case where the majority of the objects or individuals in the universe are B's; i.e., if B/N approaches unity, AB/B will necessarily approach A/N even though the difference between AB/B and Ab/b is considerable.* The third statement of the proportion, AB = A·B/N,
is exceedingly useful, and is frequently used to determine the nature of the associations. AB equals A·B/N in the case of complete independence, as 6 = (15 × 8)/20 in the above illustration. If AB were greater than the fraction A·B/N there would be evidence of positive association between A and B, or if AB were less than the fraction, negative association. The last of these four statements, AB/N = (A/N)(B/N), gives the rule that if A and B are independent, the proportion of AB's in the universe is equal to the proportion of A's multiplied by the proportion of B's. The chief advantage of the statements given under the second criterion over those listed under the first criterion is that they give expressions for a frequency of second order in terms of the frequencies of the first order and the whole number of observations, while the statements of the first criterion do not.

* Yule, Introduction to the Theory of Statistics, p. 31.

The criterion of independence may assume a third form, which involves only second-order frequencies, and enables one to recognize almost at a glance whether or not the two attributes are independent, and if not, whether the association is positive or negative. We have developed the statement under the second criterion that AB = A·B/N in the event of complete independence. A corresponding statement for the ab's could be developed along exactly the same lines, namely, ab:a::b:N, or ab = a·b/N. If these two equations are multiplied together (AB = A·B/N by ab = a·b/N), the result will be

    (AB)(ab) = A·B·a·b/N²

In the same way that under complete independence it has been shown that AB = A·B/N and ab = a·b/N, it can be shown that Ab = A·b/N and aB = a·B/N. If these last two are multiplied together, the result will be

    (Ab)(aB) = A·B·a·b/N²

We have, therefore, in the case of complete independence, two statements: (AB)(ab) = A·B·a·b/N²
and (Ab)(aB) = A·B·a·b/N². Therefore AB·ab = Ab·aB, since each is equal to the same value. In the case of positive association AB·ab > Ab·aB, and in negative association AB·ab < Ab·aB. If one has all of the second-order frequencies, this third criterion is perhaps the simplest one to test the association, since the cross multiplication of diagonal values gives the result almost at a glance.

One needs to keep carefully in mind just what the association analysis by means of the percentages stated in these criteria actually means.* It is not concerned with the fact that some A's are B's, but with the fact that the number of A's which are also B's (AB's) exceeds the number which would have been expected if A and B had been independent. It is in reality a comparison between the number of, say, AB's actually found in a cross classification and the number theoretically expected to be found in that category if complete independence existed. If we assume, for a moment, a cross classification which does not show complete independence, it would be found that AB ≠ A·B/N. The number of AB's to be expected under complete independence, which is usually designated as (AB)o, would be equal to the fraction A·B/N (since (AB)o:A::B:N), and a comparison of the value (AB)o with the actual value of AB would indicate the nature of the association. This is what is really accomplished in the criterion which tests whether AB = A·B/N.

Likewise when A and B are said to be negatively associated or disassociated, it is not meant that no A's are B's but that the number of A's which are B's (AB's) is less than (AB)o, the number to be expected if A and B had been independent. Association cannot be inferred from the mere fact that some A's are B's, however great that proportion.

* Yule, Introduction to the Theory of Statistics, p. 29.
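
The comparison of AB with (AB)o and the cross-multiplication test can be written out in a few lines. The following Python sketch is an illustration only; the function names are assumptions, and the second table is invented (it merely keeps the marginal totals A = 15, B = 8, N = 20 of the hypothetical table) to show a case that departs from independence.

```python
def expected_AB(A, B, N):
    """Number of AB's expected under complete independence: (AB)o = A*B/N."""
    return A * B / N


def nature_of_association(AB, Ab, aB, ab):
    """Classify a 2x2 table by the cross-multiplication (third) criterion."""
    left, right = AB * ab, Ab * aB
    if left > right:
        return "positive"
    if left < right:
        return "negative"
    return "independent"


# Table from the text: owners/tenants by auto ownership.
text_expected = expected_AB(A=15, B=8, N=20)
text_nature = nature_of_association(AB=6, Ab=9, aB=2, ab=3)

# Invented variant with the same marginal totals (A=15, B=8, N=20),
# used only to show a departure from independence.
variant_nature = nature_of_association(AB=7, Ab=8, aB=1, ab=4)

print(text_expected, text_nature)  # (AB)o equals the observed AB of 6
print(variant_nature)              # an AB of 7 exceeds (AB)o of 6
```

In the invented variant the observed AB of 7 exceeds the expected 6, and the cross products (28 against 8) point the same way, so both forms of the criterion agree on positive association.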

These three criteria suggest the way in which association may be tested by comparisons of percentages or proportions, and also establish formulas which enable one to tell whether association is positive or negative. One becomes much better acquainted with the data by analyzing such percentages than by computing at once some apparently final measure, or coefficient, which too often is the only part of the analysis to receive any attention. However, a need has been felt for a summary measure of the association exhibited in cross-classification tables. Several have been proposed and have been the subject of disagreement and controversy. The coefficient proposed by Yule, which he states may not be the most advantageous in all cases, is based on the difference in the cross multiplications of the second-order frequencies of a table divided by their sum. It is as follows:

    Q = [(AB)(ab) - (Ab)(aB)] / [(AB)(ab) + (Ab)(aB)]

It has been shown above that if complete independence exists, (AB)(ab) = (Ab)(aB), so that the numerator of this fraction in this event is necessarily 0. Thus the coefficient of association in our illustration would be

    (6 × 3 - 9 × 2) / (6 × 3 + 9 × 2) = 0

In the case of perfect association, where all A's were B's and all a's b's, the second part of both numerator and denominator would disappear completely, leaving only (AB)(ab)/(AB)(ab), or 1, the coefficient of perfect positive association. In case all A's were b's and all a's B's, then the first part of both numerator and denominator would disappear and the result would be -1, perfect negative association. The values of 1 and -1 may, however, be secured when a relation of less than complete association occurs. One example will suffice to illustrate this. If all A's were B's, leaving Ab = 0, but some of the a's were B's and some b's, then the last half of the numerator and denominator would disappear and a measure of one will result.
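
Yule's coefficient as stated above can be sketched as a small Python function. This is an illustration only; the function name is an assumption, and the second table is invented to show the caveat just described, namely that a coefficient of 1 can arise when Ab = 0 even though the a's are divided between B and b.

```python
def yule_q(AB, Ab, aB, ab):
    """Yule's coefficient of association Q for a 2x2 table."""
    agree = AB * ab   # product of the AB and ab frequencies
    cross = Ab * aB   # product of the Ab and aB frequencies
    if agree + cross == 0:
        raise ValueError("Q is undefined when both cross products are zero")
    return (agree - cross) / (agree + cross)


# Hypothetical table of twenty farmers from the text: complete independence.
q_independent = yule_q(AB=6, Ab=9, aB=2, ab=3)

# Invented table: every A is a B (Ab = 0), but the a's are divided
# between B and b, so the association is less than complete.
q_incomplete = yule_q(AB=10, Ab=0, aB=4, ab=6)

print(q_independent)  # 0.0
print(q_incomplete)   # 1.0
```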

The foregoing analysis of a hypothetical case, where the classification is of attributes rather than variables, illustrates that association is a method of analysis available where quantitative measures are not possible. Ordinary correlation analysis does not lend itself to such data. For these situations, if no others, the research worker in agricultural economics should be familiar with association analysis. It is especially useful in studies of sociological data. Economic data, however, are so largely expressed in quantitative form that a method which is to be widely used must have an application to statistics of variables as well as of attributes. That this is true of the method of association will appear in actual illustrations.

The fundamental principles of complete independence having been established on a hypothetical case, it is now possible to take an actual set of data and analyze it by methods of association. After this has been done, the analysis will be extended into the field of partial association. In this presentation it will be the aim, wherever possible, to give the analogous results obtained when correlation analysis is applied to the same data. The data selected for this illustration are quantitative measurements which relate to eighty-two counties in the state of Minnesota. Three variables have been chosen: the average value of land and buildings in each county as reported by the 1925 agricultural census; an index of productivity which was constructed from census data on crop yields; and the percentage of land in woodland reported by the census. The value of land and buildings varied from $25.09 to $149.97, with an average of $78.61; the productivity index from 12.91 to 26.73, with an average of 19.50; and the percentage in woodland from 0.5 per cent to 58.0, with an average of 17.1 per cent.
If these data were to be correlated they might be treated as ungrouped series, or by the use of class intervals for each variable they might be reduced to a double frequency correlation table. As the size of class interval was increased the number of classes would decrease, and if the process were carried far enough it would be possible to condense each series into two class intervals and have a double cross classification table similar to the one used in illustrating the fundamental principles of association. Any value can be selected as the dividing point between the two classes of any series, but "average and above" and "below average" are frequently chosen, and will be used in the following illustrations. Let the classification average or above in value of land and buildings be designated by A, below average value by a; average or above in the productivity index, B, below average productivity, b; average or above in percentage of woodland, C, below average percentage, c. Following is the double cross classification of the value of land and buildings with the productivity indexes.

                              Productivity Index
    Value of land        Below            Average
    and buildings        average          and above        Total
    Average and above    Ab   12          AB   30          A    42
    Below average        ab   25          aB   15          a    40
    Total                b    37          B    45          N    82

Where all second-order frequencies are available, probably the easiest test for association would be to find out whether AB·ab is equal to, greater than, or less than Ab·aB. In this case we have 30 × 25 > 12 × 15, indicating positive association between value of land and buildings and productivity.
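
The reduction of two quantitative series to "average and above" and "below average" classes can be sketched as follows. This Python example is illustrative only: the county figures for the 82 Minnesota counties are not reproduced in the text, so the six value and productivity pairs below are invented, and the function name is an assumption.

```python
from statistics import mean


def cross_classify(x, y):
    """Dichotomize two paired series at their averages and tally the
    four second-order frequencies AB, Ab, aB, ab."""
    mean_x, mean_y = mean(x), mean(y)
    counts = {"AB": 0, "Ab": 0, "aB": 0, "ab": 0}
    for xi, yi in zip(x, y):
        first = "A" if xi >= mean_x else "a"    # average and above -> capital
        second = "B" if yi >= mean_y else "b"
        counts[first + second] += 1
    return counts


# Invented figures for six counties (NOT the Minnesota data).
land_value = [120.0, 95.0, 60.0, 40.0, 110.0, 55.0]   # average 80
productivity = [24.0, 21.0, 15.0, 17.0, 18.0, 25.0]   # average 20

counts = cross_classify(land_value, productivity)
print(counts)

# Cross-multiplication test on the table actually given in the text.
text_left, text_right = 30 * 25, 12 * 15
print(text_left, text_right, text_left > text_right)
```

Dividing at the average is only one possible choice of cutting point; the same function would serve for any other dividing value if the averages were replaced by it.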

Although this table could be analyzed either on the basis of rows or columns, the question to be answered is assumed to be the relationship which productivity bears to value of land and buildings, as in the following:

    Percentage of counties above average productivity among those above average value:   AB/A = 30/42 = 71.4
    Percentage of counties below average productivity among those below average value:   ab/a = 25/40 = 62.5
    Percentage of counties below average productivity among those above average value:   Ab/A = 12/42 = 28.6
    Percentage of counties above average productivity among those below average value:   aB/a = 15/40 = 37.5

These percentages seem to point to a considerable degree of positive association. If complete independence existed, AB/A = aB/a, but in this case the corresponding values are 71.4 > 37.5. Similar results are obtained if the other tests of complete independence are examined.*

The coefficient of association computed for these two variables by the formula (AB·ab - Ab·aB)/(AB·ab + Ab·aB) results in a value of .612. A coefficient of correlation computed for the same variables, treated as ungrouped series, resulted in r = +.457.

The following associates the percentage of woodland in each of these eighty-two counties with average value of land and buildings:

                            Percentage of Woodland
    Value of land        Below            Average
    and buildings        average          and above        Total
    Average and above    Ac   36          AC    6          A    42
    Below average        ac   12          aC   28          a    40
    Total                c    48          C    34          N    82

* When one is dealing with a sample in which the conditions of simple sampling are met, a question will arise at this point as to how great a divergence between two percentages, such as AB/A and aB/a, could be accounted for merely by fluctuations of sampling.
In this event, perhaps the best test would be to compare the observed difference between the two percentages with the standard error of the percentage difference resulting from a solution of the equation ε² = p1·q1/n1 + p2·q2/n2 (Yule, p. 269), where p and q are the proportions having and lacking the characteristic in each group and n is the number in the group. Unless the observed difference exceeds some three times the standard error, it is possible for the association to have arisen solely by the chances of simple sampling. Since each observation of the data of the illustration may have resulted from a different set of forces, the theory of sampling is not applicable; but if it were, the standard error of sampling would be 10.4, to be compared with an observed difference of 33.9 per cent.

Analyzing this situation as with the previous variable, we can see at a glance that negative association is present: AC·ac < Ac·aC, or 6 × 12 < 36 × 28. The percentages which may be examined for association are:

    Percentage of counties below average in woodland among those above average value:   Ac/A = 36/42 = 85.7
    Percentage above average in woodland among those below average value:               aC/a = 28/40 = 70.0
    Percentage above average in woodland among those above average value:               AC/A = 6/42 = 14.3
    Percentage below average in woodland among those below average value:               ac/a = 12/40 = 30.0

These percentages indicate a considerable degree of negative association. The coefficient of association in this case, computed by the formula previously used, is -.867. The coefficient of correlation between these two variables computed from an ungrouped series of the eighty-two counties is -.638.

If the analysis had been carried to this point by correlation methods, the next step would doubtless be to ascertain the relationship between productivity and percentage of woodland. The same step may be taken by association analysis.
A cross-classification of these two variables results in the following table:

                      :      Percentage of Woodland       :
Productivity          :   Below     :  Average and  :          :
Index                 :   Average   :  Above        :  Total   :
Average and above     :  Bc    26   :  BC    19     :  B   45  :
Below Average         :  bc    22   :  bC    15     :  b   37  :
Total                 :  c     48   :  C     34     :  N   82  :

We see that (BC)(bc) is slightly larger than (Bc)(bC), indicating a positive sign for the coefficient of association. The percentages to be examined are:

Percentage of counties above average woodland among those above average productivity: BC/B = 19/45 = 42.2.
Percentage of counties below average woodland among those below average productivity: bc/b = 22/37 = 59.5.
Percentage of counties below average woodland among those above average productivity: Bc/B = 26/45 = 57.8.
Percentage of counties above average woodland among those below average productivity: bC/b = 15/37 = 40.5.

If the tests for complete independence, BC/B = bC/b and Bc/B = bc/b, are applied, it develops that while 42.2 is not equal to 40.5, nor 57.8 to 59.5, yet the difference is slight. The number of cases to be expected under absolute independence, (BC)0, (bC)0 and (bc)0, is in any case so nearly equal to the actual frequencies that the association must be negligible. For instance, (BC)0 = (B)(C)/N, which is 18.7, while the number of BC's actually found was 19. The coefficient of association is found to be +.034. The coefficient of correlation between these two variables in the ungrouped series for eighty-two counties is +.183.

Partial Association. Three simple associations, one exhibiting positive association, one negative, and one with practically no association, have been tested out by the methods of association analysis. Values for the coefficient of correlation in the same cases (computed for ungrouped series) have been cited for comparison.
To continue the comparison, methods of partial association must be taken up alongside partial correlation. A partial correlation of these variables gives r AB.C a value of +.758 and r AC.B a value of -.825, which shows that the net correlation between A and B is increased positively when the effect of C is eliminated, and the net correlation between A and C is increased negatively when the effect of B is eliminated. (A, B, and C are used here as subscripts of the r, without distinction between capital and small letters.) A partial association analysis can be carried out with these three factors which will result in substantially the same conclusions. Such an analysis necessitates a triple cross classification, as in the following table:

Value of land and buildings,     :          Productive Index           :
and percentage of woodland       :   Below     :  Average    :         :
                                 :   Average   :  and above  :  Total  :
Average value and above (A):
  Average woodland and above     :  AbC    1   :  ABC    5   :  AC   6 :
  Below average woodland         :  Abc   11   :  ABc   25   :  Ac  36 :
  Total                          :  Ab    12   :  AB    30   :  A   42 :
Below average value (a):
  Average woodland and above     :  abC   14   :  aBC   14   :  aC  28 :
  Below average woodland         :  abc   11   :  aBc    1   :  ac  12 :
  Total                          :  ab    25   :  aB    15   :  a   40 :
All values:
  Average woodland and above     :  bC    15   :  BC    19   :  C   34 :
  Below average woodland         :  bc    22   :  Bc    26   :  c   48 :
  Total                          :  b     37   :  B     45   :  N   82 :

The complete tally is made in the eight squares AbC, Abc, ABC, ABc, abC, abc, aBC, and aBc. The remaining compartments are merely totals derived from these combinations. For example, the total of AbC + ABC = AC, and of AbC + Abc = Ab.

Partial association will attempt an analysis of the association of the A and B factors within the universe of C's, the counties above average in the percentage of woodland.
In other words, the association of value of land and buildings with productivity will be examined when the effect of the percentage in woodland has been eliminated. From the classification we can observe the proportion of B's among the A's in those instances where, with respect to average percentage of woodland, no variation exists. The number of counties which are above average in all three of the factors under consideration, designated by ABC, is 5. If no association of A's and B's exists in the universe of C's, the following proportion would hold: ABC : AC :: BC : C. That is, within the universe of counties above average percentage of woodland, the proportion of counties above average productivity among those above average value will be equal to the proportion of counties above average productivity in the universe under consideration (that of the C's). This equality will hold in case there is absolute lack of association, so that a test of whether ABC is greater than, equal to, or less than (AC)(BC)/C will indicate whether the partial association is positive, absent, or negative. The value of the fraction (AC)(BC)/C is equal to the number of cases to be expected, (ABC)0, under conditions of absolute independence. In this case, the theoretical number is 3 and the actual 5, so that there is evidence of positive association of the A and B factors within the universe of C's.

The most convenient comparisons to be made from the proportion ABC : AC :: BC : C are as follows: ABC/AC with BC/C, and ABC/BC with AC/C.

It is also well, in addition to making comparisons with the whole universe of C's, to compare the proportion of the B's among the A's in the universe of C's with the proportion of B's among the a's in the same universe, to see if ABC/AC = aBC/aC, and similarly the proportion of A's among the B's in the universe of C's with the proportion of A's among the b's in the same universe, ABC/BC = AbC/bC.

The association could be tested by any of these formulas, but the two of the latter type, for the reasons given under simple association, are sometimes slightly more reliable. If we determine whether ABC/AC is greater or less than aBC/aC, the result is 5/6 > 14/28, again indicating positive relationship of the factors A and B under the analysis for partial association.

In order to test out the degree of the partial association of value of land and buildings with productivity when the effect of percentage of woodland is not operating, a series of percentages similar to the following may be examined:

For the entire material:
Percentage of counties above average value of land and buildings: A/N = 42/82 = 51.
Percentage of those above average value among those above average productivity: AB/B = 30/45 = 67.

For those above average woodland:
Percentage above average value: AC/C = 6/34 = 18.
Percentage of those above average value among those above average productivity: ABC/BC = 5/19 = 26.

For those below average woodland:
Percentage above average value: Ac/c = 36/48 = 75.
Percentage of those above average value among those above average productivity: ABc/Bc = 25/26 = 96.

When only those counties are considered which are above average percentage of woodland, the association of A and B is shown in less degree than for the material as a whole. The association of A and B, however, is considerably more pronounced in the counties where the percentage of woodland is below average than in the case of the material as a whole. This result would seem to be in accord with r AB = +.457 and r AB.C = +.758, which accomplishes elimination of the C factor from the series as a whole.

Similar analysis could be carried on to show the association of value of land and buildings with percentage of woodland when the effect of productivity was eliminated from the consideration.
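The partial association of A and B within each woodland universe can be tabulated directly from the eight ultimate class frequencies. The sketch below is illustrative only; the dictionary layout and the function name are assumptions, not part of the original procedure.

```python
# Eight ultimate class frequencies from the triple cross classification.
# Letters: value (A/a), productivity (B/b), woodland (C/c).
counts = {
    "ABC": 5, "AbC": 1, "aBC": 14, "abC": 14,
    "ABc": 25, "Abc": 11, "aBc": 1, "abc": 11,
}

def partial_ab(counts, universe):
    """Association of A and B inside one woodland universe ('C' or 'c')."""
    AB = counts["AB" + universe]
    Ab = counts["Ab" + universe]
    aB = counts["aB" + universe]
    ab = counts["ab" + universe]
    total = AB + Ab + aB + ab
    return {
        "observed_AB": AB,
        # Independence value, e.g. (ABC)0 = (AC)(BC)/C
        "expected_AB": (AB + Ab) * (AB + aB) / total,
        # Proportion of B's among the A's, e.g. ABC/AC
        "B_among_A": AB / (AB + Ab),
        # Proportion of B's among the a's, e.g. aBC/aC
        "B_among_a": aB / (aB + ab),
    }

high_woodland = partial_ab(counts, "C")
low_woodland = partial_ab(counts, "c")

for name, result in (("C", high_woodland), ("c", low_woodland)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

In both universes the observed frequency exceeds the independence value and the B's are relatively more frequent among the A's than among the a's, which is the positive partial association described in the text.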
The statement of the proportion for the association of A's and C's in the universe of B's which would be expected in the case of absolute independence is ABC : AB :: CB : B. One test of association is whether ABC is greater or less than (AB)(CB)/B. In this case we have 5 < (30 x 19)/45, or 5 < 13, indicating negative association in the partial analysis. Other forms of comparison which the above proportion suggests are ABC/AB* with CB/B, and ABC/CB with AB/B.

Here also, for reasons given under simple association, comparisons of the type ABC/AB with aBC/aB, showing a comparison of the proportion of C's among the A's in the universe of B's with the C's among the a's in the same universe, might be somewhat better. This comparison, 5/30 < 14/15, shows evidence of negative association in the partial analysis.

* That ABC/AB is the proportion of C's among the A's in the universe of B's may be seen to be analogous to the former illustration of partial association if it is written CAB/AB. No significance attaches to the order of the symbols in any class designated, but the proportions to be studied sometimes appear more clear if the order is altered. CAB/AB seems to give a clearer statement of the proportion of C's among the A's within the universe of B's than does ABC/AB.

In analyzing the degree of partial association between A and C, without variations in B, the following percentages should be examined:

For the entire material:
Percentage of counties above average value: A/N = 42/82 = 51.
Percentage above average value among those which were below average woodland: Ac/c = 36/48 = 75.

For those above average productivity:
Percentage above average value: AB/B = 30/45 = 67.
Percentage of those above average value among those which were below average woodland: ABc/Bc = 25/26 = 96.

For those below average productivity:
Percentage above average value: Ab/b = 12/37 = 32.
Percentage of those above average value among those which were below average percentage of woodland: Abc/bc = 11/22 = 50.

Here the negative association between value and percentage of woodland, that is, the tendency of counties above average in value to be below average in woodland, is tested out with the B factor, or productivity index, held constant. In the group where only those above average productivity are considered, there is evidence of greater association between counties with high values of land and buildings and low percentages of woodland than is evidenced in the material as a whole. In the separate group of those counties below average productivity there is no evidence of increased association. In fact it is scarcely as high as in the material as a whole. When the data are all handled together by the method of partial correlation applied to the ungrouped data, r AC.B is found to be equal to -.825; that is, the net relationship between value of land and buildings and percentage of woodland is somewhat increased by eliminating the effect of productivity, which is in agreement with the association analysis in the universe of B's. The r AC, it will be remembered, was -.638.

By the methods of simple association, it has therefore been possible to analyze the relationships between each two of three different variables with substantially the same conclusions as those which would have been found by the gross coefficients of correlation. By methods of partial association, something has been demonstrated of the nature and degree of the net relationship between each two of the variables when the other is eliminated. The elimination was accomplished by means of a cross classification which made it possible to study the relationships in two groups, in each of which the third variable was constant (i.e., either above or below average).
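For comparison with the association analysis, the partial correlation coefficients quoted in this discussion can be recovered from the three simple coefficients by the usual first-order formula. This is a minimal sketch; the function name is hypothetical, and r BC is taken as +.183, the sign that agrees with the positive coefficient of association between productivity and woodland.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simple coefficients for the eighty-two counties (ungrouped series).
r_AB = 0.457    # value and productivity
r_AC = -0.638   # value and woodland
r_BC = 0.183    # productivity and woodland (sign assumed positive)

r_AB_C = partial_r(r_AB, r_AC, r_BC)   # woodland held constant
r_AC_B = partial_r(r_AC, r_AB, r_BC)   # productivity held constant

print(round(r_AB_C, 3), round(r_AC_B, 3))
```

Both partial coefficients come out larger in absolute size than the corresponding simple coefficients, in line with the comparison of percentages made within the separate universes.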
Although the conclusions are necessarily less specific than those of partial correlation, there seems to be general agreement between the conclusions from the two methods. Association analysis, however, has nothing to offer in lieu of the regression equation or multiple R of the correlation analysis.

The methods of association provide a means of analyzing data by the simple and easily understood methods of cross classification, and a comparison of the percentages found in different combinations with those which would have been expected if no tendency toward association had been present. This type of association analysis, however, permits of only two group headings for each variable, and when this is done, a considerable amount of detail is lost which can be taken into account in correlation analysis. In correlation analysis, all observations are measured by the product of their deviations from the two means and related to the square of the deviation from their respective means through the two standard deviations, as

$$r = \frac{\sum xy}{N \sigma_x \sigma_y}$$

In association, all cases which fall in a given cross classification group have equal consideration regardless of whether they are extreme measures or barely of a size to come within the classification. In general, this is an argument against association analysis. However, it eliminates the difficulty sometimes experienced in correlation analysis where one or two extreme cases produce a coefficient of significant size when no general relationship exists at all.

Association analysis may be accomplished in much less time than correlation analysis and may often be employed as a preliminary step in considering the variables to be used in correlation analysis, even if it is not used as the final method. In fact, when scatter diagrams are made, association analysis may proceed at once from a count of the cases in each quadrant, or on any other basis of classification desired.
In the analytical study of data, such as is involved in association analysis, the research worker is likely to become more thoroughly familiar with the data than when the analysis is made along the more routine methods of correlation.

Finally, the results of association analysis, which are expressed chiefly in the form of percentages or proportions, may be presented successfully to those who are unfamiliar with the technical phases of correlation, and to whom the results of a correlation analysis would mean nothing.

Measures of Contingency. Simple measures of association apply only to data in a 2 x 2 fold arrangement, where the classification has been made according to two characteristics, each of which is divided into two headings. The coefficient of contingency has been devised for the analysis of data in manifold classifications which do not permit of numerical expression.

The method is applied in Table I to an analysis of the association which exists between geographic location and the rate of interest on farm mortgages. These data are reported in the 1920 census for nine geographic divisions and seventeen interest classes of varying sizes. In order to permit the table to be more easily handled and at the same time allow a considerable amount of detail, the class intervals of interest rates have been condensed into nine classes. The number of farms reported has been expressed in hundreds.

Table I. Number of farms in the United States on which rate of interest on mortgage debt was reported in the 1920 census, classified according to rate of interest and geographic divisions.* (In hundreds)

                     :                  Interest rates (per cent)                     :
                     : Less :  4.0 :  5.0 :  5.5 :  6.0 :  6.5 :  7.0 :  8.0 : 10.0 :
Geographic           : than :   to :   to :   to :   to :   to :   to :   to :  and : Total
Divisions            :  4.0 :  4.9 :  5.4 :  5.9 :  6.4 :  6.9 :  7.9 :  9.9 : over :
                     :  A1  :  A2  :  A3  :  A4  :  A5  :  A6  :  A7  :  A8  :  A9  :
New England       B1 :    1 :    8 :  150 :   35 :  274 :    3 :    7 :    9 :    1 :   488
Middle Atlantic   B2 :    8 :   64 :  440 :   74 :  518 :    1 :    1 :    1 :    1 :  1108
E. N. Central     B3 :   13 :  134 :  537 :  200 : 1422 :   42 :  328 :   30 :    3 :  2709
W. N. Central     B4 :    3 :   55 :  437 :  363 : 1077 :  113 :  280 :  273 :   27 :  2628
South Atlantic    B5 :    2 :    9 :   39 :   45 :  565 :    5 :   60 :  247 :   27 :   999
E. So. Central    B6 :    1 :    8 :   31 :   41 :  586 :    4 :   25 :  315 :   51 :  1062
W. So. Central    B7 :   15 :   10 :   70 :   63 :  238 :   20 :  114 :  477 :  273 :  1280
Mountain          B8 :    0 :    4 :   19 :   36 :  135 :   15 :  109 :  294 :  113 :   725
Pacific           B9 :    1 :    6 :   33 :   33 :  236 :   24 :  255 :  106 :    9 :   703
Total                :   44 :  298 : 1756 :  890 : 5051 :  227 : 1179 : 1752 :  505 : 11702

* Abridged table from 14th Census, Vol. V, p. 491.

Correlation analysis of these data would be impossible, since a quantitative statement of geographic location is not available. The quantitative measurements of interest rates as reported by the census also contain class intervals of varying sizes with unsatisfactory midpoints, which make this variable also poorly adapted to correlation methods. Contingency analysis does not assume equal class intervals or consider that all frequencies within a class fall at the mid-value.

The single measure of association, such as that which exists between geographic location and interest rates in this case, has its basis, as in simple association, in the difference between the number of cases found in each compartment of the classification and the number to be expected in that compartment if no association at all were present.

The system of symbols used in the table is self-explanatory. The number of farm mortgages which carried an interest rate of "4 and under 5 per cent" located in the East North Central states would be designated by a combination of symbols, A2B3. The frequency of A2B3 in the table is 134.

One column of the Table, A3, will be studied in detail in order to illustrate the principles involved.
A total of 1,756 farms (00 omitted) were reported at the interest rate of 5 and under 5 1/2 per cent. If these 1,756 cases had been distributed among the different geographical divisions in the proportion which the total number reported for each geographic division was of the total number for the United States, we should have the distribution expected under a condition of absolute independence. The expected frequency or independence value for any classification will be indicated by the usual designation with the addition of parentheses and a zero subscript. Under absolute independence the value for (A3B1)0 could be found on the basis of the proportion existing in the totals, (A3B1)0 : 1756 :: 488 : 11702, or, if one prefers, it may be thought of as (A3B1)0 : 488 :: 1756 : 11702. The latter statement is to the effect that if no relationship existed, the number of farms with interest rates of 5 and under 5 1/2 per cent in New England would bear the same relation to the total number of farms mortgaged in New England that the total number reporting a rate of 5 and under 5 1/2 per cent did to the total number in the whole country. The arithmetic is the same whichever statement is taken. The independence value for (A3B1)0 is (1756 x 488)/11702, and (A3B2)0 is (1756 x 1108)/11702.

Independence values corresponding to the frequencies of each compartment in the original table have been similarly computed, and are shown in Table II.

Table II. Independence values for the frequencies of the preceding table.

                     :                  Interest rates (per cent)                     :
                     : Less :  4.0 :  5.0 :  5.5 :  6.0 :  6.5 :  7.0 :  8.0 : 10.0 :
Geographic           : than :   to :   to :   to :   to :   to :   to :   to :  and : Total
Divisions            :  4.0 :  4.9 :  5.4 :  5.9 :  6.4 :  6.9 :  7.9 :  9.9 : over :
New England          :    2 :   12 :   73 :   37 :  211 :    9 :   49 :   73 :   21 :   487
Middle Atlantic      :    4 :   28 :  166 :   84 :  478 :   21 :  112 :  166 :   48 :  1107
E. N. Central        :   10 :   69 :  407 :  206 : 1169 :   53 :  273 :  406 :  117 :  2710
W. N. Central        :   10 :   67 :  394 :  200 : 1134 :   51 :  265 :  393 :  113 :  2627
South Atlantic       :    4 :   25 :  150 :   76 :  431 :   19 :  101 :  150 :   43 :   999
E. South Central     :    4 :   27 :  159 :   81 :  458 :   21 :  107 :  159 :   46 :  1062
W. South Central     :    5 :   33 :  192 :   97 :  552 :   25 :  129 :  192 :   55 :  1280
Mountain             :    3 :   18 :  109 :   55 :  313 :   14 :   73 :  109 :   31 :   725
Pacific              :    3 :   18 :  105 :   53 :  303 :   14 :   71 :  105 :   30 :   702
Totals*              :   45 :  297 : 1755 :  889 : 5049 :  227 : 1180 : 1753 :  504 : 11699

* Since this table is a re-distribution of the actual values of the preceding table, the totals should theoretically agree. The small differences which occur result from rounding off the independence values to the nearest whole number.

The differences between the actual frequencies and the independence values for column three are brought together in Table III in order that the differences may be more clearly seen. The difference column shows larger numbers than would be expected, under a condition of absolute independence, in the New England, Middle Atlantic, East North Central, and West North Central groups, and the opposite in the five southern and western groups.

Table III. Independence values and actual frequencies compared for Column Three. Farm mortgages reported at 5 and under 5 1/2 per cent.

                    :  Actual       :  Independence     :
                    :  No.          :  Value            :  Difference
New England         :  A3B1   150   :  (A3B1)0    73    :  d31   + 77
Middle Atlantic     :  A3B2   440   :  (A3B2)0   166    :  d32   +274
E. N. Central       :  A3B3   537   :  (A3B3)0   407    :  d33   +130
W. N. Central       :         437   :            394    :        + 43
South Atlantic      :          39   :            150    :        -111
E. So. Central      :          31   :            159    :        -128
W. So. Central      :          70   :            192    :        -122
Mountain            :          19   :            109    :        - 90
Pacific             :          33   :            105    :        - 72
                    :        1756   :           1755    :

The corresponding comparison for column 8, shown in Table IV, indicates a reversal of the foregoing situation for the higher interest rates.

Table IV. Corresponding analysis for Column Eight. Farm mortgages reported at 8 and under 10 per cent.

                    :  Actual       :  Independence     :
                    :  No.          :  Value            :  Difference
New England         :  A8B1     9   :  (A8B1)0    73    :  d81   - 64
Middle Atlantic     :  A8B2     1   :  (A8B2)0   166    :  d82   -165
E. N. Central       :  A8B3    30   :  (A8B3)0   406    :  d83   -376
W. N. Central       :         273   :            393    :        -120
South Atlantic      :         247   :            150    :        + 97
E. So. Central      :         315   :            159    :        +156
W. So. Central      :         477   :            192    :        +285
Mountain            :         294   :            109    :        +185
Pacific             :         106   :            105    :        +  1
                    :        1752   :           1753    :

The coefficient which measures the relationship between the two variables is based upon all such differences which exist between the actual number and the number expected. Since the differences (d's) are both positive and negative, these values are squared in order to eliminate algebraic signs. Each of these squared differences is then expressed as a ratio of the corresponding independence value. If we think of the differences

A1B1 - (A1B1)0 = d11
A1B2 - (A1B2)0 = d12
A2B3 - (A2B3)0 = d23

and of each one in a similar fashion, we may conceive of the series continuing indefinitely until we have AmBn - (AmBn)0 = dmn. In these general terms, the arithmetic steps described can be summarized in the following formula for what is known as the square contingency, designated by Yule as

$$\chi^2 = \sum \frac{d_{mn}^{2}}{(A_m B_n)_0}$$

[...] but the $\phi^2$ character is so often used that it seems advisable to connect it with the formulas and terminology of this article. It would thus be expressed,

$$\phi^2 = \frac{\chi^2}{N}$$

The coefficient of contingency ordinarily stated in connection with $\phi^2$ is

$$C = \sqrt{\frac{\phi^2}{1 + \phi^2}}$$

This can readily be derived from Yule's formula

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}}$$

by dividing both numerator and denominator by N, and substituting $\phi^2$ for $\chi^2 / N$.

Many objections have been raised to the coefficient of contingency, and much theoretical discussion has followed with little agreement. One objection, however, is of sufficient practical importance in the interpretation of coefficients of contingency that it seems advisable to mention it briefly.
This is the caution that two coefficients calculated from the same data but on different systems of classification are not comparable with each other. If a coefficient of contingency had been computed for the same data as the above but with different groupings, say six intervals of interest rates and six geographical groups, the coefficient for this 6 x 6 fold table would have tended to be less than the one computed for the 9 x 9 table.*

The coefficient of contingency of geographic location and interest rate on farm mortgages, which was found to be .568 on a 9 x 9 fold classification, could not have exceeded a value of .943 had the association been perfect. More groups would have raised this limit; fewer would have lowered it. The greatest possible values are so low for smaller tables that Yule considers it wise to restrict the use of the coefficient of contingency to tables 5 x 5 or larger. Yule gives a table of the limiting values of the coefficient of contingency with different numbers of groups.**

* See Yule for a demonstration of this.
** p. 55

The coefficient of contingency is an imperfect measure of association between two characteristics. Its use is not recommended for series where both characters permit of satisfactory quantitative measurement. In such cases, the coefficient of correlation or the correlation ratio is more satisfactory. But where one or both of the characters permit of only qualitative observations, or the quantitative measurements are unsatisfactory for correlation, the coefficient of contingency offers a means of measurement of association between the characters.

It will be interesting to measure the association present in the above contingency table if it is reduced in different ways to 2 x 2 fold form and the elementary analysis of simple association applied. Yule's statement in this regard is as follows: "It then becomes possible to trace the association between any one or more of the A's and any one or more of the B's, either in the universe at large or in universes limited by the omission of one or more of the A's, of the B's, or of both."*

If the division is made on the basis of east of the Mississippi and west of it, and under and over 6 per cent, the coefficient of simple association is only +.194. The following table shows a different geographical division:

                          :            Interest rate             :
Location                  :  Under 6%    :  6% and above  :  Total      :
"West" (Mountain and      :  Ab          :  AB            :  A          :
Pacific divisions)        :       132    :       1296     :      1428   :
"East" (all other)        :  ab          :  aB            :  a          :
                          :      2856    :       7418     :     10274   :
Total                     :  b   2988    :  B    8714     :  N  11702   :

The coefficient of association is +.582, and the significant percentages are as follows:

Percentage of mortgages 6 per cent and above among those reported for the entire U. S.: B/N = 8714/11702 = 74.5.
Percentage of mortgages 6 per cent and above among those in the "East": aB/a = 7418/10274 = 72.2.
Percentage of mortgages 6 per cent and above among those in the "West": AB/A = 1296/1428 = 90.8.

The division of the interest rate at six per cent for these cross-classifications was entirely arbitrary. Much higher degrees of association would doubtless be found if other interest classifications were analyzed. A large number of other associations between geographic location and farm mortgage interest rates could be ascertained by various re-groupings of the detailed contingency table into 2 x 2 fold classifications. The particular ones to be made would depend entirely on the nature of the questions to be answered in a specific research problem.
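The independence values, the coefficient of contingency, its limiting value, and the 2 x 2 re-groupings discussed in this section can all be checked by direct computation from Table I. What follows is a sketch under stated assumptions: the function names are hypothetical; the cell frequencies are as read from Table I, with a few poorly legible cells filled so that every row and column agrees with its printed total; and "west of the Mississippi" is assumed to mean the West North Central, West South Central, Mountain, and Pacific divisions.

```python
from math import sqrt

# Table I in hundreds of farms. Rows: divisions B1..B9.
# Columns: interest classes A1..A9 (under 4.0 ... 10.0 and over).
TABLE_I = [
    [1, 8, 150, 35, 274, 3, 7, 9, 1],            # New England
    [8, 64, 440, 74, 518, 1, 1, 1, 1],           # Middle Atlantic
    [13, 134, 537, 200, 1422, 42, 328, 30, 3],   # E. N. Central
    [3, 55, 437, 363, 1077, 113, 280, 273, 27],  # W. N. Central
    [2, 9, 39, 45, 565, 5, 60, 247, 27],         # South Atlantic
    [1, 8, 31, 41, 586, 4, 25, 315, 51],         # E. So. Central
    [15, 10, 70, 63, 238, 20, 114, 477, 273],    # W. So. Central
    [0, 4, 19, 36, 135, 15, 109, 294, 113],      # Mountain
    [1, 6, 33, 33, 236, 24, 255, 106, 9],        # Pacific
]

def independence_values(table):
    """Expected frequency for each cell: row total x column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def contingency(table):
    """Return square contingency, mean square contingency, and C."""
    expected = independence_values(table)
    n = sum(sum(row) for row in table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    phi2 = chi2 / n
    return chi2, phi2, sqrt(phi2 / (1 + phi2))

def max_contingency(k):
    """Largest C attainable in a k x k table."""
    return sqrt((k - 1) / k)

def regroup_q(table, group_rows, first_high_col):
    """Collapse to 2 x 2 (rows in/out of group, columns below/at-or-above
    the cut) and return the coefficient of association."""
    AB = Ab = aB = ab = 0
    for i, row in enumerate(table):
        low, high = sum(row[:first_high_col]), sum(row[first_high_col:])
        if i in group_rows:
            AB, Ab = AB + high, Ab + low
        else:
            aB, ab = aB + high, ab + low
    return (AB * ab - Ab * aB) / (AB * ab + Ab * aB)

expected = independence_values(TABLE_I)
chi2, phi2, C = contingency(TABLE_I)
q_west = regroup_q(TABLE_I, group_rows={7, 8}, first_high_col=4)
q_mississippi = regroup_q(TABLE_I, group_rows={3, 6, 7, 8}, first_high_col=4)

print(round(expected[0][2]), round(C, 3), round(max_contingency(9), 3))
print(round(q_west, 3), round(q_mississippi, 3))
```

Because the independence values are carried here without rounding to whole numbers, small differences in the last decimal from hand computation are to be expected.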
In the case of data qualitatively expressed, the coefficient of contingency is oftentimes a convenient measure for a situation taken as a whole, but it should ordinarily be used as a supplement to, rather than as a substitute for, an analysis of association based on fundamental principles.

* Yule, Introduction to the Theory of Statistics, p. 62.

In this section the objective will be to outline and explain the method of correlation analysis in terms of problems in agricultural economics, and indicate the significance of the results so far as the data analyzed are concerned. Correlation and regression coefficients are to be understood in this section as descriptions of the sets of data analyzed. A following section will take up the subjects of inference and prediction.

Elementary Methods of Analysis.* It was thought best to introduce the subject of correlation analysis by discussing first the more elementary methods and then showing what can be accomplished by more refined methods. The data used to illustrate this are largely hypothetical and relate to the value of the land of 80 farms located in one quadrant of a circle enclosing a city of five thousand. The analysis is therefore of the cross-section type, local geographic factors being the variables. The objective is to account for the variations in the value of the land in these farms. The farms range in distance from town from a quarter mile to twenty miles, the modal distance being 17 miles. The distribution is thus strongly skewed so far as this variable is concerned. The data taken for these farms include also the percentage of improved land and the type of soil, roughly classified as "loam", "clay", and "sandy", these being the names commonly applied by farmers in the area. Other information was obtained, but is included only to a limited extent in the present analysis. For example, six of the farms were on the shore of a small summer-resort lake.
Some of the farms are upon main-traveled improved roads and others on crossroads.

A method of analysis commonly followed until recently is illustrated by Tables A, B and C following, namely, grouping the farms according to one independent variable at a time and computing the average values of the dependent variable for these groups. It would appear from this that all three of these factors are definitely correlated with land values. But the results are not dependable unless it can be demonstrated, first, that the farms are equally improved and have the same distribution of soil types, as well as road frontage and all other possible factors, at all the different distances from town; and second, that enough farms are included in the sample to give all these different factors a chance to average out at all distances from town. Furthermore, such tables provide no measures of the amount of correlation. Plotting the group averages as in the solid line in Figure II suggests something as to the regularity of the covariations and their linearity. More farms would make possible more groups and give more detailed suggestion as to this regularity and linearity. But there would be no definite proof or measure of either of them. Comparing the ratios in the last columns in the three tables would seem to give a measure of relative rates of regression. For example, one would say that apparently percentage of land improved had the greatest effect on land values and soil type least. One could even fit a line or curve mathematically to these group ratios and reduce the comparison to definite terms. But the conclusions would be valid only if the conditions of non-correlation and sufficient number of farms above described are fulfilled.

* Mr. G. C. Haas of the Bureau of Agricultural Economics assisted somewhat with this section.
The regressions for improved land may include some of the effect of distance from town if the two are correlated, and vice versa. The early reports of farm business surveys are replete with examples of this form of analysis.

Table A

Groups:          :  Average    :  Average   :  Ratios  :
Percentage of    :  Per Cent   :  Value     :  of      :
Improved Land    :  Improved   :  of Land   :  Values  :
20 - 39.9        :     29      :     44     :   100    :
40 - 59.9        :     52      :     98     :   222    :
60 - 79.9        :     66      :    148     :   336    :
80 - 100.0       :     88      :    185     :   420    :

Table B

Groups:          :  Average    :  Average   :  Ratios  :
Distance to      :  Distance   :  Value     :  of      :
Market           :  to Town    :  of Land   :  Values  :
0 to 4.9         :     2.2     :    204     :   100    :
5 to 9.9         :     7.3     :    157     :    76    :
10 to 14.9       :    12.5     :    133     :    65    :
15 to 19.9       :    17.0     :     91     :    44    :

Table C

                 :  Number     :  Average   :  Ratios  :
Soil Types       :  of         :  Value     :  of      :
                 :  Farms      :  per Acre  :  Values  :
Loam             :     19      :    175     :   100    :
Clay             :     25      :    126     :    72    :
Sandy            :     36      :    112     :    64    :

The next step taken in the progress of correlation analysis was to compute Pearsonian coefficients of correlation and regression. The results of such an analysis are given in A and B in Figure I. Percentage of improved land shows a coefficient of correlation of .82, and distance from town of -.77. The regression equation for percentage of improved land is y = 2.46x; that is, each one per cent increase of improved land is accompanied by an increase of $2.46 in the value per acre. Comparing the lowest and highest groups in Table A, and the same as represented in terms of ratios in the solid line in Figure II, would give y = 2.39x as the equation; comparing the lowest and the next to the highest would give y = 2.81x as the equation. The true regression lies between these, but is nearer the first. Our formal correlation analysis has given us two definite mathematical expressions of what we saw in Table A and Figure II. One of these expressions, the regression equation, is in terms of a straight line fitted to the data.
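The two group-comparison slopes set against the fitted regression coefficient follow directly from the averages in Table A. A minimal sketch, with hypothetical names:

```python
# Table A group averages: (per cent improved, value of land per acre)
TABLE_A = [(29, 44), (52, 98), (66, 148), (88, 185)]

def slope(first, second):
    """Change in average value per one per cent change in improved land."""
    (x1, y1), (x2, y2) = first, second
    return (y2 - y1) / (x2 - x1)

lowest_to_highest = slope(TABLE_A[0], TABLE_A[3])        # 141 / 59
lowest_to_next_highest = slope(TABLE_A[0], TABLE_A[2])   # 104 / 37
fitted = 2.46   # regression coefficient quoted in the text

print(round(lowest_to_highest, 2), round(lowest_to_next_highest, 2))
```

The fitted coefficient lies between the two group slopes and closer to the lowest-to-highest comparison, since the fitted line uses all eighty farms rather than two group averages.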
If curvilinearity is present, if one per cent in improved land adds less at the top than earlier, as is somewhat suggested by Figure II, then neither the correlation coefficient nor the regression equation is an accurate mathematical expression of the data. The other mathematical expression, the correlation coefficient, has taken account of the amount of scatter about the line of regression. Figure I shows considerable of such scatter. It frequently happens that tables such as A, B and C show high regularity even though the observations are very widely scattered because of other factors or errors in the data. Computing a coefficient of correlation is a much-needed corrective for this.

[Figure I. Scatter diagrams showing relation of land improvement and location to land values. A: value per acre against percentage of improved land (r = 0.82; y = 2.46x; mean 62.5 per cent). B: value per acre against miles from town (r = -0.77; y = -7.46x; mean 11.6 miles). Soil types are shown by symbols: sandy, 36 farms; clay, 25; loam, 19.]

[Figure II. Results of correlation by cross-tabulation: ratios of land values against percentage of improved land for simple correlation, with distance held constant, and with soil type held constant.]

The equation for distance from market is y = -7.46x. Thus with each mile away from town, land values decrease $7.46 per acre.
The scatter is a little wider in this case. The effect of soil type cannot be measured like that of the other two variables because soil types cannot be quantitatively expressed. Association analysis would have to be used for this.

A common and commendable practice in such analysis nowadays is the construction of double-frequency diagrams, as illustrated in A and B of Figure I. It is best to make these at the start before any coefficients are computed. The coefficients can be computed later if they seem to be needed, and the results added to the diagram as in Figure I. These diagrams reveal the amount of scatter, and strongly suggest, although they do not demonstrate, the existence of linearity in many cases. Soil type could have been given a separate scatter diagram; but the scatter would have been significant in one dimension only. Combining soil type with the other variables by means of symbols reveals this scatter and some important relationships besides, as will presently be explained.

Testing out possible inter-correlation between the three independent variables is the next step. This can be done directly for improved land and location by calculating the correlation coefficient and regression equations for them; but a better procedure is to construct a double-frequency scatter diagram, or perhaps a double-frequency correlation table such as Table D following:

[Table D. Double-frequency correlation table of percentage of land improved (classes 80-100.0, 60-79.9, 40-59.9 and 20-39.9) against distance from town (classes 0-4.9, 5-9.9, 10-14.9 and 15-19.9 miles), each farm entered as one tally mark in its square.]

This reveals clearly the fact that the percentage of land improved falls off rapidly away from the city.
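A double-frequency table of this kind is simple to tally. The sketch below uses the class limits named in the text, but the twelve farms are invented for illustration, and the function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

PCT_LOWER_LIMITS = [20, 40, 60, 80]       # percentage of land improved
MILES_LOWER_LIMITS = [0, 5, 10, 15]       # distance from town
PCT_LABELS = ["20-39.9", "40-59.9", "60-79.9", "80-100.0"]
MILES_LABELS = ["0-4.9", "5-9.9", "10-14.9", "15-19.9"]


def class_index(value, lower_limits):
    """Index of the class with the largest lower limit not above value."""
    index = bisect_right(lower_limits, value) - 1
    if index < 0:
        raise ValueError(f"{value} lies below the lowest class limit")
    return index  # values beyond the top class limit fall in the last class


def double_frequency(farms):
    """Count the farms in each (percentage class, distance class) square."""
    counts = Counter()
    for pct, miles in farms:
        square = (class_index(pct, PCT_LOWER_LIMITS),
                  class_index(miles, MILES_LOWER_LIMITS))
        counts[square] += 1
    return counts


# Invented farms: (percentage of land improved, miles from town).
farms = [(92, 2.0), (85, 3.5), (88, 6.0), (75, 4.0), (70, 8.0), (64, 7.5),
         (66, 12.0), (55, 11.0), (48, 16.0), (45, 18.5), (30, 17.0), (25, 19.0)]

table = double_frequency(farms)
print(" " * 10 + "".join(f"{label:>9}" for label in MILES_LABELS))
for row in reversed(range(len(PCT_LABELS))):   # highest class at the top
    tallies = "".join(f"{'+' * table[(row, col)]:>9}"
                      for col in range(len(MILES_LABELS)))
    print(f"{PCT_LABELS[row]:>10}{tallies}")
```

If the tallies run from the upper left to the lower right, as they do in this invented sample, the two independent variables are inversely related and their separate effects on land value cannot be read from one-variable groupings.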
Figure I B shows that the soil is predominantly sandy away from town and that most of the loam soil is nearer town. Figure I A shows that the loam soil is more largely improved than the sandy and clay soils. One must conclude from this that a considerable part of the apparent correlation between land improvement and land value shown in the coefficient is due to the fact that the more improved land is nearer town and has better soil. The rate of regression is also increased for the same reason. These same interrelationships likewise raise the coefficient of correlation and regression rate for distance from town and land values. A coefficient of association for soil type and land values would be raised for the same reason, the good soils being both nearer town and more improved. Double-frequency correlation tables like Table D would show this in more definite form, and also demonstrate how such tables can be used with qualitative classifications when scatter diagrams cannot.

Thus far we have only proved the existence of interrelations that destroy the significance of our simple measures of correlation. We now have the task of eliminating the effects of these interrelations. The simplest, but in many cases inadequate, method for doing this is that of sub-sorting and cross-tabulation, illustrated by Table E, which is like Table D except that the land values for each square are averaged and reduced to ratios of values according to both of the variables. The data in square "5.0 to 9.9 miles" and "60.0 to 79.9 per cent" indicate that the 10 farms that belong in this square average $165 per acre, which is 87 per cent of the values for the first set of squares to the left, which are taken as the base for distance from town, and 286 per cent of the values in the row at the bottom, which is taken as the base for percentage of land improved.

The feature of this table by virtue of which it accomplishes its purpose is that the ratios in each row or column are figured on the basis of the values in the same row or column. Thus the $165 in the square mentioned is figured as a ratio of $189 in place of $217. In this way the decrease in value from $217 to $189 that accompanies the decline from 80.0-100.0 to 60.0-79.9 in percentage improved is eliminated from the comparison. The $145 is also compared with the $189. When there are no values next it in the same row or column, as for the square "10-14.9 miles and 40.0-59.9 per cent", no ratio can be figured. The next square to the right of this, however, can take this one as its base, calling its ratio 76 instead of 100, borrowing this ratio from the other squares in the same column. In the table, the ratios thus borrowed are enclosed in parentheses.

Table E*
Each square shows the ratio for distance from town, the average value per acre, and the number of farms. The ratios for percentage of land improved are those cited in the text.

Percentage of  :            Distance from Town (miles)
Land Improved  : 0.0-4.9        : 5.0-9.9         : 10.0-14.9           : 15.0-19.9
80-100         : 100; $217; 5   : 79; $173; 4     : 3 farms             : 2 farms
60-79.9        : 100; $189; 4   : 87; $165; 10    : (76); $145; 16      : 60; $114; 8
40-59.9        :                :                 : 76; $116; 4         : 52; $89; 13
20-39.9        :                :                 : 2 farms             : (56); $37; 6
Average ratio  : 100            : 83              : 76                  : 56

The ratio of 52 that goes with $89 per acre was calculated by dividing $89 by $116 and applying this ratio to the 76. In isolating the effect of land improvement, the 265 ratio was calculated by dividing $145 by $116 and applying this to the 240 taken from the other square in the same row. The 286 ratio for $165 is the average of the 265 and 308 ratios in the same row.
The more squares that can be averaged to get such a base ratio, the stabler the comparison.

The average ratios at the bottom and to the right are simple averages of the ratios in the squares. Including or omitting the ratios in the parentheses will have no effect on the results. It is evident that in a crude way the effects of the two variables have been separated from each other, that in the one case distance from town has been held constant, and in the other, percentage of land improved. The ratios of values with simple (two-variable) correlation analysis were 420 for percentage improved, and 44 for distance from town. The first has been reduced to 295, and the second to 56 (inverse correlation). The broken line in Figure II shows this graphically. Land values do not rise nearly so rapidly with percentage improved when the effect of the nearness to town of the more improved land is removed. Also clearer indications of non-linearity appear with the higher percentage of land improved.

* Averages are omitted in squares having less than 4 farms.

The crudity of this analysis largely grows out of the fewness of the observations. If enough farms were included to give stable averages for all the squares, the ratios would tend to be the same in all squares in the same rows or columns, and not as far apart as 265 and 308, or 79 and 87, or 52 and 60. All the ratios for percentage improved are high because the $37 average in the lowest group is accidentally low, partly as a result of being based on only 6 cases. Taking the next square above as a base would make the ratios 100, 127 and 143 for the effect of land improvement. Grouping by 3's in place of 4's, as in Table F, gives still different, and probably more dependable results, since the averages in the different rows and columns are more stable. Given cases enough, changing the number of groupings and varying the class limits would change the ratios very little.
Also the more cases, the more groups that can be made, and hence the more detail that can be shown. Shifting from 4 to 3 groups has covered up some of the detail of the linearity shown in Figure II.

Table F
Each square shows the ratio for distance from town, the average value per acre, and the number of farms. Ratios in parentheses are borrowed as in Table E.

Percentage of  :            Distance from Town (miles)
Land Improved  : 10 & less      : 11 to 15        : 16 & over
71 & over      : 100; $195; 15  : 85; $167; 4     : 60; $118; 4
51-70          : 100; $159; 15  : 79; $126; 18    : 62; $99; 7
50 & less      : 2 farms        : (82); $69; 4    : 66; $58; 11
Average ratio  : 100            : 82              : 62

As with the grouping on one independent variable, no definite expression of degree of correlation is available from such an analysis; and although a line or curve could be fitted mathematically to the ratios, the results would be valid only if other possible factors affecting land values were not correlated with the two variables taken, and if enough cases were taken so that even those factors not correlated will balance out on each side of the regression line or curve throughout its whole distance.

Soil type is correlated with both distance from town and land improvement. Table G and the dotted line in Figure II show how holding soil type constant will reduce the apparent effect of land improvement on land values when the method of cross-classification is applied to the problem. Soil type would appear to have more effect than distance from town in raising land values along with percentage of land improved; but the averages in the squares are so unstable that one would surely not wish to insist upon this.

[Table G. Average land values and ratios of values by percentage of land improved and soil type.]

Tables I, J and K put in parallel columns the results of the grouping on one variable of each of the variables with land values and of the two cross-tabulation correlations for each.
Table I
Percentage of : Average Per    :            Ratios of Values
Improved Land : Cent Improved  : Grouping on   : Distance  : Soil Type
              :                : One Variable  : Constant  : Constant
20-39.9       : 29             : 100           : 100       : 100
40-59.9       : 52             : 222           : 240       : 202
60-79.9       : 66             : 336           : 286       : 263
80-99.9       : 88             : 420           : 295       : 275

Table J
Distance   : Average Distance : Grouping on   : Percentage of Improved : Soil Type
from Town  : from Town        : One Variable  : Land Constant          : Constant
0-4.9      :  2.2             : 100           : 100                    : 100
5-9.9      :  7.3             :  76           :  85                    :  88
10-14.9    : 12.4             :  65           :  76                    :  69
15-19.9    : 16.9             :  44           :  55                    :  42

Table K
Soil Type : Grouping on   : Distance from Town : Percentage of Improved
          : One Variable  : Constant           : Land Constant
Loam      : 100           : 100                : 100
Clay      :  72           :  81                :  82
Sandy     :  64           :  78                :  68

The next step required is to hold two of the independent variables constant simultaneously and see what effect is left for the third. Since all three variables are correlated, the ratios holding only one constant all show too pronounced effects. Holding two constant could be done by splitting each of the squares in Table E into three parts for the three soil types and summarizing the results as before. But three times as many farms would be needed to give even as stable results as with two independent variables at a time. If a fourth variable were involved in the problem, the squares could be split again; but this would again multiply the number of cases needed. This points to a major limitation of the cross-tabulation method of correlation. Many observations are needed if more than two variables are to be isolated, especially if the results are to be stable.

Another possible method of attack on this problem is that of partial association outlined in the preceding section. It will give definite coefficients of partial association but no rates of regression as ordinarily presented.

A third method of attack is that of linear or curvilinear multiple and partial correlation.
The linear treatment provides no way of handling satisfactorily the qualitative factors such as soil types, unless enough cases are available so that a separate analysis can be made for each soil type. A procedure sometimes employed is to assign arbitrary measures to the qualitative classes, preferably in the order of their effects, approximating their true effects as nearly as can be surmised in advance. Let us suppose that the following measures were assigned: Sandy 1; Clay 2; Loam 3. The multiple correlation coefficient obtained would be raised if using these rankings did not introduce errors of measurement greater than the effect of soil type itself. These would be the further errors arising from putting individual farms in the wrong soil class. If the errors of measurement were large, the net correlation coefficient for soil type would also be negligible. So far as the regression equation is concerned, the results would reflect the usual effects of errors in an independent variable.*

When curvilinear multiple correlation is used, the effect of qualitative independent factors such as soil type can be measured fairly accurately if enough observations are available. The different soil types, for example, would be ranked according to their expected effect. Then, when curves were fitted by the approximation method, if the excess of loam over clay was greater than the excess of clay over sand, that would put a bend or hump in the curve at the point corresponding to clay. If there were several different categories, the curve would become only a broken line, showing the separate relation of each qualitative category to the dependent variable.

The data of the 80 farms were analyzed by multiple methods, soil type being included as one variable with the following scale: loam 100; clay 82; sandy 73. This was obtained by averaging the two series of ratios in Table K with location and land improvement respectively held constant.
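The arithmetic behind this scale is a plain average of the two Table K series. A short sketch, with the dictionary names being illustrative only:

```python
# Ratios of land value by soil type from Table K (loam is the base, 100).
distance_held_constant = {"loam": 100, "clay": 81, "sandy": 78}
improvement_held_constant = {"loam": 100, "clay": 82, "sandy": 68}

# Average the two series for each soil type.
soil_scale = {
    soil: (distance_held_constant[soil] + improvement_held_constant[soil]) / 2
    for soil in distance_held_constant
}

for soil, measure in soil_scale.items():
    print(f"{soil:<6} {measure:.1f}")
```

The clay average of 81.5 is carried in the text as 82, and the sandy average comes to 73.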
It was realized that these measures exaggerate the effect of soil type, but no better measures were available.

The results obtained were as follows:**

X0 = Land value per acre
X1 = Percentage of improved land
X2 = Distance from town
X3 = Soils: Clay 82; Sandy 73; Loam 100

Gross r:                 Net r:
X0X1 =  .82              X0X1.X2X3 =  .96
X0X2 = -.77              X0X2.X1X3 = -.79
X0X3 =  .46              X0X3.X1X2 =  .79

Net regression coefficients:
X0X1.X2X3 =  1.774
X0X2.X1X3 = -3.809
X0X3.X1X2 =  1.538

Multiple correlation coefficient: R = .9688

Regression equation:

\[ X_0 = 1.774X_1 - 3.809X_2 + 1.538X_3 - 59.09 \]

* The subject of the effect of errors of measurement on correlation results is described by Dr. Ezekiel in a later section.
** Multiple correlation method is discussed in later sections.

The net correlation coefficients are all higher than the corresponding gross coefficients. It is interesting to note how much higher the net coefficient is than the gross for soil type. Apparently in the simple correlation analysis the other two variables largely covered up the effect of soil type. The multiple coefficient is very high, higher than would probably result with actual data.

Table L attempts to bring together for comparison the rates of regression resulting from the analysis thus far. The range in percentages in Tables I and J has been reduced to averages, using the average of the last two groups as the upper limit, and then converted to dollars per acre. The regression rates have been greatly reduced from the simple correlation stage, and also from the cross-tabulation stage. Soil type has absorbed some of the difference, but more of it has disappeared because no longer assigned to more than one variable at a time.

Table L. Rates of regression.
                              : Simple      : Cross-Tabulation, Holding One : Multiple
                              : Correlation : Variable Constant at a Time   : Correlation
Percentage of land improved   : 2.46        : 2.0         2.2               : 1.77
Distance from town, miles     : 7.46        : 4.7         6.3               : 3.81
Soil type, scale of per cents : [entries illegible]

The next step is to apply the regression equation to each of the 80 farms and determine the residuals. Twenty-six residuals were less than $5, 25 between $5 and $10, 15 between $10 and $15, 7 between $15 and $20, and 7 were over $20. This was possible with a multiple correlation as high as .9688. Comparison of these residuals with the original data shows that the six farms in one group along a lake shore average plus residuals of $15 per acre. The four farms within two miles from town are perhaps forty dollars per acre higher than those nearest them. It is proper to ask the question whether these farms should not be dropped from the analysis since there are not enough of them in the sample to make a separate variable for them. The residuals for them are significant, but not accurate determinations since not derived simultaneously with the other three variables.

Averaging the residuals by soil types indicates that the scale used places the sandy and loam soils each a little high and the clay soils too low. The following scale would have fitted better: 100, 89, 69; that is, if a straight line is to be used to describe the regression. It must be remembered that these measures for soil types are merely relative to each other, and that the regression line is fundamentally only a function of the scale of values assigned to them, although partly substantiated by the cross-tabulation analysis. If distance from town and percentage of land improved could have been held constant simultaneously, the scale would have been, let us say, something like 100, 92, 80; and the regression line would have been steeper in proportion.

It will be remembered that the cross-tabulation analysis strongly suggested non-linearity in the effect of improved land.
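The residual step described above, in which the regression equation is applied to each farm and the estimate subtracted from the actual value, can be sketched as follows. The signs of the coefficients are an assumption here: the distance coefficient is taken as negative and the constant as -59.09, a reading under which the equation passes close to the means of the data. The soil scale of loam 100, clay 82, sandy 73 is likewise taken from the text, and the single farm shown is invented.

```python
# Net regression coefficients as read from the text; signs are assumed.
COEF_IMPROVED = 1.774      # dollars per acre per one per cent improved
COEF_DISTANCE = -3.809     # dollars per acre per mile from town
COEF_SOIL = 1.538          # dollars per acre per point of the soil scale
CONSTANT = -59.09

SOIL_SCALE = {"loam": 100, "clay": 82, "sandy": 73}


def estimated_value(pct_improved, miles, soil):
    """Land value per acre estimated from the regression equation."""
    return (COEF_IMPROVED * pct_improved
            + COEF_DISTANCE * miles
            + COEF_SOIL * SOIL_SCALE[soil]
            + CONSTANT)


def residual(actual_value, pct_improved, miles, soil):
    """Actual minus estimated value; a plus residual means the farm is
    worth more than the three variables account for."""
    return actual_value - estimated_value(pct_improved, miles, soil)


# Invented farm: 66 per cent improved, 7.3 miles out, clay soil, $150 per acre.
estimate = estimated_value(66, 7.3, "clay")
print(f"estimated ${estimate:.2f}, residual {residual(150, 66, 7.3, 'clay'):+.2f}")
```

Averaging such residuals by soil type, by sections of the improved-land scale, or by special groups such as the lake-shore farms is what brings out the defects discussed in the text.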
Plotting the residuals for this variable and averaging them by sections along the regression line reveals a small amount of this still. At from 50 to 70 per cent of land improved, the land values average about $5 per acre above the regression line, and above 80 per cent, about $8 under the regression line. Most of the apparent curvilinearity of the cross-tabulation analysis proved to be due to soil type and location that were correlated with percentage of land improved. One cannot be sure that the curvilinearity still remaining is not due to other variables not included in the analysis that may be correlated with land improvement, or that do not average out at all portions of the regression line because of the fewness of the cases. The residuals suggest a downward inclination of the regression for distance from town beyond 15 miles. One strongly suspects that this is due to the defect in the scale for soil types mentioned above.

It would seem advisable as a final step to repeat the multiple correlation analysis with the corrected scale for soil types, and omitting some of the abnormal cases. If the residuals still indicated curvilinearity it might be desirable to employ curvilinear methods.*

A method sometimes used for handling qualitative factors is to leave them out of the first analysis and correlate the residuals with them by grouping them by quality classes. This would not have given dependable results in this case because soil type is intercorrelated with the other variables. The multiple correlation coefficient with soil type omitted was .88, and the regression rates were all too high.

In many analyses, the independent variables, or part of them, are very little if any correlated with each other; and the fact of this can be demonstrated in advance or at various stages in the analysis. When such is the case, tabular methods may be adequate for all practical purposes.
In any case, most problems should be pushed as far as possible by tabular methods before a multiple correlation analysis is undertaken. This is especially true with inexperienced analysts.

* See later discussion by Dr. Ezekiel.

Elementary time-series analysis

It will not be necessary to discuss in much detail the use of elementary methods in time series analysis. The foregoing discussion is all pertinent. Given two independent variables entirely unrelated to each other, an analysis such as that of L. H. Bean's in Figure III will serve.* Of course one would want many more than eight observations before predicting with much confidence. Establishing that the two independent variables are unrelated will not ordinarily be easy, and will usually be impossible with only a few observations. If, for example, one wished to determine the effect of both supply and industrial activity on the price of cotton, one would probably have to reckon on the fact that the supply of cotton and industrial activity are in some measure mutually interrelated, and if so a good many years' observations would be needed to isolate the effects of the two. Such methods as Figure III represents would give different results even then according to which was taken as independent variable first and which was correlated with the residuals. Let one assume that the price of feeds is the major factor affecting supply of milk and correlate the two and assign the residuals to the price of dairy products and one will get one result; assume the opposite and one will get a different result. Even if there is no relation between them, one can easily bank too much on the exact validity of the curves. A differently placed supply price curve (1) in Figure III would have given a different trend curve (2). The two are obviously drawn in such a way that they give the smoothest fit for both. It is very unusual even in normal times to get so smooth a line of growth as curve (2) represents.
One would expect such things as the fluctuations of industrial activity to disturb it. One surely would not say from the evidences of these curves that it has not done so.

Nevertheless such methods should be the first to be used in any time-series analysis. Given observations enough, cross-tabulation can be used. Ordinarily, however, the cases are so few that one has to utilize all of them as fully as possible in order to get usable results, which leads to the use of multiple correlation methods, raising extremely difficult questions of interpretation of results, which will be discussed later.

Variables to be Included**

Correlation can measure only the extent and the kind of relations existing between the two or more series of variables which are subjected to the statistical study. The meaning of the results depends entirely upon the pertinence of the variables selected to some particular problem under study. Whether the variables selected are pertinent or not is a matter of logic, and not of statistics. Since, however, selection of the proper variables for study is the first critical point in the study, a case will be given to illustrate some of the logical processes involved in such selection.

Suppose the proprietor of a cafeteria wishes to determine why the number of patrons varies from day to day. Let us further suppose that the lunchroom is in a business district, and serves only mid-day meals. Then the dependent variable, the one the variations in which are to be explained, is the number of lunches sold each day. This is shown to the left of Figure 1, which figure may be called a "factorgraph".

* Chart III from a paper read by Mr. Bean before the New York Chapter of the American Statistical Association, April 20, 1928.
** By Dr. Mordecai Ezekiel, Bureau of Agricultural Economics.
The factors which might cause the number of patrons to vary may next be separated into two groups: those regularly recurring and those irregularly occurring. The first group will obviously include time factors, those which cause regular increases or decreases in number of patrons from day to day, week to week, or seasonally through the year. We may go back of these, however, and try to find what causes them. A half-holiday on Saturday, for example, may explain low sales that particular day of the week; decreased patronage in the summer may be due to hot weather or to absence of patrons on vacations; an upward trend in the number of patrons may reflect a steady growth in the number of persons within reach of the store, decreasing competition due to competitors going out of business, or a progressive business which is out-running its competitors. Holidays, such as Easter, July 4, etc., which occur at stated periods, may also affect the business for better or worse, and must be considered.

The irregular group of factors, however, offers the greatest variety. Factors that may affect trade irregularly may be grouped under several heads: (a) those affecting the number of potential customers within reach of the lunchroom; (b) those affecting their economic status; (c) those affecting their desire to eat lunch indoors on any particular day; and (d) those affecting their decision as to whether to eat in this particular store. Each one of these heads may be further subdivided into individual items, as illustrated in Figure 2.

After laying out all the related elements in the problem, and carrying them out as far as possible, the next step is to select variables which will serve as numerical representations of each item.
After these values are secured the relationships among the several groups can be studied, using the factorgraph as a controlling guide as to the particular influence each individual variable is selected to measure, and the way in which its influence is expected to be exerted. Then when the statistical analysis of the relations among the various factors is completed, the investigator will be able to interpret logically the significance of each of the variables whose relations he has measured.

[Figure 1. Factorgraph for factors affecting the business of a cafeteria.]

Statement of Variables*

Once the variables to be employed are selected, the next problem is in what units to state them. In studying land values, for example, the value of a given farm may be stated as total value, as value per acre of all land, or as value per acre of improved land. Which one to select depends on what other variables to include and how they are to be stated.

* By Mordecai Ezekiel.

On the per-farm basis, the total value of the farm might be correlated with the value of the dwelling, the value of other buildings, the acres in cultivated land, the acres in pasture, etc.
This would tend to show the contribution per acre of each of the acreage elements, and should tend to give a high correlation, since under normal conditions the value of the farm may be expected to approximate the value of the buildings plus that of the several tracts of land. But if it were desired to measure the influence of type of road, fertility of land and distance from town on land value, they could not be so readily included in the same additive equation. To do so would assume that a given increase in fertility or yield per acre would add the same amount to the value of the farm no matter how large or how small the farm was.

If the value were stated as value per acre, that would partly solve the difficulty, for a given change in fertility, distance from town, or type of road would then be assumed to have the same influence upon value per acre no matter how large or how small the farm was. But that would introduce difficulty with other variables. The dwelling, for example, would not become larger in direct proportion with the size of the farm. Very large farms with good dwellings would have a very low "value-of-dwellings-per-acre," and small farms with poor dwellings would also have a low "value-of-dwellings-per-acre." Only some method of determining the effect of value of dwellings on land values separately for farms of different sizes would take care of this difficulty.*

The case mentioned also illustrates the need of something other than an arithmetic regression equation to express certain cases. If it be assumed that the more fertile the farm, the greater the effect of nearness to town would be, and that the nearer to town, the greater the effect of an increase in fertility would be, that could not be adequately expressed by the regression equation:

\[ \text{Value per acre} = f(\text{distance}) + f(\text{fertility}) + \text{etc.} \]
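The shortcoming of the additive form can be seen in a small numerical sketch. Both functions below are hypothetical, with invented constants; they are not fitted to any data in this report. In the additive form, moving a farm from 10 miles to 5 miles from town adds the same number of dollars whatever its fertility. In a form where the effects multiply, the same move adds more dollars on the more fertile farm, and taking logarithms turns that product into a sum.

```python
from math import isclose, log


def additive_value(miles, fertility):
    """Hypothetical additive form: each factor adds a fixed dollar amount."""
    return 200.0 - 5.0 * miles + 2.0 * fertility


def multiplicative_value(miles, fertility):
    """Hypothetical form in which the two effects multiply each other."""
    return 60.0 * miles ** -0.3 * fertility ** 0.5


for fertility in (40, 80):   # e.g. bushels per acre, an invented unit
    gain_add = additive_value(5, fertility) - additive_value(10, fertility)
    gain_mul = (multiplicative_value(5, fertility)
                - multiplicative_value(10, fertility))
    print(f"fertility {fertility}: additive gain {gain_add:.2f}, "
          f"multiplicative gain {gain_mul:.2f}")

# In logarithms the multiplicative form becomes additive again.
lhs = log(multiplicative_value(5, 80))
rhs = log(60.0) - 0.3 * log(5) + 0.5 * log(80)
print(isclose(lhs, rhs))
```

The design point is only that a sum of separate functions cannot make the dollar effect of one factor depend on the level of another.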
Instead, the multiplying effects of the two variables upon the value could be allowed for by using the equation:

$$\text{Value per acre} = [f_1(\text{distance})]\,[f_2(\text{fertility})]\,[f_3(\text{etc.})]$$

which, for the actual process of computation, can be stated:

$$\log(\text{value per acre}) = \phi_1(\log \text{distance}) + \phi_2(\log \text{fertility}) + \phi_3(\log \text{etc.})$$

This logarithmic equation, which puts the relations on a relative or proportional rather than an absolute or arithmetic base, is a very flexible one and one that can be used in a great many types of problems.

* See the appendix of U. S. Dept. Agr. Bul. 1400, Factors Affecting Farmers' Earnings in Southwest Pennsylvania, for an example of statistical treatment of a problem of this type.

One further consideration with respect to the statement of variables which needs to be mentioned is the danger of false results or spurious correlation if the variables are improperly stated. Thus, for example, if an attempt were made to correlate the value of farms with three factors, namely, (a) the percentage of land in corn, (b) the percentage of land in wheat, and (c) the percentage of land in all other uses, it would be impossible to solve the problem, or else it would give a spurious result, because the three factors (a), (b), and (c) would add to exactly 100 per cent in each individual case, and after variations in (a) and (b) had been held constant by statistical means, there would not be any room left for variation in (c). Only by dropping out one of the factors, say (c), would significant results be secured. The regressions on (a) and (b) would then also show the effects of (c); for example, the increase in value for each unit increase in (a) would mean the increase due to substituting one unit of (a) for one unit of (c); changing the sign would give the effect of substituting one unit of (c) for one of (a). The same principle would then apply as between (b) and (c).

Choice of
Method*

* By Mordecai Ezekiel.

In using correlation analysis, there are two types of alternatives to be considered: first, as to the number of variables to be used simultaneously; and second, as to the type of relation between the variables to be assumed, i.e., whether linear (a constant unit change in the dependent variable for each given unit change in the independent) or non-linear.

Where the variations in the dependent variable are to be related to variations in two or more independent variables, the problem arises as to whether the simultaneous (or multiple correlation) method must be used to determine the effect of each one.

The simplest method of treatment is to ignore the fact that the result is due to two or more causes acting simultaneously, and determine by simple correlation the changes in the dependent variable for each change in one causal factor, then the change for each change of the next causal factor, and so on till each one has been taken into account. For this method to give an accurate measurement of the relations which really prevail, two conditions have to be fulfilled: first, no correlation between the several independent factors; and second, a large enough number of observations so that the effect of the random variations in the other independent factors can be averaged out.

This may be illustrated by a definite case. Suppose that the yield of corn is being related to rainfall in July and to temperature in July. Suppose further that there is no correlation between rainfall and temperature. Then when a scatter diagram between rainfall and yield
is constructed (Figure 2, a), the differences in the average yield of corn between the different groups according to rainfall are not influenced by temperature at all. This is because, with no correlation between temperature and rainfall, the average temperature is just about the same from group to group according to rainfall (Figure 2, b). Any effects of temperature differences on yield are averaged out when the yields are classified according to differences in rainfall. The regression line in (a) therefore shows the changes in yield with changes in rainfall, with temperature remaining at the average. Similarly, a regression line for these data showing the change in yield with change in temperature would have the effect of rainfall removed; it would show the change in yield accompanying change in temperature with rainfall remaining at the average.

[Fig. 2. Scatter diagrams showing (a) relation between rainfall in July and yield of corn, and (b) relation between rainfall in July and temperature in July.]

On the other hand, with data where there is significant correlation between independent variables, it is necessary to use some method which determines the change in the dependent variable for changes in each one of the independent variables while simultaneously eliminating the effects of the other independent variables. The method of multiple correlation comes the nearest to doing this satisfactorily.
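The contrast between the two situations can be sketched numerically. The figures below are invented for illustration and are not taken from the study: yield is built so that each inch of rain adds exactly 2 bushels and each degree adds exactly 3. The helper names `simple_slope` and `net_slopes` are hypothetical. With temperature uncorrelated with rainfall, the simple regression recovers the 2 bushels; with correlated temperature, only the net (multiple) regression does.

```python
def simple_slope(xs, ys):
    """Simple regression coefficient of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def net_slopes(y, x2, x3):
    """Net regression coefficients of y on x2 and x3 (two normal equations)."""
    n = len(y)
    my, m2, m3 = sum(y) / n, sum(x2) / n, sum(x3) / n
    s22 = sum((u - m2) ** 2 for u in x2)
    s33 = sum((v - m3) ** 2 for v in x3)
    s23 = sum((u - m2) * (v - m3) for u, v in zip(x2, x3))
    s2y = sum((u - m2) * (w - my) for u, w in zip(x2, y))
    s3y = sum((v - m3) * (w - my) for v, w in zip(x3, y))
    det = s22 * s33 - s23 ** 2
    return (s2y * s33 - s3y * s23) / det, (s3y * s22 - s2y * s23) / det


def make_yield(rain, temp):
    # Invented additive relation: 2 bu. per inch, 3 bu. per degree above 70.
    return [10 + 2 * r + 3 * (t - 70) for r, t in zip(rain, temp)]


rain = [1, 2, 3, 4, 5, 6]
temp_uncorrelated = [73, 71, 72, 72, 71, 73]  # zero correlation with rain
temp_correlated = [72, 71, 74, 73, 76, 75]    # rises with rain

simple_uncorr = simple_slope(rain, make_yield(rain, temp_uncorrelated))
simple_corr = simple_slope(rain, make_yield(rain, temp_correlated))
net_rain, net_temp = net_slopes(make_yield(rain, temp_correlated),
                                rain, temp_correlated)

print("simple slope, temperature uncorrelated:", round(simple_uncorr, 3))
print("simple slope, temperature correlated:  ", round(simple_corr, 3))
print("net slopes, temperature correlated:    ",
      round(net_rain, 3), round(net_temp, 3))
```

In the correlated case the simple coefficient credits rainfall with part of the temperature effect, which is the distortion the multiple method is meant to remove.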
Whether multiple correlation can do this completely satisfactorily or not (in addition to the question of linearity or curvilinearity, which will be taken up separately) depends upon how well the form of regression equation used really expresses the way in which the influences of the several independent factors combine to produce the variations in the dependent factor or result. If the effects of the several independent factors simply add together to produce the result, then the simple additive regression equation:

$$X_1 = a + f_2(X_2) + f_3(X_3) + \text{(etc.)}$$

can be used to determine the separate effect of each factor. If instead the effects of the several factors multiply together to produce the result, then the logarithmic equation:

$$\log X_1 = a + f_2(\log X_2) + f_3(\log X_3) + \text{(etc.)}$$

will most satisfactorily determine the relations. If, finally, the effects are such that it is the combination of the several independent factors which produces the result, so that there is a particular unique effect upon the dependent for each particular combination of the independent factors, then still more complex equations of the form:

$$X_1 = f(X_2, X_3, X_4, \text{etc.})$$

are needed to determine the way in which the dependent factor varies with the different values of the several independent factors.*

* A fuller discussion of this last case, and methods of measuring such relationships, are given in "The Determination of Curvilinear Regression 'Surfaces' in the Presence of Other Variables," Jour. Amer. Stat. Assoc., Sept., 1926, pp. 310-320; and a non-technical discussion of their characteristics, in "Factors Related to Lamb Prices," Jour. of Pol. Econ., April, 1927, pp. 236-239; both by Mordecai Ezekiel.

As regards the question of whether linear or curvilinear methods are to be employed: linear methods are applicable only when it is reasonable to assume that a given change in the independent variable will cause the same corresponding change in the dependent variable no matter how far it is carried.
While relatively infrequent in agricultural economic studies, there are a few cases which are truly of this sort. Thus the area of pasture needed is apt to continue to increase with the number of cows to be pastured (fertility being held constant); the price of hogs at country points is likely to continue to change in the same relation to the prices at central markets; and the price of hog products at wholesale is likely to continue to change with the price of hogs at wholesale; all no matter how far the changes are carried. But in most economic or agronomic relations the reverse is true. Production does not increase indefinitely with additional increments of fertilizer, at least certainly not in the same proportion; profits do not increase indefinitely with larger farms or greater capital; yields do not increase indefinitely with heavier rainfall or higher temperatures; nor do prices continue to fall to the same extent with successively larger and larger production. In such cases the assumption of linearity might completely distort the actual relations, or even result in a conclusion of "no relationship" in the case of such relations as that shown in Fig. 4.

When it is decided to assume curvilinear relations, the next question is whether to determine them by the free-hand graphic method or by the use of a definite equation of some sort. The result obtained by the free-hand curve is an empirical description of the relationship, but so is the result obtained by a definite equation, unless the equation used conveys some logical expression of the theoretical relation of the variables involved. The way in which an equation may express logically some characteristics of the relation has already been discussed under the head of "curve-fitting."
Unless the constants of the equation when determined do have some logical meaning, it is worse than useless to use a definite equation, for the function as determined by the equation may distort the true relation as badly as the straight line would in Figure 4.

One further caution as to choice of method needs to be stated: none of these methods provides reliable or stable results unless there are enough observations. When curvilinear methods are used, the significance of the relations varies at different portions of the curve. The results may be entirely dependable within the central portions where the bulk of the observations are concentrated, yet have but little reliability at the extremities where there are only a few cases to indicate the shape or position of the curve. Likewise with multiple correlation: it takes more observations to determine the relations accurately than when the result is due to a single cause. Further, the more closely the several independent variables are inter-related, the more observations it takes to measure the effects of each one separately. Particularly in multiple curvilinear correlation, if the several independent variables are highly inter-correlated it becomes very difficult to determine the exact influences to ascribe to each one unless a large number of observations are available. With insufficient observations in such cases, the major effect may be ascribed to whichever variable the investigator pleases, since if it is accounted for by one, it is then removed and will not be left to be credited to the other. If two variables are perfectly correlated it is impossible to determine what part of the result is due to one and what part to the other. If they are closely correlated, only a large number of observations will make separation possible. In time series problems several elements are often trending upwards or downwards simultaneously, and with the limited number of observations usually available,
there is no absolute method of determining what part of the total effect is due to one and what part is due to another.

No absolute rules can be laid down for the working limits of reliability as to numbers of cases. The usual measures of probable or standard error, supplemented by "Student's" and R. A. Fisher's tables of probabilities for small samples, Smith's formula for taking account of the influence of the number of variables on the apparent closeness of correlation, and Yule's formula for the standard error of net regression coefficients, should aid in particular cases.*

* Fisher, R. A., Statistical Methods for Research Workers. Smith, B. B., Forecasting the Price of Cotton, Jour. Amer. Stat. Association. Yule, G. U., Introduction to Statistical Methods.

Where multiple curvilinear correlation is employed using free-hand curves, no definite criteria of reliability have yet been determined. It is evident that these results can be no more reliable than similar results with linear relations, and in most cases are probably less so; that is, it would take a larger number of observations to secure equally valid results using the same number of variables. As a rule-of-thumb guide, Smith's correction of the multiple correlation coefficient for number of variables might be employed, reckoning each variable whose effect was curvilinearly measured as equal to two or more variables, depending roughly on how many constants would probably be required in an equation to express the shape of each of the functions analytically.

Thus if with 4 independent variables, only 30 cases were available, and a multiple correlation of $R = .80$ had been obtained, and it were decided that an equation of three constants (such as a cubic parabola) would be required to express each function, then the 4 functions would be equivalent to 12 linear variables.

Applying Smith's formula:

$$\bar{R}^2 = 1 - \frac{1 - R^2}{1 - m/n}$$

where
m = no. of variables
n = no. of cases
$R$ = multiple correlation observed
$\bar{R}$ = probable true multiple correlation

In this case the equation gives:

$$\bar{R}^2 = 1 - \frac{1 - 0.64}{1 - 12/30} = 0.40, \qquad \bar{R} = 0.63$$

The adjusted correlation, 0.63, indicates less than two-thirds as close a relationship as the apparent correlation of 0.80, since the closeness of relationship is shown by the squares of the correlation coefficients, which are 0.40 and 0.64, respectively.

Purely as a rough rule-of-thumb, it may be said that simple correlation results (2 variables) cannot be expected to have much stability unless there are at least 20 observations to work with; linear multiple correlation, with 3 or 4 variables, unless there are 30 or 40; and multiple curvilinear, unless there are 40 to 60. Where absolute limitation of the universe makes it necessary to work with smaller samples, then the rules of error previously mentioned should be referred to in interpreting the results, and the investigator should be duly modest in generalizing from his findings.

Significance of Results as Descriptions of Data*

* By Mordecai Ezekiel.

Correlation results may be used either as a description of the relations among variables in the particular data used, or as a basis for generalization as to relations in the universe from which the data were drawn. Generalization involves all of the problems which are treated statistically under the general head of "sampling." This present section will discuss the meaning of the several different ways of expressing correlation results solely as descriptions of the data studied, leaving aside the larger questions involved in generalizing from them to other data.

The several different results of correlation analysis may be named as follows. The symbols used for the several different constants are those used in the texts of Yule and Mills, and in the technical papers of Tolley, B. B. Smith, and Ezekiel.
Symbol ---- Name

Simple linear correlation:
$b_{xy}$ ---- Regression coefficient of X on Y
$r_{xy}$ ---- Correlation coefficient of X with Y
$X = a + b_{xy} Y$ ---- Regression equation of X on Y
$S_{xy}$ or [second symbol illegible] ---- Standard error of estimate, in estimating X linearly from Y

Simple curvilinear correlation:
$X = f(Y)$ ---- Regression curve of X on Y
$\rho_{xy}$ ---- Index of correlation of X with Y
$S$ [subscript and second symbol illegible] ---- Standard error of estimate, in estimating X curvilinearly from Y

Multiple linear correlation:
$b_{12.34 \ldots n}$ ---- Net regression coefficient of $X_1$ on $X_2$, variables $X_3$ to $X_n$ held constant
$R_{1.234 \ldots n}$ ---- Multiple correlation coefficient of $X_1$ with $X_2$ to $X_n$, together
$D_{1.234}$ ---- Determination coefficient of $X_1$ by $X_2$, $X_3$, $X_4$, etc., combined
$r_{12.34 \ldots n}$ ---- Partial correlation coefficient of $X_1$ with $X_2$, variables $X_3$ to $X_n$ held constant
$d_{12.34 \ldots n}$ ---- Partial determination coefficient of $X_1$ by $X_2$, the effect of $X_3$ and $X_4$ being allowed for
$X_1 = a + b_{12.34 \ldots n} X_2 + b_{13.24 \ldots n} X_3 + \text{(etc.)}$ ---- Net regression equation of $X_1$ on $X_2$, $X_3$, etc., to $X_n$
$S_{1.234}$ or [second symbol illegible] ---- Standard error of estimate, in estimating $X_1$ linearly from $X_2$, $X_3$, etc., to $X_n$

Multiple curvilinear correlation:
$X_1 = f_2(X_2)$ ---- Net regression curve of $X_1$ on $X_2$
$P_{1.234 \ldots n}$ ---- Index of multiple correlation of $X_1$ with $X_2$, $X_3$, etc., to $X_n$
$X_1 = a + f_2(X_2) + f_3(X_3) + \text{etc.}$ ---- Net regression equation of $X_1$ with $X_2$, $X_3$, etc., according to functional relations
$S_{1.234}$ or [second symbol illegible] ---- Standard error of estimate for estimating $X_1$ from $X_2$, $X_3$, etc., to $X_n$, using curvilinear relations

In spite of the large number of different symbols, due to the four-way classification as simple or multiple, linear or curvilinear, all of these different measures may be grouped into three different classes.
These are, first, the measures which show the change in the dependent variable for given changes in the independent; second, the measures which indicate what proportion of the observed variation in the dependent variable can be accounted for by variation in the one or more variables which have been related to it; and third, the measures which indicate within what limits of error the values of the dependent variable can be estimated when accompanying values of the independent variables are used.

The measures of the first type may be illustrated by reference to Figures 3(a) and 4. Figure 3 shows, by a straight line, the average yield of corn for any given inches of rainfall. These yields may be mathematically expressed by the equation of a straight line,

$$Y = a + b_{yx} X,$$

where Y gives the yield of corn, and X is the inches of rainfall. This equation is known as the regression or estimating equation. The coefficient $b_{yx}$, which shows the average increase in yield for each increase of one inch in rainfall, is known as the regression coefficient of yield on rainfall. If the analysis were turned around the other way, so as to estimate rainfall knowing the yield, the regression coefficient would then be $b_{xy}$, showing the average difference in rainfall with each increase of one bushel in yield, and would be termed the regression coefficient of rainfall on yield. It should be noted that $b_{xy}$ ordinarily cannot be computed from $b_{yx}$, but must be determined by a separate computation. Only in cases of perfect correlation are the two exactly related; then one is the reciprocal of the other.

Fig. 3.
Scatter diagrams showing (A) relation between rainfall in July and yield of corn, and (B) relation between rainfall in July and temperature in July.

Fig. 4. Hypothetical change in yield of corn with differences in rainfall.

Where there is a curvilinear relation between the variables, such as illustrated in Fig. 4, the changes in one variable for changes in the other are shown by the regression curve, $Y = f(X)$. This curve may be shown either by an analytical function, such as $Y = a + bX + cX^2$, from which values of Y can be computed for any given value of X, or by a graphic curve, or by a table from which values of Y may be determined for any particular value of X. In Fig. 4, for example, the average yield of corn is shown to be about 25 bushels for either 1 inch of rainfall or 6 inches, about 30 bushels for 2 inches, etc. Where there really is such a curvilinear relation, the regression curve obviously gives a more accurate indication of difference in yield with differences in rainfall than would the straight regression line.

Where there are two or more independent variables involved, the regression coefficients have a parallel meaning. For example, if in the equation:

$$X_1 = a + b_{12.3} X_2 + b_{13.2} X_3$$

$X_1$ is yield of corn in bushels, $X_2$ is rainfall in inches, and $X_3$ is temperature in degrees, the regression equation as a whole would show the average yield of corn for any observed combination of rainfall and temperature. The constant $b_{12.3}$, known as the net regression coefficient of $X_1$ on $X_2$, shows the average increase in yield for each one-inch increase in rainfall, the effect of changes in temperature being removed or allowed for; and similarly the net regression coefficient $b_{13.2}$ shows the average increase in yield, $X_1$, for each one-degree increase in temperature, $X_3$, with the effects of differences in rainfall allowed for or taken into account.
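Returning for a moment to the simple curvilinear case of Fig. 4, the following is a minimal sketch of how a straight line can report "no relationship" where a parabola fits closely. The six yields are invented, chosen only to agree with the values read from Fig. 4 in the text (about 25 bushels at 1 and 6 inches, about 30 at 2 inches); the helpers `solve` and `polyfit` are hypothetical names, and least squares is assumed as the fitting method.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, constant term first."""
    powers = range(degree + 1)
    a = [[sum(x ** (i + j) for x in xs) for j in powers] for i in powers]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in powers]
    return solve(a, b)


rainfall = [1, 2, 3, 4, 5, 6]                  # inches
yields = [25.0, 30.0, 32.5, 32.5, 30.0, 25.0]  # bushels, invented

line = polyfit(rainfall, yields, 1)   # Y = a + bX
curve = polyfit(rainfall, yields, 2)  # Y = a + bX + cX^2

print("straight line: a = %.2f, b = %.2f" % tuple(line))
print("parabola:      a = %.2f, b = %.2f, c = %.2f" % tuple(curve))
```

Because the invented yields rise and fall symmetrically, the fitted straight line is flat, while the parabola passes through every point.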
Where these net relations have been determined as curves instead of straight lines, the estimating or regression equation is stated:

$$X_1 = a + f_2(X_2) + f_3(X_3)$$

Here the value "$f_2(X_2)$" indicates the particular change in yield for each particular difference in rainfall, the effect of temperature being removed, and "$f_3(X_3)$" indicates the particular change in yield for each particular difference in temperature, the effect of rainfall being held constant. Again the equation as a whole indicates the average yield ($X_1$) observed with any particular combination of rainfall ($X_2$) and temperature ($X_3$) according to their separate curvilinear effects, rather than their linear effects.

The methods of estimating values of one variable from known values of one or more other variables, just discussed, form the basis for all the other measures of correlation which have been stated.

The second group of measures all indicate the closeness of the relations by showing what proportion of the variation in the dependent variable can be explained by the one or more variables which have been related to it. This may also be stated as what proportion of the original variation can be reproduced in estimates made from the other variables. These measures may be divided into two sub-groups: (a) those which determine the proportion by stating the standard deviation of the estimated values as a percentage of the standard deviation of the original, and (b) those which state the square of the standard deviation of the estimated variable as a percentage of the square of the standard deviation of the original.
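The two sub-groups can be checked on a small example. The five rainfall and yield figures below are invented for illustration; the sketch fits the straight-line regression by least squares, then compares the standard deviation of the estimated yields with that of the actual yields.

```python
import math

rainfall = [1, 2, 3, 4, 5]     # inches (invented)
yields = [24, 29, 28, 33, 36]  # bushels (invented)

n = len(rainfall)
mx, my = sum(rainfall) / n, sum(yields) / n
sxx = sum((x - mx) ** 2 for x in rainfall)
syy = sum((y - my) ** 2 for y in yields)
sxy = sum((x - mx) * (y - my) for x, y in zip(rainfall, yields))

b = sxy / sxx                  # regression coefficient of yield on rainfall
a = my - b * mx
estimates = [a + b * x for x in rainfall]


def std_dev(values):
    """Standard deviation about the mean, dividing by the number of cases."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


ratio = std_dev(estimates) / std_dev(yields)  # sub-group (a)
determination = ratio ** 2                    # sub-group (b)
r = sxy / math.sqrt(sxx * syy)                # correlation coefficient

print("ratio of standard deviations: %.4f" % ratio)
print("correlation coefficient:      %.4f" % r)
print("determination:                %.1f%%" % (100 * determination))
```

The ratio of the two standard deviations coincides with the correlation coefficient, and its square with the determination.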
The first sub-group includes the coefficient of correlation, $r_{xy}$, which measures the proportion for straight-line relation between two variables; the index of correlation, $\rho_{xy}$, which measures it for curvilinear relation between two variables, according to the particular curve employed; $R_{1.234 \ldots n}$, the coefficient of multiple correlation, which measures it for the relation between one variable, $X_1$, and the two or more others, $X_2$, $X_3$, etc., where only linear net regression lines are employed; and $P_{1.234 \ldots n}$, the index of multiple correlation, which measures it between $X_1$ and several other variables, where regression curves are employed to represent the net effect of one or more variables. For both the coefficient and index of multiple correlation, the symbols to the right of the decimal point in the subscript indicate the variables which are being used to estimate from, and the one subscript to the left of the decimal point indicates the one variable whose value is being estimated.

There is one further coefficient of the first sub-group, the coefficient of net correlation, $r_{12.34}$, which must be explained separately. This measures the closeness of relation between the two variables $X_1$ and $X_2$ when each is first adjusted to eliminate the effect of correlated changes in $X_3$ and $X_4$. It may also be stated as a measure of the extent to which the variation remaining in $X_1$ after $X_3$ and $X_4$ have been accounted for may be further reduced by introducing $X_2$ as a factor in making the estimate. However, it is not a direct measure of the proportion, but a rather involved mathematical function of it.

The second sub-group of measures are given the general name of measures of determination.
This name is used because in certain extremely simple cases, where one variable is known to be composed of several elements some of which are and some of which are not represented in another variable, the determination coefficient measures the proportion of all elements in the first variable which are present in the second. Where the variables are known to be causally related, the measure of determination may be used to indicate the extent to which one variable is determined by another by showing what per cent of the variation in one can be explained by the other. This is particularly useful, because the difference between the determination coefficient and 100% always gives the proportion which is not accounted for by the variable or variables concerned, which is not true in the case of correlation coefficients.

The determination measures, by definition, are all simply the squares of the respective correlation coefficients or indexes, and therefore parallel them in all cases. The measure $d_{12}$ is the determination of $X_1$ by $X_2$, in simple linear or curvilinear correlation, while the measure $D_{1.234}$ is the determination of $X_1$ by $X_2$, $X_3$, etc., in multiple linear or curvilinear correlation.

The value $d_{12.34}$, however, has a meaning distinct from that of the coefficient of partial or net correlation. It is instead the partial determination of $X_1$ by $X_2$, and shows the proportion of the total (squared) variation in $X_1$ which can be accounted for by variation in $X_2$, while simultaneously accounting for other parts of the variation by the net effect of $X_3$, $X_4$, etc. The sum of the partial determinations by $X_2$, $X_3$, $X_4$, etc., gives the total determination by all independent factors; thus

$$D_{1.234} = d_{12.34} + d_{13.24} + d_{14.23}$$

However, these values are frequently difficult to explain, and especial care must be taken to interpret them properly.*

The third group
of measures all indicate how large is the variation remaining in the dependent variable after the effects of the other variable or variables are allowed for, measured not in proportion to anything else, but in absolute units of the variable being estimated. Thus in the relation shown in Figures 3 or 4, it would show the average of the number of bushels of corn by which the several actual yields differed from the expected yields. This average, however, would be computed by taking the standard deviation of these differences. The standard error of estimate, as it is called, is therefore simply the standard deviation of all of the "errors," or failures to account for the variations in the dependent factor, in the group of records covered by the analysis. About two-thirds of the "errors," or differences between actual and estimated values, will therefore come within the range shown by the standard error, and about one-third outside of it. Where these errors are normally distributed, this measure also indicates about how frequently errors of two, three, or more times the standard error of estimate will occur.

The same name is used, the standard error of estimate, no matter whether the estimate is made from one or more variables, or by linear or curvilinear relations. In each case it is simply a direct measure of the actual variation not accounted for by the particular variables used and relations determined.

* For details of interpreting partial determination coefficients, see Factors Affecting the Price of Hogs, U. S. D. A. Bul. 1440, p. 35; Factors Related to Lamb Prices, Jour. Pol. Econ., April, 1927, pp. 258-254, both by Mordecai Ezekiel; and Adjusting Hog Production to Market Demand, by F. F. Elliott, Ill. Agr. Expt. Sta. Bulletin 293, pp. 554-557.
[Fig. 5. Three sets of simple correlation relationships (cases A, B, and C) between rainfall in inches and yield of corn.]

The particular usefulness of each of the three different groups of correlation measures is illustrated in Figure 5, which shows 3 sets of simple correlation relationships, with various assumed results. Here the regression coefficient is smaller in A than in B. In the first case an additional inch of rain causes an average increase of 2.5 bushels in yield, as compared with an increase of 3.0 bushels in B, the second case. But in case A, practically all of the variation in yield is apparently due to rainfall, as shown by the high correlation (.90) and the small size of the standard error of estimate (2 bushels); while in case B, apparently factors other than rainfall cause most of the differences in yield, as indicated by the low correlation (.30) and the larger standard error of estimate (5 bushels). In terms of determination, apparently about 81% of the differences in yield are related to differences in rainfall in the first case, and only about 9% in the second.

In comparison with A and B, case C has much less variable yields, ranging from only about 28 bushels to 32 bushels, compared with a range of 24 to 35 in case A, and 22 to 40 in case B. Apparently only a small part (16%) of the variation in yields is associated with rainfall differences, as is indicated by the low correlation (.40). An increase of 1 inch in rainfall apparently causes only a 0.5 bushel increase in yield. Yet in spite of this low relation, it is possible to estimate yields more accurately, given the rainfall, in this case than in either of the other two, as shown by the standard error of estimate, 1 bushel as compared to 2.0 bushels for A and 5.0 for B. This is because the original variation in yields is so slight that even the small relation shown to rainfall is enough to make it possible to forecast yields more accurately than in either of the other cases.
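The arithmetic behind this comparison can be laid out in a few lines. The regression coefficients, correlations, and standard errors below are the ones quoted for cases A, B, and C; the "implied" standard deviation of yields is not given in the text and is derived here under the assumption that the unadjusted standard error of estimate equals the standard deviation of the dependent variable multiplied by $\sqrt{1 - r^2}$.

```python
import math

# Values quoted in the text for the three cases of Figure 5.
cases = {
    "A": {"b": 2.5, "r": 0.90, "se": 2.0},
    "B": {"b": 3.0, "r": 0.30, "se": 5.0},
    "C": {"b": 0.5, "r": 0.40, "se": 1.0},
}

summary = {}
for name, c in cases.items():
    determination = 100 * c["r"] ** 2                   # per cent accounted for
    implied_sd = c["se"] / math.sqrt(1 - c["r"] ** 2)   # assumed identity
    summary[name] = (determination, implied_sd)
    print("case %s: b = %.1f bu./inch, determination = %.0f%%, "
          "std. error = %.1f bu., implied s.d. of yields = %.2f bu."
          % (name, c["b"], determination, c["se"], implied_sd))

largest_change = max(cases, key=lambda k: cases[k]["b"])
closest_relation = max(cases, key=lambda k: cases[k]["r"])
most_accurate = min(cases, key=lambda k: cases[k]["se"])
print("largest change per inch:", largest_change)
print("closest relation:       ", closest_relation)
print("most accurate estimate: ", most_accurate)
```

On these figures the implied spread of yields in case C is only about a bushel, which is why its estimates are the most accurate even though its correlation is low.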
These three cases illustrate the relative place of each of the three types of correlation measures. Case B shows the greatest change in yield for a given change in rainfall (the regression measure); case A shows the highest proportion of differences in yields accounted for by rainfall (the correlation or determination measure); while case C shows the greatest accuracy of estimate (the error of estimate measure). To which of these measures to pay the most attention in a particular investigation depends upon the particular phase of the investigation which is most important: the amount of change, the proportionate importance, or the accuracy of estimate. All have their place and none should be entirely overlooked or ignored.

Effect of Errors in Data*

* By Mordecai Ezekiel.

Errors of observation in the data for the several variables, such, for example, as might be due to inaccuracies in estimating or reporting corn yields or rainfall, have a varying effect upon the significance of the correlation results. The exact effect depends upon two conditions: first, whether or not the errors are random and unbiased (that is, falling purely by chance equally above and below the true values of the variables and uncorrelated with the values of the variables); and second, whether they occur in the dependent factor, the independent factors, or both.

If (1) the errors are random and unbiased, (2) they occur in only the dependent factor, and (3) a large enough number of observations are available to allow the errors to cancel each other out, then their presence will lower the correlation coefficient and increase the standard error of estimate, but will have only a negligible effect upon the slope of the net regression line or the shape of the net regression curve.
In either simple or multiple correlation in an ideal case, the values of the regression coefficients would be exactly the same as if there had been no errors of observation in the dependent variable, but in practice there would rarely be such perfect cancelling out of the errors as this.

In case the random errors are in one or more independent factors, with none in the dependent, they would have the effect of reducing the correlation, increasing the standard error of estimate, and in addition flattening out the regression line or curve for the variable (or variables) in which the errors occurred to a less marked slope than it should have. Stated another way, the regression coefficients for the inaccurately observed independent variables would show smaller values than they would have had if the data had been accurately recorded. In addition, in cases of multiple correlation the net regressions for those independent variables which had no errors of observation might be slightly altered, either up or down, due to the failure to make adequate correction for the effects of the variables which were inaccurately observed. Such differences in these regressions, however, would probably be minor compared with the reductions in the values of the regressions for the variables which were themselves inaccurately recorded.

In cases where both the dependent variable and one or more of the independent variables have errors of observation, the effect on the correlation result is a combination of the consequences mentioned above.
The closeness of correlation will be reduced, and the standard error of estimate increased; the regressions on the independent variables which are inaccurately observed will be reduced; but the regressions on the independent variables accurately observed will not be affected by the errors in the dependent variable, which, given enough cases, will tend to cancel out, but only by the failure to allow properly for the effect of such of the independent variables as were inaccurately observed.

The way that random errors might actually work out in practice may be illustrated by an actual example. Over the 17 years from 1907 to 1923, the monthly price of lambs shows a very high correlation with the price of wool and the price of dressed lamb. Using X2 for the price of wool in cents per pound, X3 for prices of dressed lamb in cents per pound, and X1 for the price of live lambs in cents per pound, multiple correlation gives for the 204 observations:

\[ R_{1.23} = .991 \quad \text{and} \quad x_1 = .144\,x_2 + .354\,x_3 \]

To test what effect random errors would have had on this correlation, two dice were thrown 204 times, giving random values from 2 to 12. These values were then added to the dependent variable, and a similar set of values to one independent variable, to see what effect this would have on the results. In the following tabulation, the notation "X + e" is used to designate the variables to whose values these "random errors" had been added:

Independent variables : Dependent variable : Multiple correlation : Regression equation
X2 and X3             : X1                 : .991                 : .144 x2 + .354 x3
X2 and X3             : X1 + e             : .821                 : .112 x2 + .414 x3
X2 and X3 + e         : X1                 : .953                 : .1[illegible] x2 + .277 x3
X2 and X3 + e         : X1 + e             : .804                 : .152 x2 + .306 x3

These results illustrate the principles set forth above. The introduction of random errors into the dependent variable (X1) reduces the correlation, but does not greatly change the size of the two regression coefficients.
It would appear, especially from the amount of the reduction in the net regression X3, that the errors in this case may not have been completely randomly distributed and uncorrelated with X1, even though determined by throws of dice.

But the second modification, where the error is introduced into the independent variable X3 instead, is much more striking. The correlation is not reduced so much as in the first case, and the regression of X1 on X2 is changed only slightly from the original value (and increased, as it happens). The net regression of X1 on X3 + e, however, is only 3/4 as large as was the net regression of X1 on X3, in spite of the fact that the error introduced was only enough to raise the standard deviation of X3 from 6.14 to 6.64.

The final case, with errors introduced into both X1 and X3, shows the lowest correlation of any, as would be expected. The net regression of X1 + e on X2 is but little different from what the regression of X1 on X2 was, while the net regression of X1 + e on X3 + e, though slightly larger than what the regression of X1 on X3 + e was, is still quite significantly lower than the regression of X1 on X3. The regression equation in this last case, where X1 + e is the dependent, is not greatly different from what it was in the preceding case with X1 as the dependent, in spite of the fact that one of the independent variables, X3, had a significant random error of observation in its values both times.

These cases illustrate the extent to which random errors may confuse the true relations if they are allowed to creep into the observations. Just how great an effect upon the results such random errors will have depends upon the magnitude of the errors, the original variations in the variables, and the closeness of the inter-correlations.
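The dice experiment can be imitated on made-up data. The sketch below does not use the lamb-price series: the sample size, the "true" slope of 0.35, and the spread of the variables are all hypothetical, chosen only for illustration. The dice totals are centred on zero, which changes neither slopes nor correlations. Under these assumptions, errors added to the dependent variable should leave the slope nearly unchanged while lowering the correlation, and errors added to the independent variable should flatten the slope.

```python
import math
import random


def slope_and_r(xs, ys):
    """Least-squares slope of ys on xs, and their simple correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


rng = random.Random(1928)   # fixed seed so the run is repeatable
n = 2000                    # hypothetical number of observations
true_slope = 0.35           # hypothetical "true" regression coefficient

# Hypothetical accurately observed series.
x = [rng.gauss(20.0, 6.0) for _ in range(n)]
y = [true_slope * xi + rng.gauss(0.0, 0.5) for xi in x]


def dice_error():
    """Total of two dice, centred on zero (ranges from -5 to +5)."""
    return rng.randint(1, 6) + rng.randint(1, 6) - 7


y_err = [yi + dice_error() for yi in y]   # errors in the dependent variable
x_err = [xi + dice_error() for xi in x]   # errors in the independent variable

slope_clean, r_clean = slope_and_r(x, y)
slope_dep, r_dep = slope_and_r(x, y_err)
slope_ind, r_ind = slope_and_r(x_err, y)

print("no errors:            slope %.3f  r %.3f" % (slope_clean, r_clean))
print("errors in dependent:  slope %.3f  r %.3f" % (slope_dep, r_dep))
print("errors in independent: slope %.3f  r %.3f" % (slope_ind, r_ind))
```

With a sample this large the cancelling out in the dependent variable is nearly complete; with only a few dozen cases the slope would wander more, which is the situation the text describes for the 204 lamb-price observations.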
While equations can be derived to show how great a reduction in correlation errors of a given magnitude will produce, they are of little practical use in economic work, since it is usually difficult enough to determine whether there are errors of observation or not, much less to determine what magnitude they have. In the problem given, the significant values determining the effect of the errors are as follows:

\[ \sigma_{x_1} = 3.96 \qquad \sigma_{x_1+e} = 4.74 \qquad \sigma_{x_3} = 6.14 \qquad \sigma_{x_3+e} = 6.64 \]

For errors in the dependent variable, the relation between the true and the apparent correlation is indicated by the equation:

\[ R^2_{(1+e).23 \ldots n} = \frac{R^2_{1.23 \ldots n}}{1 + \sigma_e^2 / \sigma_{x_1}^2} \]

where \(\sigma_e\) is the standard deviation of the errors. This gives what the new correlation would be if the errors were truly random, so that the new regression equation came out as identical with the old. In the problem given, this gives an expected value for R of .827, as compared to the .821 actually obtained.

The practical significance of the principles which are stated here is that if there is known to be a large but random error in observing some variable, that variable may still be used as the dependent variable in a correlation study without making the regressions or estimating equation very far wrong, if determined with a large number of cases; but on the other hand, any use of that variable as an independent variable will be certain to yield results which fall short of the actual relations.

In cases where the errors are biased, they tend to make the results of correlation analysis more or less in error, quite regardless of the variables to which they apply. If the errors tend either to magnify or minimize the differences which actually exist, they will have a parallel effect on the regression coefficients if they apply to the dependent variable, and an inverse effect if they apply to an independent variable. There are so many different types of bias, however, that no more definite statement of the effects can be laid down.
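The expected value of R quoted for the lamb-price problem can be checked from the standard deviations given. The sketch below assumes the errors are uncorrelated with the true values, so that the variance they add is the difference between the observed and the true variance of X1; the function name is illustrative only. It reduces the squared correlation in the ratio of the true to the observed variance of the dependent variable.

```python
import math


def expected_r_with_errors(r_true, sd_true, sd_observed):
    """Expected multiple correlation after random errors are added to the
    dependent variable, assuming the regression equation is unchanged."""
    # Variance contributed by the errors (assumed uncorrelated with X1).
    var_error = sd_observed ** 2 - sd_true ** 2
    r_squared = r_true ** 2 / (1.0 + var_error / sd_true ** 2)
    return math.sqrt(r_squared)


# Figures from the lamb-price example: R = .991, sigma of X1 = 3.96,
# sigma of X1 + e = 4.74.
expected = expected_r_with_errors(0.991, 3.96, 4.74)
print("expected R with errors in X1: %.4f" % expected)
```

The result comes out at about .83, agreeing with the .827 stated in the text to within rounding of the published standard deviations, and a little above the .821 actually obtained.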
Random errors have the same type of effect in the case of curvilinear correlation that they do in linear correlation, since if they are truly random they will tend to be balanced out along all the portions of the regression curve alike if they are in the dependent variable; or tend to confuse the relations along the curve if they are in the independent variable, and so reduce the differences observed.

Biased errors, on the contrary, may happen to be concentrated along certain portions of the range, and hence have a much more marked effect at one point than at another. While this might seriously disturb the significance of the curve, it probably would have an equally disastrous effect on the reliability of the straight line. About the only real difference between linearity and curvilinearity with regard to errors is that random errors in the dependent variable [...] be necessary to secure equally valuable results for a curvilinear regression.

Where, with random errors, there are not enough cases available to "balance them out", the effect of the errors is to throw a varying amount of error into the conclusions, the exact amount of the error depending on how closely the errors approach being cancelled out. The illustrative case, where with over 200 observations the regressions were still changed somewhat, probably indicates what may be obtained by a combination of slight departures from true "randomness" in the errors with a sample not quite large enough entirely to eliminate all of the resulting instability. This may be nearer to what would usually happen in practice than the theoretical complete elimination of the errors in the dependent variable.

Correlation as Description of Geographic Data*

Certain peculiarities entering into analysis of geographic series call for special consideration. This is particularly true when correlation is used to describe certain attributes of the series.
The validity of correlation methods with such data depends upon the purposes for which the data were collected. One purpose, for example, might be to study the factors affecting the price of a commodity. Instead of building up the usual time series, extending over, let us say, a period of fifty years, it might be possible to secure data at fifty or a hundred localities dispersed in space rather than in time. Such series would present a cross-section view of those factors believed to be related to the price of a commodity at a given time. The variations between these series could then be compared and perhaps described by correlation methods.

* By Dr. A. G. Black, University of Minnesota.

Results of correlation analysis of geographic data may be of slight value, however, because of certain difficulties frequently encountered [...] positions in space". But geographic observations are ordinarily not separated by equal spatial differences and are usually not "successive". In the same way that trends in time obscure the relationships being studied, so many analogous "trends" "due to successive positions in space" obscure the relationships found in geographic data.* This difficulty is the more serious in the latter case because it is practically impossible to remove geographic trends; whereas a number of methods have been devised to remove trends from time series.

If it were possible to secure observations at known distances from some origin, the usual methods of fitting trend lines would still not be applicable, because the assumption that the magnitude of the observations will bear some functional relation to the distance from the origin is not true. An observation taken two hundred miles north of the origin will not be the same as one made two hundred miles south.
Transportation facilities, soil, rainfall, temperature, length of growing season, type of population and countless other factors of a similar nature will be somewhat different for each of the infinite number of points equally distant from the origin.

Trend in time series is removed to correct for those unmeasurable and undefinable forces which appear to exert their influences more or less uniformly over a period. It is assumed that these indefinite factors exert an influence that can be expressed by some function of time. Except in the event that observations in space are made successively, it cannot be assumed that any set of forces affects geographic data in such a way that its influence can be stated as a function of dispersion in space.

It may be objected that even if it were possible to eliminate the "geographic trends", the observations taken at different points would not be useful in economic analysis because the relationships existing between the variables would probably differ from place to place for reasons other than mere dispersion in space. This objection applies equally well to time series. There is no more reason for believing that successive observations in time are representative of homogeneous conditions than are successive observations in space. The major objective of much time-series analysis, moreover, is to isolate the effect of these very deviations of conditions from the trend. In geographic analysis, the major objective usually is to establish relationships between the varying geographic factors and some dependent variable. The geographic "trend" may or may not be removed first. In the studies by Russian geographers of the effect of distance from market on prices and production, the procedure has usually been to remove the "trend" by setting up a series of concentric zones within which prices and production are assumed to be the same and then explain the variations from this.
In the United States, distance is usually included as an independent variable, just as time often is in time-series analysis.

* "... the problems arising from correlation due to successive positions in space are exactly similar to those due to successive occurrence in time ..." "Student", Biometrika, Vol. X, p. 179.

The attempts to analyze land values by means of correlation analysis illustrate the nature of the problem of geographic analysis. In these studies, distance from local markets is usually included as one of the variables; but location with respect to distant markets and other points of influence is usually not included. It is assumed that the individual factors exert an influence proportionate to their magnitude regardless of the location in these latter respects.* Mr. H. A. Wallace has published the results of a study showing the relationship of certain measurable characteristics to land values in Iowa in which not even distance from local markets is considered.** The average land value per acre (without buildings) for each county, as reported by the 1925 Federal Census of Agriculture, was correlated with county averages of corn yield per acre, percentage of land in corn, percentage of land in small grains, and percentage of land not plowable. A coefficient of multiple correlation of .917 was obtained. The study published by Mr. G. C. Haas in 1922, in which he used correlation analysis in an attempt to secure a formula useful in ascertaining the value of a particular farm in an area, included distance from local market as one variable.*** The other independent variables were cost of buildings per acre depreciated to 1919, the date of the study, a land classification index, and a productivity index. The dependent variable was actual sale prices. The coefficient of correlation was .81. Inasmuch as the sales were dispersed over a whole county, they formed a geographic series. Recently the Minnesota Experiment Station has correlated the average value per acre of land and buildings for the counties of the state, as reported by the Federal Census of Agriculture for 1925, with the value of buildings per acre, per cent of land improved, a productivity index, per cent of land in woodland, an index of roads, and a pasture index. The coefficient of multiple correlation obtained is .97.

* That is, "proportionate" within the limits of correlation analysis as expressed by the coefficients in the regression equation.
** Journal of Land and Public Utility Economics, October, 1926.
*** Technical Bulletin 9, Minnesota Agricultural Experiment Station, 1922.

If sufficient care is exercised in selecting the variables which are logically related with the dependent variable, and in securing measurements of them, the results of such correlation analysis should constitute a worthwhile description of the relationships involved, one that is particularly useful because it attaches definite numerical values to the influence of each variable. A complete correlation analysis, including the partial coefficients, gives a more definite and complete description of the series of data and their inter-relationships than could be possible by ordinary descriptive methods.

But one must not let the size of the multiple coefficient lead him into thinking that these six factors constitute a complete description of the structure of land values. To begin with, some very significant factors, such as distance to local market, location with respect to main highways, size and quality of local market, size of farm, and community organization, balance out largely in the county averages. These and other similar factors explain the difference between the coefficients of .70 to .80 usually obtained with individual farms as the units of observation and those of .90 to .98 usually obtained with counties as the units of observation. A coefficient of .74 means roughly that 44 per cent of the
variations must be otherwise related; a coefficient of .94 that only 12 per cent must be otherwise related. If these figures are accepted as generalizations of only limited experience, it would appear that the individual farm factors balancing out in county averages represent roughly a third of the total relationships appearing in the data for a given state.

Secondly, the influence of the larger geographic trends has not been measured. Thus all the county averages in Iowa are higher than those in states further west, and lower than those in states to the east, because of relative distance from consuming centers, supplies of capital and interest rates, maturity of the region, and other similar factors taking on the nature of general geographic trends. In the totality of relationships involved in the land value problem, these factors figure in an important way also. An analysis of state averages would throw some light on the extent of such relationships.

The problem of geographic trend may be further illustrated by the geographic relationship between corn and hog prices. Two price series were constructed for the band of states extending from Maine to California. The farm price of hogs per hundredweight constituted one series, and the farm price of corn the second series. The coefficient of correlation was .72. But surely this does not represent a simple direct cause-and-effect relationship existing between the series. It seems rather to show that both corn and hog prices are high or low at certain localities due to distances from the principal areas of production. In other words, the correlation reflects the influence of geographic "trends". Prices of both hogs and corn are high in the East and in the West and low at the centers of production.

The following hypothetical case will serve to illustrate in extreme form how geographic relationships may enter into correlation analysis.
Let us assume that two commodities, A and B, are produced in different localities, but that A is used in the production of B. Let us further assume that the production period is short and that there are few alternative uses and few substitutes for A. Under such conditions the changes in the price of the two goods will be closely related. However, in the locality where A is produced, its price will be relatively low, and the price of B relatively high, because of transportation costs. The opposite will be true of the prices in the locality where B is produced. If an analysis of the prices of the two commodities were made, it is clear that negative correlation would appear, even though, if it were possible to set up two series showing the prices of the goods at the same place over a period of time, without doubt there would be a high positive correlation. The correlation coefficient from the geographic analysis would be of slight service in showing the direct relation of the prices of the two commodities. It would, however, be a useful descriptive device showing how the prices of the two goods behaved over a wide area. The high negative correlation would tell us that as we moved from one point to another, the prices changed inversely, and that high prices of one commodity were associated with low prices of the other.

An extreme case of variations in geographic conditions affecting the analysis appears in the relation between percentage of tenancy in the United States and the average value of farm land. It is generally assumed that the rate of tenancy varies directly with the land values. However, the correlation coefficient is only .07. But if thirteen southern states are excluded from the series, the correlation for the remaining states is .58. The southern states excluded have a high percentage of tenancy and a low average land value. This is sufficient to offset the opposite relationship found in the northern states.
The North and South have an entirely different economic, social and agricultural history. In consequence, factors affecting farm tenancy are quite different in the two regions, or more precisely, the factors affecting the general level of farm tenancy are different. The percentage of tenancy in a given state is determined by two sets of factors: those which set the general level and those which determine the position with respect to the general level. When state average land values and tenancy are correlated, the series consist of observations affected principally by factors other than land values which influence the general level. These other conditions differ greatly from state to state.

If land values and percentage of tenancy by counties within a single state are correlated, the same results appear, although conditions within a given state are more nearly homogeneous than within the entire country. Illinois in 1920 had an average tenancy rate of 42.7 per cent and an average land value of $187.60 per acre. Mississippi had 66.1 per cent tenancy and an average land value of $43.40 per acre. There is no indication in this comparison that high land values are associated with a high percentage of tenancy. However, when land values and per cent of tenancy by counties in these states are correlated, the coefficient of correlation is .64 for Mississippi and .83 for Illinois.

At least a part of the inconclusive result with the state averages is due to definition. The term "tenancy" connotes an entirely different type of agricultural organization in the two regions. Geographic analysis can always expect to encounter this type of difficulty. The wheat whose prices in different areas are correlated is far from being the same sort of wheat.

The foregoing discussion would point to the general conclusion that the accuracy of correlation as a description of geographic relationships depends upon the uniformity of the data.
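The way in which differing general levels can hide a relationship that holds within each region may be seen from a small made-up example. The figures below are hypothetical and are not the census data discussed above; they are arranged so that tenancy rises with land value inside each of two regions, while the low-value region has the higher general level of tenancy.

```python
import math


def correlation(xs, ys):
    """Simple (Pearsonian) coefficient of correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical land values (dollars per acre) and tenancy (per cent).
north_land = [100, 120, 140, 160, 180]
north_tenancy = [20, 25, 30, 35, 40]
south_land = [20, 30, 40, 50, 60]
south_tenancy = [50, 55, 60, 65, 70]

r_north = correlation(north_land, north_tenancy)
r_south = correlation(south_land, south_tenancy)
r_pooled = correlation(north_land + south_land,
                       north_tenancy + south_tenancy)

print("within North: %.2f" % r_north)
print("within South: %.2f" % r_south)
print("both regions pooled: %.2f" % r_pooled)
```

In this contrived case the pooled coefficient is actually negative although each region taken alone shows a perfect positive relation. The census figures are less extreme, but the mechanism is the one the text describes: the pooled coefficient chiefly reflects the factors that set the general level in each region.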
Geographic dispersion generally results in heterogeneity because the individual observations are largely the resultant of different sets of forces. The heterogeneity may be due either to a changing basis of definition or to changes in those factors which bring about the basic levels of the variables being studied. Some of these conditions can be used as variables, or analyzed by cross-tabulation methods, as illustrated in an earlier section. But very often they cannot; or a very considerable proportion of them cannot. Some of them may take sufficiently the form of a continuum so that they may be looked upon as "trends"; but they lack many of the essential elements of time-series trends, making the handling of such trends by "removing" them from the data, or by considering them as variables, as with time-series trends, a difficult and often dubious procedure.

Multiple Correlation Mechanics*

The methods of multiple correlation analysis are not generally set forth in much detail in the textbooks, and hence need to be discussed in this handbook.

The working out of multiple correlation problems looks and sounds hard and puzzling. In reality it is only long and tedious. Ordinarily one can gain a fair comprehension of statistical technique merely by studying statistical methods. Because of its considerable detail, however, multiple correlation can be learned only by working through all the details of it and discovering the meaning of the results. Many workers using such analysis soon find themselves immersed in these details and abandon their attempt with the firm conviction that multiple correlation is too "complicated" or "technical" for their purposes. Perhaps it is; but the principles involved in it are few and simple. Once a knowledge of these principles is ingrained, the details of the mechanics fall into their proper categories and are comprehended without confusion.
The emphasis in this discussion will therefore not be so much on details as upon a general understanding of procedures and the reasons for them. Those wanting a complete manual (without reference to theory) of the arithmetic procedure of carrying through a multiple correlation solution, with step by step directions and full illustration, including all the shortcuts, will find it in the mimeographed publication issued by the Bureau of Agricultural Economics, entitled "The Use of Punch Card Equipment in Handling Multiple Correlation Problems." The complete statement of the statistical theory involved, with briefer treatment of methods, is given in a companion report, "Correlation Theory and Method Applied to Agricultural Research".

For our first simple illustrative case, let us take three series, u, v and x, the latter being the dependent variable. For example, u might measure the rainfall, v the sunshine, and x the yield of crops on a number of different farms. All three series are first expressed as deviations from their respective averages, and the deviations set down in three respective columns so that the three values which go together for one farm are on the same line. The first statistical objective of multiple correlation is to find two multipliers (more if there are more than two independent variables), one for multiplying the values in the u column by, and one for multiplying the values of v by. These multipliers are the "net regression coefficients". They may be positive or negative in value, or one may be positive and the other negative.
After the values of u and v have been multiplied throughout by their respective net regression coefficients and the products of these multiplications set down in two new columns, each pair of products, one for each line, is cross-added and the sums so secured are listed in a new column headed x', known as the "regression estimates of x". The objective is to find the particular net regression coefficients which will result in sums (x' values) most nearly approximating the values of x, or specifically, having the highest possible correlation with x. Once these coefficients have been found, the first statistical objective of multiple correlation has been attained, for they define numerically the net linear relations of x to the other two.

* By Bradford B. Smith, Cleveland Trust Company

The next statistical objective of multiple correlation is to determine actually the Pearsonian coefficient of correlation between the x values and x' values, between the "actuals" and the "estimates". This is about all there is to multiple correlation processes, per se; the rest is but amplification and development of ways and means of attaining these two objectives.

The net regression coefficients for multiplying the independent variables by in the process of securing the estimates of x (x' in the preceding) are usually symbolized in statistical literature by the letter b, with distinguishing subscripts to designate to which variable each belongs:

\[ b_u u + b_v v = x' \]

The coefficient of multiple correlation is designated by the capital letter R, to distinguish it from the simple or gross correlation between two variables, r.

(a) How to Find Values of b.

If there were but two farms for which there were measurements of the values of u, v, and x, it would be a simple matter to find the values of b. Suppose for the first case, u was +4, v was -1 and x was +2; and for the second u was -1, v was +2 and x was +7.
It would only be necessary to set up two equations, observation equations, as they are called, which have two unknowns, and solve them simultaneously:

      u         v        x
    +4 b_u  - 1 b_v  =  +2
    -1 b_u  + 2 b_v  =  +7

Obviously a value for b_u and a value for b_v could be found which would fit the two equations perfectly. When the left-hand sides of the equations were then evaluated, which will be recognized as the process of determining the values of x', these results, the regression estimates, would exactly coincide with the values of x. There would be perfect multiple correlation.

Observation :       u        :       v        :   x
number      : Terms in b_u   : Terms in b_v   :
     1      :      +4        :      -1        :  +2
     2      :      -1        :      +2        :  +7
     3      :      -6        :      +3        :  +4
     4      :      +7        :      +1        :  -6
     5      :      -4        :      -5        :  -7
Sums        :       0        :       0        :   0
Means       :       0        :       0        :   0

In the above little table, the data for three more cases are shown. Instead of writing out the b_u and the b_v for each of the observation equations, these symbols are given once at the top. Here are five observation equations with but two unknowns, and we are confronted with the necessity of finding values for these two unknowns which will most nearly fit all five equations. Obviously it is not going to be possible to determine the values of b_u and b_v by the method of solving simultaneously any two of the equations, for each pair of equations would give different results. Some average result is what is wanted, and the method for it is the Method of Least Squares.

From these five observation equations, two "normal" equations are formed, one for each unknown. These normal equations are then solved simultaneously by any method one cares to use and the values of b determined. The formulation of normal equations follows a definite sequence, which is well worth memorizing:

(1) Multiply each observation equation by its coefficient of the first unknown.

(2) Sum the resulting equations, which gives the first normal equation.
(3) Multiply each observation equation through by its coefficient of the second unknown and sum the results, which gives the second normal equation.

(4) Etc., if there are other unknowns.

The first unknown in the example is b_u and its coefficient in the first equation is +4. Hence the first equation is multiplied through by +4. The second equation is similarly multiplied through by -1, and so on. The sum of the extensions so secured constitutes the first normal equation.

    +4 (+4 b_u - 1 b_v = +2)  =  ( 16 b_u -  4 b_v =  +8)
    -1 (-1 b_u + 2 b_v = +7)  =  (  1 b_u -  2 b_v =  -7)
    -6 (-6 b_u + 3 b_v = +4)  =  ( 36 b_u - 18 b_v = -24)
    +7 (+7 b_u + 1 b_v = -6)  =  ( 49 b_u +  7 b_v = -42)
    -4 (-4 b_u - 5 b_v = -7)  =  ( 16 b_u + 20 b_v = +28)

    First normal equation ......  118 b_u +  3 b_v = -37

The second unknown is b_v and its coefficient in the first equation is -1. Hence, multiply through by -1, and in analogous fashion for the other equations.

    -1 (+4 b_u - 1 b_v = +2)  =  ( -4 b_u +  1 b_v =  -2)
    +2 (-1 b_u + 2 b_v = +7)  =  ( -2 b_u +  4 b_v = +14)
    +3 (-6 b_u + 3 b_v = +4)  =  (-18 b_u +  9 b_v = +12)
    +1 (+7 b_u + 1 b_v = -6)  =  ( +7 b_u +  1 b_v =  -6)
    -5 (-4 b_u - 5 b_v = -7)  =  (+20 b_u + 25 b_v = +35)

    Second normal equation .....   +3 b_u + 40 b_v = +53

When these two equations are solved simultaneously, b_u is found to equal -.3478, and b_v to equal 1.3509, which are the two net regression coefficients which were to be found.

A moment's consideration of these normal equations shows that the coefficient of b_u in the first one, 118, is nothing but the sum of the squares of the u values. The coefficient of b_v in this first equation, +3, is the sum of the products of corresponding values of u and v, which is identical with the coefficient of b_u in the second equation. Some duplicate work has therefore been done, to be avoided in the future. The coefficient of b_v in the second normal equation is the sum of the squares of the v values.
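The forming and solving of the two normal equations can be checked in a few lines. The sketch below uses the five observations of the example; the variable names are illustrative, and the solution by determinants is only one of the "any method one cares to use" that the text allows.

```python
# The five observations of the example, as deviations from their means.
u = [4, -1, -6, 7, -4]
v = [-1, 2, 3, 1, -5]
x = [2, 7, 4, -6, -7]


def sum_products(a, b):
    """Sum of the products of corresponding values of two series."""
    return sum(p * q for p, q in zip(a, b))


# Coefficients and right-hand sides of the normal equations.
s_uu = sum_products(u, u)   # sum of squares of u
s_uv = sum_products(u, v)   # sum of products of u and v
s_ux = sum_products(u, x)   # sum of products of u and x
s_vv = sum_products(v, v)   # sum of squares of v
s_vx = sum_products(v, x)   # sum of products of v and x

# Normal equations:
#   s_uu * b_u + s_uv * b_v = s_ux
#   s_uv * b_u + s_vv * b_v = s_vx
# solved here by determinants.
det = s_uu * s_vv - s_uv * s_uv
b_u = (s_ux * s_vv - s_uv * s_vx) / det
b_v = (s_uu * s_vx - s_uv * s_ux) / det

print("normal equation 1: %d b_u + %d b_v = %d" % (s_uu, s_uv, s_ux))
print("normal equation 2: %d b_u + %d b_v = %d" % (s_uv, s_vv, s_vx))
print("b_u = %.4f   b_v = %.4f" % (b_u, b_v))
```

Carried through without intermediate rounding, the solution agrees with the coefficients given in the text to about three decimal places.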
The terms on the right of the equality signs in the normal equations are in one case the sum of the products of corresponding values of u and x, and in the other of v and x. The normal equations might therefore be expressed algebraically:

$$ b_u\,\Sigma(u^2) + b_v\,\Sigma(uv) = \Sigma(ux) $$

$$ b_u\,\Sigma(uv) + b_v\,\Sigma(v^2) = \Sigma(vx) $$

The arithmetic values sought for use in the normal equations are thus $\Sigma(u^2)$, $\Sigma(uv)$, $\Sigma(ux)$, $\Sigma(v^2)$, $\Sigma(vx)$. A systematic scheme for securing these is illustrated in the following table:

    Observation        Data                      Extensions
    number         u     v     x      u²     uv     ux      v²     vx      x²
    1             +4    -1    +2     +16     -4     +8      +1     -2      +4
    2             -1    +2    +7      +1     -2     -7      +4    +14     +49
    3             -6    +3    +4     +36    -18    -24      +9    +12     +16
    4             +7    +1    -6     +49     +7    -42      +1     -6     +36
    5             -4    -5    -7     +16    +20    +28     +25    +35     +49
    Sums           0     0     0    +118     +3    -37     +40    +53    +154

If there were more independent variables than u and v, there would need to be more columns, one for the square of each of these variables, and others for all combinations of it with the other variables. A column is also given for the squares of x. When it comes to correlating x with x' to determine the coefficient of multiple correlation, it will be necessary to compute the standard deviation of x, and the necessary squaring of the deviations might as well be done at this time.

(b) Finding the Coefficient of Multiple Correlation (Preliminary).

To determine the x' values, the regression estimates of x, it will be remembered that u and v are to be multiplied by their respective regression coefficients and the associated products added together. This is illustrated in the first part of the following table:

    Number   -.3478u   +1.3509v   Sum = x'     x    Residual       z²       xx'      x'²
                                                    z = x - x'
    1         -1.39     -1.35      -2.74      +2     +4.74       22.47     -5.48     7.51
    2         + .35     +2.70      +3.05      +7     +3.95       15.60    +21.35     9.30
    3         +2.08     +4.05      +6.13      +4     -2.13        4.54    +24.52    37.58
    4         -2.43     +1.35      -1.08      -6     -4.92       24.21     +6.48     1.17
    5         +1.39     -6.75      -5.36      -7     -1.64        2.69    +37.52    28.73
    Sums        0         0          0         0       0         69.51    +84.39    84.29

The table shows the differences between the actual and estimated values of x, x - x', designated by the letter z, and called "residuals". The standard deviation of z is the standard error of estimating x from u and v by the method of least squares. The criterion of least squares is that this standard deviation of z shall be a minimum. The sum of the residuals is naturally zero, since the residuals are differences between two series which sum to zero: x by definition, and x' since it is obtained through multiplying by constants two other series summing to zero by definition, u and v. Since the sum of the residuals (and consequently the mean) is zero, the actual items are deviations from the mean, and the standard deviation may be quickly determined by taking the mean square of them and then finding the root.

The residuals are here mentioned and their computation shown because they become highly important in curvilinear multiple correlation processes, and because the standard error of estimate is the best single measure of how good a forecasting formula, or regression equation, actually is.

The coefficient of multiple correlation, it is remembered, is but the ordinary correlation coefficient of x and x'. To determine it, three other values are needed: the standard deviations of x and x', and the product moment of x with x'. The products of x and x' and the squares of x' are shown in the two last columns of the preceding table. Note that their sums are practically identical. They would be exactly identical if the arithmetic throughout had been carried far enough.
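The estimates, residuals and product sums of the preceding table can be reproduced with a brief sketch. It is offered as an illustration under stated assumptions: the variable names are invented here, and the regression coefficients are taken as given from the text rather than re-derived.

```python
import math

# Deviations from the means, as in the worked example.
u = [4, -1, -6, 7, -4]
v = [-1, 2, 3, 1, -5]
x = [2, 7, 4, -6, -7]
b_u, b_v = -0.3478, 1.3509  # net regression coefficients quoted in the text

# Regression estimates x' and residuals z = x - x'.
x_est = [b_u * ui + b_v * vi for ui, vi in zip(u, v)]
z = [xi - ei for xi, ei in zip(x, x_est)]

sum_z2 = sum(zi * zi for zi in z)                   # sum of squared residuals
sum_xe = sum(xi * ei for xi, ei in zip(x, x_est))   # product sum of x and x'
sum_e2 = sum(ei * ei for ei in x_est)               # sum of squared estimates
sum_x2 = sum(xi * xi for xi in x)                   # sum of squares of x

# Both series are deviations from their means, so the number of
# items cancels out of the correlation coefficient.
R = sum_xe / math.sqrt(sum_e2 * sum_x2)

# Standard error of estimate: root mean square of the residuals.
standard_error = math.sqrt(sum_z2 / len(x))

print(round(R, 2))  # 0.74
```

Because the coefficients are used at only four decimals, `sum_xe` and `sum_e2` agree closely rather than exactly, which is the same effect the text attributes to arithmetic not carried far enough.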
The value of $\Sigma(x^2)$ taken from an earlier table is 154. $r_{xx'}$ is accordingly

$$ r_{xx'} = \frac{84/n}{\sqrt{(84/n)(154/n)}} $$

Since n will cancel out (n is the number of items, 5 in this case),

$$ r_{xx'} = \frac{84}{\sqrt{84 \times 154}} = .74 $$

The coefficient of multiple correlation, in this case .74, usually symbolized by the letter R, is commonly written with subscripts to specify the variables involved, the dependent variable being given first and separated from the others by a dot: thus $R_{x.uv} = 0.74$. This coefficient is always positive in sign and ranges from 0 to 1.00 in value.

Another interesting thing to note at this time is that the sum of the squared residuals, 69.51 in this case, rounded to 70, plus the sum of the squared estimates, equals the sum of the squares of x; i.e., 70 + 84 = 154.

In the preceding pages has been given all that is actually necessary to the working out of a problem in multiple correlation. The description of method has followed the meaning of correlation as closely as possible. In actual practice, short-cuts are used which reduce the labor to a fraction. It is advisable, however, for those who have never worked out a multiple correlation problem before to go through the steps just outlined in order to understand the fundamental nature of the operations. It will also give them a more than healthy respect for the short-cuts described in the remainder of this chapter, and stimulate their desire to master them.

(c) Labor Saving Methods.

There are six important ways of reducing the labor of multiple correlation computations, as follows:

(1) We have assumed hitherto that the original data were first expressed as deviations from their respective means. The process of securing these deviations - often a lengthy one - can be eliminated entirely.

(2) The difficulty of multiplying in securing the extensions of the original data may be greatly reduced by simplifying the original data. This process is called "coding".
(3) In combination with coded data, punch card tabulating equipment may be employed to reduce the number of multiplications to secure any product sum to approximately 20 or 30. This is perhaps the greatest labor saver of all, especially when the number of observations is over a hundred.

(4) The late discovery of early errors, and the labor consequently entailed in repeating all subsequent computations, can be prevented by introducing at the beginning of the process a check sum which will check the arithmetic of all steps through the solution of the normal equations.

(5) Owing to the symmetrical nature of the normal equations, a specific type of solution for them will result in a minimum amount of labor. This is the Doolittle method.

(6) It has been assumed heretofore that it was necessary to compute the values of x' in order to correlate them with x to secure the multiple correlation coefficient. To determine R, this whole process of computing the estimates may be bridged over and eliminated entirely.

Eliminating Deviations from Average. A consideration of the coefficients of the unknowns in the normal equations shows that these are all of the form $\Sigma(u^2)$ or $\Sigma(uv)$, where u and v are deviations from their respective means. If the original series are denoted by capital letters, U and V, the sum of the squared deviations, $\Sigma(u^2)$, may be obtained by squaring and obtaining the sum of the original values, $\Sigma(U^2)$, and subtracting therefrom the product of the mean and the sum of the original series. That is,

$$ \Sigma(u^2) = \Sigma(U^2) - M_u\,\Sigma U $$

If a form were set up it would be something like this:

                 U      U²
                 6      36
                 4      16
                 8      64
                 2       4
    Sums        20     120
    Mean         5
    Subtract mean times sum (5 x 20) ...  100
    Σ(u²) ..............................   20

This gives the same result as actually finding the deviations and summing their squares as below, but with less labor.

                 U      M     U - M = u     u²
                 6      5        +1          1
                 4      5        -1          1
                 8      5        +3          9
                 2      5        -3          9
    Sums        20                0         20
    Mean         5

A corresponding plan secures product sums like $\Sigma(uv)$ without first finding the deviations, by using the formula:

$$ \Sigma(uv) = \Sigma(UV) - M_u\,\Sigma V $$

The product sum of deviations from average may be determined by multiplying the original two series together, summing, and subtracting the product of the mean of either one times the sum of the other original series. A form of computation and its comparison might look like this:

                 U      V     UV                      U - 5 = u   V - 6 = v    uv
                 6      8     48                         +1          +2        +2
                 4      4     16      compared           -1          -2        +2
                 8      3     24        with             +3          -3        -9
                 2      9     18                         -3          +3        -9
    Sums        20     24    106                          0           0       -14
    Means        5      6
    -(5 x 24) or -(6 x 20) ....  -120
    Σ(uv) .....................   -14

Incidentally it should be pointed out that this is an appropriate method for the computation of standard deviations and product moments for simple correlation coefficients.

Following is a form set up for recording the computations of all the sums of squared deviations and product sums needed for a multiple correlation problem with independent variables A, B, ..., N and dependent variable X.

Form for preparation of normal equations

    Explanation          Line    A        B        ...   N        X        Check sum S
    Sums                         ΣA       ΣB       ...   ΣN       ΣX       ΣS
    Means                        Ma       Mb       ...   Mn       Mx       Ms

    Extension of A       a-1     Σ(A²)    Σ(AB)    ...   Σ(AN)    Σ(AX)    Σ(AS)
    Ma times sums        a-2     Ma ΣA    Ma ΣB    ...   Ma ΣN    Ma ΣX    Ma ΣS
    Line (a-1) - (a-2)   a-3     Σ(a²)    Σ(ab)    ...   Σ(an)    Σ(ax)    Σ(as)

    Extension of B       b-1              Σ(B²)    ...   Σ(BN)    Σ(BX)    Σ(BS)
    Mb times sums        b-2              Mb ΣB    ...   Mb ΣN    Mb ΣX    Mb ΣS
    Line (b-1) - (b-2)   b-3              Σ(b²)    ...   Σ(bn)    Σ(bx)    Σ(bs)

    ...

    Extension of N       n-1                             Σ(N²)    Σ(NX)    Σ(NS)
    Mn times sums        n-2                             Mn ΣN    Mn ΣX    Mn ΣS
    Line (n-1) - (n-2)   n-3                             Σ(n²)    Σ(nx)    Σ(ns)

    Extension of X       x-1                                      Σ(X²)    Σ(XS)
    Mx times sums        x-2                                      Mx ΣX    Mx ΣS
    Line (x-1) - (x-2)   x-3                                      Σ(x²)    Σ(xs)

This form is explained in detail in the previously mentioned publications, but is really fairly self-explanatory in the light of what has just been discussed. When each space has been filled in with the data designated by the symbols, the various values for the normal equations have been determined without recourse to the actual computation of deviations from average. The meaning of the check sum column will be discussed later.

Coding. Even when product sums and summed squares are computed by the method outlined in the preceding section, there is ordinarily a great deal of tedious multiplication work to be accomplished. To obtain $\Sigma(uv)$, for example, it is necessary to multiply each value of U by the corresponding value of V. This multiplication labor can often be reduced by "coding" the original values. Thus, if the values of U range from 1024 to 1088, from each figure may be subtracted 1020; the series would then range from 4 to 68. The results as far as the correlation is concerned are identical, for correlation is concerned with variation from average value. Adding or subtracting a constant from a series in no wise affects its correlation with or regression on another series. The correlation and regression coefficients are unchanged by the process.

Following the same principle, the series can be still further reduced by dividing by 3 and dropping the fractions. The series will then range from 1 to 22. This will not affect the correlation coefficient materially, but the regression coefficient will be raised three times, which must be remembered when the final results are "de-coded" and prepared for presentation. Dropping the fractions introduces an error which, when the divisor is 3, may be as much as 1 and 1/2 units of the original series, or .5 of a unit of the coded series; but ordinarily if the coding is such as to include nine-tenths of the original items in a range of 20 to 30 coded units, no significant differences will result.
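The short-cut formula for product sums and the effect of coding can both be tried out in a few lines. This is a sketch under stated assumptions: the function names are invented for the illustration, and the five-item series `U_big` and `Y` are hypothetical values chosen only to span the 1024 to 1088 range mentioned in the text.

```python
import math


def dev_product_sum(a, b):
    """Sum of products of deviations from the means, obtained by the
    short-cut  sum(A*B) - mean(A) * sum(B), without forming deviations."""
    mean_a = sum(a) / len(a)
    return sum(p * q for p, q in zip(a, b)) - mean_a * sum(b)


def correlation(a, b):
    """Simple correlation coefficient from the short-cut sums."""
    return dev_product_sum(a, b) / math.sqrt(
        dev_product_sum(a, a) * dev_product_sum(b, b))


def regression_slope(y, a):
    """Simple regression coefficient of y on a."""
    return dev_product_sum(a, y) / dev_product_sum(a, a)


# The small series used in the forms above.
U = [6, 4, 8, 2]
V = [8, 4, 3, 9]
print(dev_product_sum(U, U), dev_product_sum(U, V))  # 20.0 -14.0

# Hypothetical data for the coding illustration (assumed values).
U_big = [1024, 1039, 1051, 1066, 1088]
Y = [12, 15, 14, 19, 23]

shifted = [val - 1020 for val in U_big]  # subtract a constant: 4 to 68
coded = [val // 3 for val in shifted]    # divide by 3, drop fractions: 1 to 22

r_original = correlation(U_big, Y)
r_shifted = correlation(shifted, Y)      # identical apart from float rounding
r_coded = correlation(coded, Y)          # nearly, but not exactly, the same

slope_original = regression_slope(Y, U_big)
slope_coded = regression_slope(Y, coded)  # roughly three times as large
```

Subtracting the constant leaves both coefficients untouched, while dividing by 3 and dropping fractions changes the correlation only slightly and multiplies the regression coefficient by about 3, so the coded slope must be divided by 3 again when the results are de-coded.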
Hence the simplification of series by coding prior to correlating will introduce significant errors only when there is either high intercorrelation between the independents or an exceedingly high multiple correlation coefficient. Coding is therefore recommended as one of the greatest labor savers in multiple correlation work. It speeds up the work tremendously when punch card equipment is employed as described in the next section; but it also helps greatly where computing machines alone are employed. It is indispensable in those laboratories where no multiplying machines are used, the clerks quickly learning their multiplication tables up to 20 or 30.

When the series have all been coded, correlation processes proceed precisely as they would were the original series employed. The fact that they have been coded may be forgotten entirely until the very last thing, when the results are "de-coded" for presentation purposes.

Punch-Card Equipment. Punch-card equipment may be used to expedite the computation of the values to put in the normal equations; specifically, to secure the figures that go on the lines of designation ending in "1" in the form given earlier. Such equipment begins to save time as soon as there are a hundred observations, and saves it at an increasing rate as the number of observations increases. One card for each observation is punched with the coded data for the several variables in their designated locations on the card. The principle involved in this reduction of labor is quite simple. Suppose that we are attempting to secure the sum of the squares of the A items. Undoubtedly there is a large number of items in the A series with like values. It is easier to square a given value once and then multiply it by the number of times it appears, than to square it over and over again and add the results. This is what the punch card enables us to do. The cards are sorted in the sorting machine into as many packs as there are values of A.
There will be, say, 30 packs if the data have been coded. In the first pack, the value of A on each card will be 1, in the second 2, etc. We have but to square the value of A and multiply it by the number of cards in the pack, for each pack, and then add the products so secured, to obtain the sum of the squares of the A values. Or more simply yet, the cards in each pack may be taken over to the tabulating machine and the A values summed, and the sum so secured multiplied by the value of A for the given pack to get the same set of products to be added together.

While the cards are in packs of constant A value, we may also add the B values for each pack. These sums are then multiplied by the associated A value and the products added together to give $\Sigma(AB)$. This process is but the grouping together of all the B values which have a constant value of A associated with them; and then, instead of multiplying each B value separately by the A value and adding, the B values are added together first and then multiplied by the given A value - a perfectly legitimate process, as a moment's consideration will show.

By this same process, all the extensions can be quickly accomplished. The labor of weeks can be reduced to a matter of hours. Anyone who undertakes this type of task will have little difficulty in working out convenient forms and systematic procedures; or one may refer to the publication mentioned earlier in this chapter and there find the forms already prepared.

The Check Sum. After the data have been coded and listed down in columns, side by side, a final column should be added, headed "Check Sum", or "S". The value listed in this column on any particular line is the sum of the independent values and the dependent which are listed on the same line. For example, in the illustration below, 21 = 6 + 8 + 7.

                 U      V      X     Check sum S
                 6      8      7         21
                 4      4      6         14
                 8      3      8         19
                 2      9      2         13
    Sum         20     24     23         67
    Means        5      6     5.8       16.8
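The check sum column of the illustration lends itself to a short sketch. The helper name `product_sum` is an assumption made for the example; the figures are those of the table above, and the two checks shown are the addition check and the check on the extensions of U that the discussion of the check sum goes on to describe.

```python
# Coded data from the illustration.
U = [6, 4, 8, 2]
V = [8, 4, 3, 9]
X = [7, 6, 8, 2]

# Check sum: the sum of all the variables standing on the same line.
S = [a + b + c for a, b, c in zip(U, V, X)]
print(S)  # [21, 14, 19, 13]


def product_sum(a, b):
    """Sum of the products of corresponding values of two series."""
    return sum(p * q for p, q in zip(a, b))


# Check on the addition: the column sums must add to the sum of S.
addition_ok = sum(U) + sum(V) + sum(X) == sum(S)  # 20 + 24 + 23 == 67

# Check on the extensions of U: the extensions of U by itself and by
# the other variables must add to the extension of U by the check sum.
extensions_of_u = [product_sum(U, U), product_sum(U, V), product_sum(U, X)]
extension_ok = sum(extensions_of_u) == product_sum(U, S)

print(addition_ok, extension_ok)  # True True
```

A slip in any single multiplication or addition would make one of these comparisons fail, which is how the check sum localizes an error before later steps are built on it.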
From the time of its first listing onward, the check sum is treated as though it were one of the variables in the problem, except that its standard deviation need not be computed. The first function of the check sum is to check the addition of the series, for the sum of the sums of the series should equal the sum of the check sums: 20 + 24 + 23 = 67. The next function is to check the computation of the means: 5 + 6 + 5.8 = 16.8. If the sums, however, are divided by the wrong number of items, the check sum will not show it. Care should therefore be exercised at this point.

The next function of the check sum is to check all the extensions. This is done by multiplying U, for example, by the check sum when it is multiplied by itself and the other variables, as in the following:

                 U     V     X     S       U²     UV     UX     US       V²     VX     VS     etc.
                 6     8     7    21       36     48     42    126       64     56    168
                 4     4     6    14       16     16     24     56       16     24     56
                 8     3     8    19       64     24     64    152        9     24     57
                 2     9     2    13        4     18      4     26       81     18    117
    Sums                                  120    106    134    360      170    122    398

Then the sum of the products of any U value times itself and the corresponding values of the other variables is equal to U times the check sum, for this sum is only the sum of these very variables. Thus, 36 + 48 + 42 = 126. When the extensions are added, the totals should check, therefore, and the extensions are thus proved: 120 + 106 + 134 = 360.

When V is extended it is necessary to go back and pick up a previous extension to check to the VS total, namely, $\Sigma(UV)$. Thus 106 + 170 + 122 = 398; $\Sigma(UV) + \Sigma(V^2) + \Sigma(VX) = \Sigma(VS)$.

... necessary in handling all historical materials, it is obvious that the more recent the material the less difficult and technical are the problems of external criticism, or for that matter of internal criticism.
Compare, for instance, the problem of a historian dealing with sources of the period from 1850 to 1860 with the task of a historian who is trying to ascertain the facts of the period from 1250 to 1260. For the latter period his sources will be comparatively few, they will be reproduced by hand rather than printed, and probably in medieval Latin. They must be carefully compared with other known documents to determine genuineness. A minute knowledge of paleography is necessary to determine authorship, and a comprehensive knowledge of the literature and conditions of the period in order to determine place of composition and circumstances affecting the attitude of the author. An expert knowledge of contemporary documents may be requisite to determine dependence or independence. The question of the real date of composition, as distinguished from the nominal date, may need to be determined by a minute application of the principles of philology.

It is obvious that much of this will be unnecessary in the handling of recent historical data. "The technical apprenticeship is relatively short and easy for those who occupy themselves with modern or contemporary history, long and laborious for those who occupy themselves with ancient and medieval history."* Sometimes there is even a tendency of historians to modify the relative importance of criticism as compared with other methods of historical research. "It is quite possible, whatever may be said, to have the historical sense in full measure without having ever, both literally and figuratively, wiped away the dust from original documents - that is, without having discovered and restored them for oneself."** In fact the synthetical historian may well afford to leave the tasks of external criticism to the specialist in that field.

* Langlois, Charles V. and Seignobos, Charles, Introduction to Historical Study. Translated by Berry (London, 1912), p. 55.
** Ibid., p. 114.
The two types of work frequently require different temperaments, aptitudes, and training not often combined in the same individual. Critical scholarship, for instance, "may come to lack perspective and be characterized by excessive preoccupation with little things."* This may lead to hypercriticism, fear of publishing perfectly valid data, and a tendency to lose sight of the large historical objectives. The synthetical historian, especially if he be dealing with a recent period, is likely to find the genuineness and localization of his sources already determined in the great majority of cases, and therefore can spare himself much of the drudgery of external criticism; he cannot avoid the obligation of concerning himself with such critical tasks as determining dependence or independence, the author's facility for making observations, absence of bias, the ability of the author to observe and report accurately, and the relation of various sources to the establishment of the facts.

While a long step forward has been taken when historical facts are established, there yet remains the important task of determining the meaning of the facts, that is, their relationship to one another. This task, together with the presentation of the facts, is the field of historical synthesis.**

Relationships, of course, may themselves be questions of fact. For instance, the occurrence of a great flood in a given year may be established as a fact. The subsequent rise of price of a given product grown in that area is another fact. The responsibility of the flood for the rise of price is a relationship which may be established as a fact. This might involve the assembling of many other facts, and the establishment of other relationships negative or positive.
Students of price analysis will appreciate how many other factors might have to be allowed for, such as total size of crop, other factors affecting it, carry-over, demand conditions, etc., before the influence of the flood on price could be determined, and particularly before the extent of influence may be measured.

In the great majority of cases, however, relationships cannot be incontestably proved - that is, established as facts. At best, we can determine a probable qualitative relationship, and an argument may be built up to support this probability. Quantitative relationships may be established having a greater degree of probability provided the raw materials are available on all the different factors that must be allowed for in establishing relationships. However, the farther we go into the past the more it is likely to be necessary to depend on inference from partially known facts. It is most important that the historical student discriminate between definitely proved and only partly proved relationships, and particularly mere conjectures and working hypotheses. Failure to be thus discriminating has resulted in much bad, though possibly entertaining, history.

* Ibid., p. 129.
** It is obvious that what the historians call synthesis includes a combination of analysis, as this term is used elsewhere in the handbook, and synthesis.

Methods of logic, of course, are available as a means of testing the process of reasoning, but when quantitative measurements are not applicable the relative weight and importance of various facts in a relationship must depend largely on opinion. This fact, as well as the continual discovery of new materials, is responsible for the necessity of continually rewriting history.

There are almost innumerable kinds of historical syntheses. Many of the earliest were mere chronologies relating events in the order of occurrence.
Even in these crude chronologies there was evidently a very large element of selection, since all the experiences and happenings of human life were not included. As already indicated, many of the earlier syntheses constituted selections of important military and political events. Sometimes a dynasty or a prominent individual was the center of interest.

A very common synthetic unit even in modern historical writing is geographic, as, for instance, a nation, a State, a county or a city. Usually this geographic unit may be further defined by limiting the synthesis to a period. The synthesis may include many topical phases of the geographic unit and period selected, such as political, military, diplomatic, economic, scientific, literary, and social history, or it may be limited to one phase, as for instance, economic life, or to a still narrower topic, such as the economic history of agriculture, or even of a particular crop or kind of livestock.

It follows from this that a synthesis may be wide or narrow in scope. It may be world-wide (Weltgeschichte), or nation-wide, or it may be limited to a township; it may comprise a brief period or a long one; it may be limited to a single phase of human life or comprise as many phases as the historian considers significant. Both types have their place. The broader and longer syntheses permit the determination of the wider relationships and sequences. The syntheses of narrower scope make possible the consideration of many facts and the determination of many of the more minute relationships that would necessarily be omitted in the former type.

The selection of a topic will depend not only on the interest but also on the experience and aptitudes of the worker, for the broader syntheses, other things equal, are the most difficult.
It is well also to consider carefully whether the subject needs investigation because it has not hitherto been investigated or has been investigated inadequately, because new facts have become available, or perhaps because it appears that a new and more illuminating historical synthesis may be achieved.

It is obvious that the nature of the historical problem and the methods employed will differ greatly in accordance with the scope of the synthesis. For instance, there will be a difference in the completeness with which available sources are consulted and in the extent of dependence on secondary sources. No man in a lifetime could consult all the sources of American history. It is doubtful if he could consult all the available sources for even a limited period such as the quarter of a century following the Civil War. For such a period it is not a question of consulting all the sources, for in a sense every private letter, every pamphlet, every newspaper, and every government document is a potential source of more or less significance. Clearly the problem that presents itself is one of selection, not only of the worthwhile sources, but also of the worthwhile facts within the sources.

Furthermore, in the case of the wider syntheses the historian will be compelled to rely to a considerable extent on sound secondary sources, particularly scholarly monographs of limited scope. It is true, no historian can feel perfectly safe in adopting the results of another's work, "but so complex a science as history, where facts must ordinarily be accumulated by the million before it is possible to formulate conclusions, cannot be built up on this principle of continually beginning afresh.
In order that science may advance it is necessary to combine the results of thousands of detail researches."* It has been well said that we should be careful to avoid the fault of those who "heed only the unprinted and scorn the rest, resembling those tourists who have no cease until they have got access to some private gallery, but merely glance at the public ones; there the best pictures may happen to be."** Nevertheless, he who employs secondary sources is under even greater obligation to apply critical methods than in the case of primary sources.

In every historical synthesis there are certain facts that are required as background; they do not constitute what may be called the "contribution" or essential core of the synthesis. For such facts particularly, it may not be necessary to delve into original sources if sound secondary sources are available.

The more essential problems of historical synthesis have been summarized as follows:*** (1) to set bounds to the subject; (2) to divide it into periods; (3) to decide what facts are to form part of the synthesis and what are to be discarded; (4) to determine what causal connection exists between the parts; (5) to decide what change has been effected by the historic action; (6) to determine what parts shall be emphasized and what touched upon but lightly; (7) to determine how many data shall be used for the sake of color to produce verisimilitude in the reproduction of the past.

* Langlois and Seignobos, Introduction, etc., page 280. Cf. also Fling, F. M., The Writing of History, page 88.
** Jusserand in The Writing of History, p. 19.
*** Fling, F. M., The Writing of History, p. 127.

The scope of this article does not admit of a detailed discussion of these and other problems of historical synthesis.
But it would not be proper to close this discussion without a word on that very essential phase of synthesis, namely, the adequate exposition of results. There is a perennial conflict of opinion as to the extent to which historical writing can be sound and still be so attractive as to be widely read. The growing unpopularity of history as literature for general readers recently led the American Historical Association to appoint a committee to inquire into the causes of the tendency. In its report the committee took an optimistic view of the possibilities of improvement in this regard, and the Chairman, Ambassador Jusserand, expressed the view that "an historian who uses so dull a style that he will not be read is as useless as a painter who should use invisible colors. He is, moreover, sure not to do justice to realities, thus swerving from truths, for realities are not dull."*

There is obviously no reason for slovenly English or crude methods of expression in presenting the results of historical investigation. It does not follow, however, that conformity to high standards in these regards, while still observing scientific standards, will result in a "popular" product. It is well to have popular histories provided they are accurate, but to present adequately the results of original investigation necessitates exhibiting, for critical examination by others, the essential scaffolding on which the conclusions rest. This involves a generous use of footnotes, and the inclusion in the text of many facts which, introduced for purposes of confirmation, may prove dull reading for all but critical students concerned with the establishment of historical truth.
Professor Fling has well said, "It is no more reasonable to expect that all historical work should be written for the general reader than that all works on natural science should be accessible to the same class of readers."**

* Jusserand and others, The Writing of History, page 5.
** The Writing of History, p. 158.

PART FOUR

THE PRESENTATION AND UTILIZATION OF RESULTS.

A. The Use of Tables for Presenting Results.

Well set-up tables are an effective means of presenting large masses of information in small space. If properly constructed, they reveal on their faces the relationships which have been developed by the analysis. A well designed table will convey far more information than can be given in an equal space of text. Tables are, in addition, the only satisfactory way of presenting the details of data which fall naturally or otherwise into classes of pretty much the same kind of thing.

The form and construction of a table depend in part upon the purpose it is to serve. Tables may roughly be classified into:

(a) Storage tables, in which large masses of data are arranged in convenient form for reference or to be worked from. The large tables in the body of the census volumes are examples of such use of tables; also the tables in appendices of bulletins.

(b) Analytical or demonstration tables, whose purpose is to reveal relationships that have been developed and provide the principal supporting data. Most tables in the main body of bulletins, and the tables in the introductory portions of the census volumes, are examples of this use of tables.

(c) Display tables, serving very nearly the same purpose as cartograms.

The two latter types of tables have a definite story to tell, the demonstration table emphasizing the clearness and accuracy of the presentation of it, the display tables emphasizing the effectiveness and attention-value of the presentation.
The titles and headings of both should effectively suggest the story of the table; those of demonstration tables, the exact content in addition.

The textbooks in statistics nearly all present the technique of table-making. It has therefore not been deemed necessary to discuss the subject systematically in the handbook. Instead, Mr. S. W. Mendum, who for five long years has been editing tables prepared in the Bureau of Agricultural Economics, has been asked to summarize his conclusions as to procedure and practice in preparing tables for publication. His discussion refers to storage and demonstration tables, not to display tables.

Preparation of Tables for Publication.*

General comments. Tables should tell rather definite stories. The story to be told dictates the form and complexity of the table, and the relation of the tabular matter to the text in some measure affects the minor details. The form of the statistical tables is more or less constricted by the information available, the anticipated needs of users, and the size of the medium of publication.

Tables should be as simple as possible, but simplicity is often very illogically defined. Only so much information should be thrown into a table as is definitely related to the point to be made. It is a mistake to load anything into a table because there is space on the sheet for more lines or more columns, and it is still more of a mistake to leave out data essential to proper understanding of the story the table is expected to tell. If a table has very few figures, it probably would be best to cast it into text.

The simplicity of a table is more clearly measured by the care taken in its preparation than by the amount of white space it shows. The blackest looking table may be of the simplest type, because it tells its story without possibility of confusion. Thus the worst appearing tables in the Yearbook of Agriculture are the tables of temperature (and like tables); yet there is none more simple.
Many another relatively open table is much more complex.

The two greatest weaknesses of tables as submitted for publication are their order of arrangement of the data and their use of symbols and abbreviations. The most natural and convenient procedure in assembling the data needed for a table may not be at all satisfactory for the presentation. It is too much to expect that the reader will be versed enough in the matter presented to be able to interpret exactly and promptly all the abbreviations and short-cut expressions and other symbols an author may find it convenient to use on his work-sheets. These have a habit of persisting in the printed publication.

*By S. W. Mendum, Bureau of Agricultural Economics.

Procedure in Table Construction

No definite statements can be made as to the best procedure always to follow in setting up a table; but there are a number of rules which, if used with discretion, will help considerably:

1. Plan the tables for the specific purpose in hand. Do not expect to use the same arrangements as on the work-sheets. The nature of research is such that the order of doing work is often quite different from the most effective order of presentation.

2. Pay more attention to the relationships of the matter in a table than to the form and size. Later on some one will raise the troublesome questions incidental to printing large-size tables. In general the printer can and will print acceptably any table requested.

3. Test each figure. Tables are made up of columns and lines duly labelled. The reader has the right to expect and assume that any figure in the table is correctly described by both the boxheading at the top of the column in which it stands, and the item in the reading column, or stub, of the line on which it stands. Figures that do not meet this test exactly should be individually further described or limited by footnotes.
If the footnotes become too numerous for convenience, it is likely that the sorting has not been so close as would be desirable and more lines or more columns, perhaps more tables, are indicated. If this test is applied figure by figure, conscientiously, fundamental faults in tabular matter will be avoided in the early stages of the work. The wisdom of the groupings is likely to be challenged by the first half-dozen figures tabulated, or if the groupings decided on are correct, the descriptions may need to be changed. The accuracy of computed or derived figures must be unquestionable.

4. Test the symbols and abbreviations. The use of unfamiliar symbols is confusing; the bad effect on readers often offsets the savings achieved through condensation. There are, however, many practices which may well be standardized so that they will have the effect of familiar symbols and thus relieve tabular matter of all unnecessary effort on the reader's part.

5. Do not try to make your table so all-comprehensive that one table will serve a dozen purposes. Such tables will satisfy no one in particular and will draw more comments unfavorable to your judgment than they will excite praise for your cleverness.

6. Think twice before you use a table after it is made. This is especially true of the most simple and the most complex. If the main object is illustrative or argumentative, it is quite possible that a text statement will serve much better.

As in a graph, the independent variable is always plotted with respect to the horizontal or x-axis, the dependent variable with respect to the vertical or y-axis, so in tables the results are most satisfactory when the independent variables, the items described or measured, are listed in the reading column or stub, and the description, basis of measurement, or characteristic on which sorting is made, is placed in the column headings and title.
Long narrow tables (one or two columns of figures) can be doubled up for economical printing, and wide tables (of few lines) can be broken over without loss of effectiveness, the stub being repeated.

There is somewhat more advantage in uniformity than appears at first glance. Individual tables may be as serviceable built horizontally as though built vertically, but miscellaneous shifting of form in a publication without abundant reason is a weakness that should be avoided.

In a publication containing numbered tables, it is usually best not to try to insert unnumbered or "text" tables. Data for such tables may be presented as text or as part of an appropriate numbered table to which reference may be made.

The adequacy of the title to the table, or the difficulty of preparing such a title, is a serviceable indicator of faults in construction of tables, for the title should describe the table completely and accurately, and if, together with the column and stub headings, it does not do so readily, probably the table needs replanning.

Titles deserve more attention than they usually get from authors or editors. Some authors seem to be satisfied with a general topical title. Titles should be brief, but the additional requirement that they be also clear precludes general use of the topical title alone. The key word or phrase followed by a colon makes a good start. Then the specifications of the table can be enumerated as far as necessary. Such titles may seem unnecessarily stiff and formal at times, especially when there is a long series of tables on the same subject. Judgment and moderation are needed in this as in all things.

Where the general subject is mechanically subdivided by sorting and grouping, the attention and interest of the reader become directed more toward one of the specifications or factors in the table than to the general subject.
Thus, in a publication dealing with crop production in the South, the titles to tables dealing with cotton alone might well begin with the word "cotton", as "Cotton: acreage under cultivation July 1, by states, 1913-1926." If all ten cotton tables were assembled in one section, however, the title to the same table might properly read "Acreage of cotton under cultivation July 1, by states, 1913-1926." The more complicated the tables must be, the more urgent the formal plan becomes, and then uniformity of appearance dictates that the simple table also be cast in similar mold.

Avoid subtitles as far as possible. If the information is necessary, it should be in the title; if not, it should be cut out, as it is likely to be lost anyhow. The words "weekly," "monthly" and "yearly" can often be dispensed with to advantage. Use of capital letters to indicate importance of the word or phrase capitalized, as a substitute for care in writing a title, is a confession of weakness.

Dates indicating periods covered should usually be placed at the end of the title.

For the printer's convenience, tables occupying more than one sheet should bear the notation "Continued" in the lower right-hand corner, and again at the end of the title at the top of the succeeding sheets.

Dates, especially those designating periods covered by averages and usually hyphenated, are the bane of the careful statistician. He cannot tell whether they are inclusive or not. Uniformity of practice is more important than any given practice. The practice which will help most is a general agreement that the hyphen or "to" between dates shall always mean that the dates shown are included. Thus "1909-1913" should be used as covering the five full calendar years beginning Jan. 1, 1909, to and including Dec. 31, 1913.
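The inclusive reading of a date range can be stated mechanically. The following is a minimal sketch in Python, not part of the handbook; the function name is illustrative, and it assumes labels written with four-digit years joined by a hyphen or by "to".

```python
def years_covered(label: str) -> list[int]:
    """List every calendar year named by an inclusive label such as '1909-1913'."""
    # Treat "1909 to 1913" and "1909-1913" alike (assumed label formats).
    first, last = (int(part) for part in label.replace(" to ", "-").split("-"))
    if last < first:
        # An abbreviated label such as '1909-13' lands here and is rejected.
        raise ValueError(f"ambiguous or reversed range: {label!r}")
    # Both end years are included, so the range runs to last + 1.
    return list(range(first, last + 1))


print(years_covered("1909-1913"))
```

Under this convention "1909-1913" names five years, not four, which is the count the reader needs when checking a five-year average.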
An average made up of figures beginning as of July 1 should be identified as fiscal years, or as "Years beginning July 1," or as "Years ending June 30," or "July, 1909 - June, 1913." An average of five crop-years beginning July 1, 1909 would be designated as the "Five-year average, 1909-10 to 1913-14," using the word "to", still in the inclusive sense, in preference to the hyphen. The form "1909-13" should be discarded for a five-year period.

Every table should have a date or dates somewhere.

Size groupings in the same series should also be always mutually exclusive.

Reading columns or stubs usually give little trouble. The items must be definite concepts or the table will be confused and obscure in use. Some classification may be given effect by progressive indentations (each subsequent class indented the same distance) and by center headings in one or more sizes of type. Horizontal rules do not cross reading columns in the best practice. (At times two or more tables with the same boxheadings may be assembled on the same page without repeating boxes when space is a consideration.)

Reading columns may or may not include the units to which the figures to the right apply. Such inclusion is particularly serviceable when the table shows prices, costs, or values in dollars for a variety of items. Even then, a column showing designated units is preferable.

Box headings are the author's means of indicating relationships. As the table is read from left to right, the earliest dates in time series should be on the left. Distributions are indicated by subdivision of the box from top to bottom. Three or four sub-groupings is about the limit before the question of different tables or different statement of the title arises.

Box headings must usually be short, but they must be definite or made definite by annotation. The reader expects
to find in the column only those items which are accurately described by the contents of the boxheadings above it. Readers do not refer to the box head, making it a part of the reading of the figures, often enough. There is ground for the opinion that they have found the practice of so little help that they have got out of the habit.

A practice desirable on the whole, though awkward at times, is the placing of the unit designation above the figures in each and every column. The box heads are relieved of the burden of carrying the unit; but the chief advantage is that with the unit so placed there never can be any doubt as to the unit. It serves the writer as a constant warning not to mix units. When the units are all dollars or all acres or counts it is hard to see why the unit designation is needed other than in the title. Uniform practice is the only reason for it. Such words do not often take space needed. The conventional dollar sign ($) does, of course. Units should be printed in italics. Punctuation should be limited to the essential. Box head designations for marking columns should be in the singular number.

Totals and averages logically appear at the bottom or right of the detail which they summarize. Sometimes totals are placed at the top of the figure columns, or sections of them, but the net gain is negligible. Totals are placed before averages. In a summary line containing totals and averages, the situation is amply covered by the use of "total" only in the reading column, especially in those cases where the box above the average indicates a column devoted to averages.

There is disagreement in the matter of checking tables, especially tables including derived figures or series of related tables. The scientist is tempted to record the individual item correctly and let the total come out as it may. The practical man raises the question of how he is to tell "what is what" when discrepancies are apparent.
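A common source of such discrepancies is rounding: detail figures rounded one at a time need not add to the rounded total. The following is a minimal sketch in Python of a check that flags the case. The function names are illustrative and not from the handbook, and rounding to one decimal place with halves rounded up is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

TENTH = Decimal("0.1")


def rounded_percentages(counts):
    """Express each count as a percentage of the total, rounded to one decimal."""
    total = Decimal(sum(counts))
    return [
        (Decimal(c) * 100 / total).quantize(TENTH, rounding=ROUND_HALF_UP)
        for c in counts
    ]


def discrepancy(counts):
    """Amount by which the rounded detail falls short of (or exceeds) 100.0."""
    return Decimal("100.0") - sum(rounded_percentages(counts))


# Three equal items each round to 33.3, so the detail as shown adds to 99.9.
print(rounded_percentages([1, 1, 1]), discrepancy([1, 1, 1]))
```

A check of this kind does not settle which figure to adjust; it only tells the author where a footnote or an adjustment is called for.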
Two total figures of the same description should be identical numbers even if twenty tables apart. The total percentage must theoretically be 100.0 and the detail must add to that figure. Averages must be obtainable by simple division of items shown, or definite warning must be given by statement as to how they have been computed. Totals of rounded figures must be the sum of the detail as shown. An average percentage is computed from averages of totals, not from the total of percentages. Averages should not be worked from incomplete series; a yearly average should not be worked from 11 months' data. Averages of derived figures should be computed in the same manner as the derived figures thus summarized.

For American readers, foreign money (prices and values) should be converted into domestic equivalents.

Footnotes permit much wider use of some material of doubtful character than would otherwise be proper, and they prevent many mistakes of interpretation. Footnotes should be used as liberally as needed. Numbers to refer to footnotes, rather than symbols or letters, are practically necessary for all but the most simple tables. Footnotes run consecutively from left to right and down the page.

An unnumbered footnote giving the source and other general information about the table is often desirable. Such should appear at the bottom but before numbered footnotes.

Columns of figures subject to regular revision may be indicated by the word "preliminary" in the box-head or by footnote. Individual figures in a column questionable on any score should be covered by a specific footnote.

Totals and averages are usually to be footnoted if and whenever any of the component figures are definitely footnoted.

If data in a table are not quoted as given in the sources, state exactly how they were obtained. Numbered footnotes should be used for sources of figures in several different columns of a table. Sources should be so definitely described that one using the data
may retrace every step taken. Every citation should be so specific that the particular data may be readily identified. A blanket form of reference is permissible only when the table cited appears in the same form continuously from year to year.

Arrangement of items is often troublesome. The alphabetic order is always safe and definite. For statistics by states, the order used by the Bureau of the Census has much to recommend it, whether in tables of all the states or for any large section. Order of importance on any single factor is a poor makeshift for careful text statement, and gives grief more often than it points an argument. Any arrangement by size is likely to be upset at another period. Arrangements adopted should usually be maintained in a publication.

Use a zero in tables if and when zero describes the known situation. When there are no data, use blank space or "leaders".

A handbook of rules prepared for guidance of clerks in the Division of Statistical and Historical Research and largely used in the Bureau of Agricultural Economics may be obtained from the Bureau.

Graphic Presentation.*

Graphic presentation, in the portrayal of statistics and in statistical analysis, has become an important instrument in the field of economics and business in recent years. When properly used, it is a means of so digesting and interpreting masses of statistical material as to bring out features of data that might otherwise go unnoticed. The mind does not hold bare figures easily, and relationships between many figures are not easily and quickly discovered. Graphic presentation tends to make data attractive and more easily interpreted and therefore more instructive.

Graphic presentation is used in analysis. It usually follows tabular presentation and arithmetic computation. As it is used as a means of clarifying data, it should present them in such a way that the important points are easily discerned by the casual or average reader.
Graphic presentation is not a substitute for accurate data. A chart cannot be more accurate than the data from which it is made. In fact, untruthful charts can be made from true data if wrong methods are used in portrayal.

In the following discussion, it is assumed that the reader is familiar with the general principles of graphics which are found in the books dealing with the subject. It is hoped here merely to emphasize a few of the principles vital to good presentation, to show how these principles can be applied to specific cases, and to offer a few guides to the maker of charts and graphs in order that he can make his work as useful as possible with a minimum of time and expense. As costs of good graphs and charts, and even of their reproduction, are rather high, the resulting prints in a publication must be of the greatest use and value to justify the expenditure.

The fundamental principle in all graphic presentation is simplicity. A complicated or confusing chart will defeat its own purpose and may actually be misleading. The eye can evaluate to an accurate degree only a very few features of a chart at a time. The design of a chart should take into account the limitations of the eye.

Of the three methods of portraying values (length, area, and volume), length is by far the most easily gauged by the eye. All values should be shown by this one dimension if possible. Area, involving two dimensions, is evaluated much less accurately by the eye; and volume, involving three dimensions, should not be used. Angles and ratios are ordinarily evaluated less accurately than is length, but they are moderately reliable.

*By R. G. Hainsworth and Thew D. Johnson, Bureau of Agricultural Economics.

Selection of Illustrative Material

The author of a publication should select the illustrative material with a view to portraying only the main points of his discussion. It is possible to over-illustrate.
Unnecessary maps, graphs, and photographs are a decided hindrance to the effectiveness of those which are essential. If too many illustrations are used the continuity of the reading matter is destroyed and the expense becomes great.

A complicated or involved table should not be selected for illustration, for even if a graph could be made to portray the data it would be just as difficult to understand as would the original table. The average reader will not study an illustration unless the main points are discernible at a glance. Simple tabulations and relationships make simple charts. This principle applies to photographs as well as graphs and charts. An illustration which requires much explanation to be understood can well be eliminated.

It is well to make a small layout of a proposed chart or map to determine whether the data and method used show exactly the points to be brought out. The layout may well be subjected to the criticism of a disinterested person, because a disinterested reader will eventually read and evaluate the illustration in the final publication. The author who has a complete knowledge of all phases of the subject matter may draw conclusions from an illustration which are not evident to the reader. If any doubt exists, the illustration can well be discarded.

Line and Bar Charts

Time Series

The most important line on a line chart is the one that represents the data. All other necessary lines should be in the background, to guide the reader in his interpretation. Line identifications, grid lines, and all lettering should appear to the eye as of secondary importance (see figures 1, 2 and 3).

Time series line and bar charts are usually plotted with the time element, the independent variable, running horizontally from left to right. Short series bar charts are an exception. When space does not permit the indication of each time unit, alternate units, or every fifth or tenth unit, can be shown.
A maze of grid lines is a distinct handicap to the effectiveness of a chart. The fewer the grid lines on a chart, consistent with adequate interpretation of a line or bar chart, the more effective the chart.

The relative importance of a curve in a line chart should be in accordance with the importance of the data from which it is made. Usually a solid line indicates the most important data; other designs of lines are used for other data.

The curves in a line chart should be limited to those that can be easily and accurately followed. A maze of lines is merely confusing. If the purpose of the chart is to show the same general trend of many series of data, all the series may be put in one chart, with the thought that the trend is of importance and the variations of the individual series are merely secondary. As a general rule, if there are more than two lines in a line chart, they should appear on different areas of the grid. If they cross and recross they become confusing.

All data in a line chart are plotted in direct relation to a base line (ratio charts excepted). This base line is next to the data line in importance and should be so indicated on the chart. To discard a base line on a chart may lead to distortion of the data and to false conclusions, except in instances in which the distance between the zero base and the curves is not the significant factor in the portrayal (as in trend lines, etc.). In such a case the zero base can be omitted and the bottom of the grid indicated by a light uneven line, below the line that clears the lowest point of the curve.

Space allotted to a time interval should be uniform throughout the chart unless otherwise indicated definitely by a broken grid.

When plotting two or more series of data having different units (such as dollars and bushels), effort should be made to use scale units which will give curve variations of somewhat the same magnitude.
True relationship can be easily hidden if distorted scale intervals are used. If a large scale interval is used for one series and a small interval is used for another series, the chart will tend to distort the data.

Lettering in a vertical position on any chart should be avoided. Legends for curves should be so placed as not to interfere with the visualization of the curve.

The base line in a bar chart is even more important than in a line chart and should be made more emphatic in the bar chart. A bar chart visualizes the actual magnitude of a value or quantity by the length of a bar; except in very rare cases, the base line by which the length is measured should always be made wide enough to emphasize it.

The number of values or quantities to be plotted within a given space determines the width of bars to be used. Usually the best appearance is obtained by having the bars and the spaces between the bars of equal width.

Solid, hatched, or shaded bars are more effective than hollow bars.

General

These principles, applying to line and bar charts showing time series, are generally applicable to all other bar and line charts.

As a rule, bar charts of data other than time series are made with horizontal bars to conserve space and give room for lettering the bar designations.

Lettering or numbering in solid bars should be avoided if possible, and data figures should never be placed at the end of the bar. If necessary, they can be put at the beginning of the bar outside of the grid.

If bars are segmented, either on a percentage or an actual value basis, segments may be distinguished by hatched designs. If the segments range from most important to least important, density shadings, with black the most important, can be used. If segments are of the same importance, areal distinctions (see figure 10), through the use of various designs of dots and lines which give all segments about the same density to the eye, can be used.
The smaller the number of segments, the more effective will be the visualization.

When a series of data can be shown either by a line or by a bar chart, the bar chart is usually the more effective. Bar charts are necessarily simple in construction and give a complete visualization of actual data.

Ratio Charts

Ratio charts are used chiefly to show the relative or percentage changes in data. Equal upward or downward slopes of lines on a ratio chart indicate equal rates of increase or decrease (or equal percentage changes) in the data (see figure 4).

A ratio chart is made on semi-logarithmic paper, which is ruled with a logarithmic scale on the vertical axis, but with an ordinary arithmetic scale for the horizontal.

All the standard rules regarding the making of line charts apply to ratio charts, with the exception that no zero base line appears on a ratio chart. Ratio charts do not show actual changes in value of data. They will be very misleading to the average reader if this limitation is not well brought out in the discussion of the chart.

Ratio charts are excellent for the purpose for which they were devised, but should be used with caution, as they are easily misinterpreted by the average reader.

If neither ratio chart nor semi-logarithmic paper is available, the vertical scale can be laid off in the following ways:

1. By using divisions on any standard calculating slide-rule. These divisions are on a logarithmic scale.

2. By using an ordinary ruler on which the inches are divided in tenths. Let zero on the ruler equal the value of 10 or log 1.000 and 10 inches on the ruler equal the value of 100 or log 2.000. Using the first figure in the mantissa of the log as inches, with the other figures as fractions of inches, the logarithmic scale can be laid off accurately. For example, the mantissa of the log of 10 (or 100, 1,000, 10,000, etc.) is zero. The mantissa of the log of 20 (or 200, 2,000, 20,000, etc.)
is .301. Using the first figure of the mantissa (3) as inches, and the remaining figures (01) as fractions of inches, the distance from zero on the ruler or the value of 10 (or log 1.000) to the value of 20 (or log 1.301) is 3.01 inches. In the same way, the distance on the ruler from zero to log 50 is 6.99 inches and to 90 is 9.54 inches. The following logarithms are the only figures necessary:

    Value    Logarithm
      10       1.000
      20       1.301
      30       1.477
      40       1.602
      50       1.699
      60       1.778
      70       1.845
      80       1.903
      90       1.954
     100       2.000

The above may be repeated for any increase needed in the scale to fit the magnitude of the data, by increasing the characteristic of the log. The range between the characteristic of the log of the lowest value and the characteristic of the log of the highest value in the data gives the number of decks which the chart requires. For example, if the data ranged from 100 to 10,000, the characteristic of 100 is 2 and of 10,000 is 4, and their difference is 2. Two decks would therefore be necessary to plot the data.

After the vertical scale is laid out and grid lines are drawn, data are plotted as in an ordinary line chart.

By converting original data into logarithms and plotting these logarithms on an ordinary arithmetic scale, the same shaped curve will result as when the original data are plotted on a logarithmic scale. This method, however, is used only in rare cases in technical statistical analysis and has no place in publications of general character.

Pie Diagrams

Pie diagrams or component-part diagrams are used to present a number of individual items in relation to their sum (see figure 5).

In constructing a pie diagram the data should first be expressed in percentage form and the various segments should then be laid out with the help of a percentage protractor or degree protractor. Plotting of values should begin at the top of the circle.
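The arithmetic behind the protractor work can be sketched as follows. This is a minimal illustration in Python, not part of the handbook. The function name is made up for the example, and measuring angles clockwise from the top of the circle is an assumption (the text fixes only the starting point).

```python
def pie_segments(items):
    """Return (name, percent, start_degrees, end_degrees) for each item.

    Angles are measured from the top of the circle; clockwise order is assumed.
    """
    total = sum(items.values())
    segments = []
    start = 0.0
    for name, value in items.items():
        percent = 100.0 * value / total   # share of the whole, in per cent
        sweep = 360.0 * value / total     # the same share, in degrees
        segments.append((name, percent, start, start + sweep))
        start += sweep
    return segments


# Hypothetical acreages, for illustration only.
for segment in pie_segments({"corn": 1, "wheat": 1, "hay": 2}):
    print(segment)
```

One per cent of the circle is 3.6 degrees, so a degree protractor serves as well as a percentage protractor once the shares are known.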
Density shadings can be applied to the segments of a pie diagram to bring out their relative importance.

Proportionate sizes of two or more values can be shown by variations in the areas of individual pie diagrams, remembering that areas of circles vary according to the squares of their diameters.

Pie diagrams are less effective than bar charts in portraying most data because the eye measures length more easily than angular widths.

Scatter Diagrams

Scatter diagrams, also called correlation charts or two-way frequency charts, are used as an analytical device as well as a method of graphic presentation of the relationship between two variables.

The dependent variable is generally plotted on the vertical scale and the independent on the horizontal. The position of the dot representing any observation is determined by reference to its numerical value on the two scales, at the intersection of the two values projected. The scale should provide for the full range of observations, and the range of scale and type of data will determine the general shape of the chart (see figure 6).

A lack of any rather uniform arrangement of the dots indicates a lack of close relationship between the variables.

A regression line, or a line of best fit, may be drawn on a correlation or scatter diagram. Values read from this line will give an estimate of the change in one variable which is associated with a change in the other. This line of best fit may be either straight or curved according to whether or not the correlation is linear.

Pictorial Charts

Pictorial charts are composed of symbolic sketches of various objects such as men, animals, houses, or sacks of grain, designed to give an impression of the magnitude of the items. These charts are to be avoided, because there is every chance for error and confusion in reading them. Pictorial charts are generally drawn
in two and three dimensions, and the eye is not trained to interpret accurately items dealing in more than one dimension. The very nature of the pictures will give solid figures out of proportion to the data, and will serve only to confuse and at times even to misinform the reader.

Miscellaneous Charts

Numerous kinds of charts, made for various specific purposes, are not in general use in research work. As a rule these charts are a combination of the general types discussed.

Organization charts showing the organization of a unit from the head to the subordinates, calculating charts, three-variable correlation diagrams, and many other types are used in specific cases. Calculating charts and correlation diagrams must be accurate and precise. Their interpretation is limited to persons who are familiar with the subject under consideration.

Dot Maps

Dot maps are excellent for conveying the correct impression of "location" to a reader. As a rule, they are understood at a glance and are subject to few chances of misinterpretation (see figure 7).

The effectiveness of a dot map depends to a great extent upon the scale selected, the size of dot used, and the actual dotting of the map.

In determining the scale to use for one dot, the range of values in the data from the highest to the lowest must be considered in order to have a reasonable number of dots distributed over the map. If the scale is too large (that is, if each dot represents 1,000 acres instead of 100) the dots may be too few in number and the map may not show distribution as it should. If the scale is too small, too many dots may be required and variations in concentration may be lost because the whole map may be too black.

The actual size of dot to be used is determined to a large extent by the number required to be used by the data for a given scale and the space on the map available to place the dots.
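The effect of the scale on the number of dots can be worked out before any dotting is done. The following is a minimal sketch in Python with hypothetical county acreages and an illustrative function name. It assumes each county's figure is rounded to the nearest whole dot, which the handbook does not specify.

```python
def dots_needed(acres_by_county, acres_per_dot):
    """Number of dots for each county at a given scale (nearest whole dot)."""
    return {
        county: int(acres / acres_per_dot + 0.5)  # round halves upward
        for county, acres in acres_by_county.items()
    }


# Hypothetical data: one heavy county and two light ones.
acres = {"County A": 12400, "County B": 350, "County C": 40}

for scale in (100, 1000):
    dots = dots_needed(acres, scale)
    print(scale, dots, "total dots:", sum(dots.values()))
```

At 1,000 acres to the dot the two light counties receive no dots at all, while at 100 acres the heavy county needs 124; the choice between the two turns on the space available and on which feature of the distribution the map is meant to show.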
The dots must be large enough to reproduce well, individually, and must be small enough so that they do not shade large parts of the map too heavily. It is necessary from the data to decide whether it will be better to use a large scale and large dots or a small scale and small dots.

As a rule, small dots with a small scale are more satisfactory. Large dots and a large scale are used only when the dots are concentrated heavily in one area and are infrequent in others.

Symmetrical even-sized dots are necessary to make the map appear well balanced. Even contact of pen with paper, with a rather generous and even flow of ink from the pen, will generally produce the best results. A circular flat-pointed pen is necessary. Dots should not be placed in rows or bunched to form a design of any kind.

Attention should be given to the placing of dots in relation to topography. In county data, where a certain number of dots represents some agricultural feature, the dots should be placed, if possible, in the areas to which the data apply. In mountainous and rough areas, and in irrigated sections, the dots should coincide with the actual producing area, if possible. Topographic relief maps and maps showing physical features are excellent guides in placing dots.

The number of dots on a map should be checked against the original data. The use of base maps with blue county lines and names is a great help in dotting the map and in checking. The blue lines disappear in photography and reproduction.

The scale should always be indicated on the map.

Hatched Maps

Line-shaded maps, showing sections that vary in density or are of approximately equal density but of different design, are used extensively in locating data in their geographical position. The shading of the areas is a distinct help to the eye in grasping the significance of the data (see figure 8).
When a density shading map is made, with the density graduating from black to white, the most important areas (or the areas in which the magnitude of the data is the greatest, etc.) are given the greatest density. When a hatched map is made, showing areas of equal importance in a discussion, the density of the areas should appear about the same to the eye, the design being different. A series of density and areal shadings are given in Figure 10 as an example of the various designs that can be used.

Hatched maps of the United States, made on a state unit basis, are not the best for portrayal of agricultural statistics. The state boundaries do not encompass agricultural areas in many cases. County data, grouped to make larger areas and disregarding state lines, are far more accurate in portraying agricultural conditions.

Hatched maps are difficult and expensive to make. A well-shaded map, which will reproduce well, requires an experienced draftsman with good equipment. A section-liner machine for drawing parallel lines an equal distance apart is a great help.

Contrasts in density shadings are important. If too heavy shadings are used the map will appear too dark to the eye and will not reproduce well. All lines used in the map should be able to withstand easily the reduction necessary in printing. White spaces between black lines on a hatched map close up appreciably in printing, and the reproduction of a hatched map as a rule appears darker than the original drawing.

Original data for hatched maps should be subject to no changes. Few alterations of shading can be made after a map is once drawn.

A legend on the map should give the ranges of the hatchings according to the data. As a rule the fewer densities shown, the more effective the map. The range of data for each distinction should be as consistent as possible with the range for the other distinctions.
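The rule that each hatched distinction should cover a range consistent with the others can be illustrated with equal-width classes. The sketch below is in Python and rests on assumptions: the 0 to 100 percent span, the five classes and the function names are illustrative only, and equal widths are just one way of keeping the ranges consistent.

```python
def equal_ranges(lowest, highest, n_classes):
    """Split the span from lowest to highest into n_classes ranges of equal width."""
    width = (highest - lowest) / n_classes
    return [(lowest + i * width, lowest + (i + 1) * width) for i in range(n_classes)]


def classify(value, ranges):
    """Return the index of the shading class for value.

    Each class takes values below its upper limit; anything at or above the
    top limit goes into the last (darkest) class, and anything below the
    bottom limit goes into the first.
    """
    for index, (_low, high) in enumerate(ranges):
        if value < high:
            return index
    return len(ranges) - 1


# Assumed example: percentages from 0 to 100 shown in five densities.
ranges = equal_ranges(0, 100, 5)
print(ranges)
print(classify(45, ranges))
```

Keeping the number of classes small, as the text advises, also keeps the legend short enough to read at a glance.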
No printing should appear in the hatching on a map except in cases where the area is large enough to hold the lettering without detracting from the visual effect.

Circle Maps

The placing of circles in the divisions of a large area is used to portray certain features of an area. Circles can be placed in the states in the United States or in counties in a state, the size of the circle varying with the magnitude of the data pertaining to the state or to the county (see Figure 9). This type of map is not accurate in interpretation because a circle shows an area which is two-dimensional and varies as the square of the diameter. The eye does not make this comparison accurately.

The size of circles to use depends upon the limits of the space in which they are to be placed. If the circles are of sufficient size, they may be segmented into component parts to show divisions of the whole. Circles can be hatched or shaded to make individual pie diagrams. As open circles are not striking to the eye, all circles should be either shaded or black.

Miscellaneous Maps

Line and bar charts may be superimposed on a map to show certain data in geographic position; movement or "flow" maps can show the marketing channels followed by commodities; international trade maps can show the movement of a commodity between countries. As a rule, these rather complicated maps require skill in making if they are to appear well in publication.

Identification maps and key maps are used to give the setting or location of a particular area in relation to general physical features. These maps are generally a combination of a hatched or dotted map and a topographical map. The more simple they are made the greater will be their value.

Topographical maps, showing in detail the water areas, land relief (mountains, valleys, etc.) in contour, and towns, cities, roads, railroads, boundaries, etc., are valuable aids in all kinds of map making. The U. S.
Geological Survey at Washington, D. C., will supply on request detailed survey sheets for large portions of the United States showing these topographical features.

Graphic Standardization

Any organization that has use for various types of graphic presentation should consider some general principles of standardization to be used in the making of the graphs. If the original charts can be standardized, the uses to which they can be put are greatly increased.

By standardization, here, is meant the operation of a system by which the general layout, the sizes of all the charts and maps, the sizes of the lines and lettering, and the actual drawing of the chart in general, conform to rather definite rules.

It will be recognized that uniformity and completeness in a series of charts or maps, as finally published, add a great deal to the value of the work. The uniformity of lettering and width of lines, and the general appropriateness of illustrations, add much to their effectiveness in a publication.

All illustrations for publication should be made in accordance with the reduction which will be necessary. Many charts are made which will not be effective in reproduction; for instance, the lines and lettering may be either too large or too small to appear well in reduction.

Wall charts or charts of large dimensions can be made under a standard system so that they can be photographed and reproduced as illustrations for bulletins. It is logical that the larger the chart the larger the lettering and lines should appear. If all the lettering and lines on all charts, whether of wall chart size or smaller, conform to a set standard, the chart can be used for many purposes other than perhaps at first intended. Small charts can be enlarged to wall chart size and vice versa, and the result will always be a well-balanced illustration.
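The proportion between the size of an original drawing and the size of its lines and lettering can be worked out with simple arithmetic. The following Python sketch is an illustration under assumptions: the 16-1/2 inch original and 4-1/8 inch printed width are the dimensions this handbook uses for its layout figure, while the 2 mm printed lettering height and the function names are assumed for the example.

```python
def reduction_factor(original_width, printed_width):
    """Fraction of its drawn size at which everything on the chart will print."""
    return printed_width / original_width


def size_on_original(printed_size, original_width, printed_width):
    """Size to draw a line or letter so that it prints at printed_size.

    printed_size may be in any unit (for example millimeters); the result is
    in the same unit. The two widths must share a unit with each other.
    """
    return printed_size / reduction_factor(original_width, printed_width)


# A 16-1/2 inch original reduced to 4-1/8 inches prints at one quarter size.
factor = reduction_factor(16.5, 4.125)
# Assumed target: lettering that should stand 2 mm high in print.
drawn_height = size_on_original(2.0, 16.5, 4.125)
print(factor, drawn_height)
```

Because the rule is a plain proportion, the same function serves for a wall chart several times larger than the bulletin original: only the original width changes.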
The recognized principles of graphics are more easily adhered to if all charts are made under a standard system with certain set rules. A standardized set of instructions to a draftsman helps to keep errors at a minimum.

Standardized Drafting Technique

To the draftsman who is actually making the illustration, standardization of technique is of great benefit. The draftsman, as a rule, is not a statistician and should be depended upon only to make the chart as directed. If a carefully planned layout is given him, together with all sizes of lines and lettering, he can devote himself to the actual mechanics of the drafting. If all charts, then, are made by this same pattern, either by the same draftsman or by several, the final charts will be uniform and will portray the data in the most effective way.

Standardized technique increases the speed with which the chart can be made. The time required for a draftsman to determine the general layout of the chart and the size of lines and lettering is considerable. Moreover, he often has not the knowledge of the subject matter to determine the most effective style of chart to be made. As a rule, the draftsman should not be required to know all the principles of graphic presentation. His job is to follow directions, and if a complete set of directions is given him, his work is more efficient and effective.

A study of Figure 11 will give one an idea how this is done. In the lower right-hand corner of this illustration is a reproduction of an experimental layout. Such a layout is furnished to the draftsman with each job to be done. On this rough layout are definitely indicated the sizes to use in making all lines and lettering. The sizes of lines refer to the scale of lines furnished each draftsman. The size of pen is then selected to make the right-sized lines. Any good set of pens having a full range of sizes can be used in the work.
If Figure 11 is enlarged to three times the size here shown, it will show the actual size of lines and lettering to use on charts for one-half and full-page bulletin illustrations. The reduction lines indicate the size the lettering will appear in print after reduction from a 16-1/2 inch original to 4-1/8 inches. The heavy diagonal line running through the one-half and one-page layouts can be used to indicate the size of lettering to use for any sized original between 16-1/2 inches and 4-1/8 inches. The size of lettering is shown by drawing a vertical line from the heavy diagonal line down to the dashed "reduction line", and measuring the width of the space between these two lines.

All sizes of lettering are indicated in the metric system, which is in use by the Coast and Geodetic Survey, which furnishes the standard lettering sheets at a nominal cost.

In making wall charts or other charts that are larger than those here illustrated, the standard sizes of lettering and lines are enlarged in proportion to the size of chart made. In this way, no matter what the size of the original drawing, it can be reduced to a size to fit bulletin publication.

Maps and map charts can be made to conform to this standard by making the base maps to fit the requirement. Lettering, size of dots, etc., are then indicated as on an ordinary chart layout.

All charts should be accurately checked when completed, that the portrayal may be exactly in accordance with the data. Before sending them to the printer, the size to which each chart is to be reduced for publication is marked on the margin.

In an organization employing several draftsmen, a record of the time required to make all charts will make it possible to draw up a budget of costs. This time record usually discloses the fact that some draftsmen have a natural ability to do certain jobs well, such as lettering, dotting maps, etc. Jobs can then be assigned according to each draftsman's ability.
In this way the production of the unit is kept at a maximum and more satisfactory work is turned out with fewer errors than when the jobs are assigned in a routine way.

Drafting Equipment

In the drafting organization of the Bureau of Agricultural Economics, which does general map and chart work, it has been found necessary to have the following equipment for each draftsman:

Drawing table with cover, and chair
Set of Barch-Payzant pens, of Leroy or equivalent make
"T" square or straight edge
Triangles (45°, 60°, and 3 in 8)
Engineer scale rule
Lithographic needle (No. 4)
Protractor (100 divisions instead of 360°)
Bottle black waterproof India ink
Penholders
Steel eraser
Dozen thumb tacks
No. 290 Gillott pen points
No. 170 Gillott pen points
No. 303 Gillott pen points
Scale of lines adapted to type of work
Samples of spacing for lettering
Set of alphabets adapted to type of work
Set of calendar sheets, 1917 to date
Set of State maps with counties
Postal Guide
6H pencil
2 No. 2 writing pencils
1 3H pencil
1 Pencil pointer
1 Van Dyke eraser No. 6000
1 Van Dyke soft eraser No. 6500
Small blotters
1 Pen wiper
1 Pair 6" scissors
1 Roll adding machine paper

Special sized wall charts, display panels, and poster work may require brushes and paints, but this type of illustration is not used to any extent for reduction and publication.

Care should be exercised in the selection of paper upon which to make original charts. This paper should have a smooth hard finish, either cold pressed, calendered or fine textured, which will allow for at least three erasings on the same spot without wearing through. Tracing cloth can be used for some purposes, but it is not as generally satisfactory as white paper.

Examples of Types of Illustrations

The following illustrations were selected as examples of the more common types which are in rather general use.
Specific reference is made to the most important features of each illustration, with the hope that these principles may be of use in constructing illustrations of a similar nature.

Figure 1. Line Chart.

1. The lines representing data are the most important to the eye.
2. The types of lines used are according to the relative importance of the data.
3. The base line is, next to the data lines, the most important to the eye.
4. Grid lines, forming the field upon which the data are plotted, are limited to the most necessary, are of even weight, and are only heavy enough to reproduce.
5. The data curves are not labeled in the field of the chart as this might interfere with visualization.
6. The weight of the lettering indicating the horizontal and vertical scales is only heavy enough to be easily read.
7. Scale captions clearly indicate the units used in plotting.

Figures 2 and 3. Bar Charts.

1. Time series bar charts generally are vertical except in short series, whereas other bar charts are usually horizontal.
2. The space between the bars equals the width of a bar.
3. The zero base line is prominent.
4. All other grid lines are only heavy enough to reproduce.
5. The scales are clearly indicated.
6. Black or shaded bars are more striking than hollow bars.
7. There is no vertical lettering on the chart.
8. The scale unit intervals are so indicated that the approximate lengths of bars in terms of the data can be easily read.

Figure 4. Ratio Charts.

1. The general construction is similar to a line chart.
2. The 100 base line is heavy because index numbers are plotted on this particular chart.
3. There is no zero base line.
4. The plotting of all data points, whether falling on or between the indicated scale lines, is on a logarithmic basis and not on an arithmetic basis.

Figure 5. Pie Diagram.

1. Pie diagrams are not as effective in visualization as one-dimensional figures, such as bars.
2. The plotting of values begins at the top of the circle and proceeds clockwise.
3. The circumference line is heavier than any other one on the diagram.
4. Hatched designs are used to define more clearly the segments.
5. Hatched designs varying in density from black to white are used to designate the relative importance of the data represented by the segments regardless of the size of the segments.
6. Hatched designs of approximately equal density (areal designs) may be used for segments that are of equal importance.
7. Different sized circles are made to represent different magnitudes of data, the areas of the circles varying as the squares of their diameters.
8. All lettering is so placed as to be most easily read from one position.

Figure 7. Dot Map.

1. The allocation of dots is determined by the data for the smallest possible geographic unit for which data are available, regardless of the geographic divisions shown in the final reproduction.
2. The dots are placed in definite relation to topography and other related factors, within the geographic unit designated by the data, such as township, county, or state.
3. The scale was selected to give the truest visualization of the data.
4. The dots are uniform in size and large enough to be definite and clear in reproduction.
5. The dots are not placed in rows or to form any design.
6. The legend indicates what each dot represents.

Figure 8. Hatched Map.

1. The density shadings vary from black to white in accordance with the relative importance (or degree of intensity, etc.) of the data.
2. The black area covers a comparatively small portion of the map. The scale and shadings were selected so as to bring this about.
3. Hatched designs show clear distinctions.
4. The legend of shadings appears on the map.
5. The range of data for each hatched distinction is uniform.
6. Shadings of approximately equal densities can be used for data of equal importance.
7. The boundaries of the hatched designs are distinguishable from boundary lines of geographic divisions.
8. Lines and dots in all designs are of sufficient weight to reproduce and are far enough apart to prevent filling in printing. (Hatched designs which cause optical illusions are never used.)

Figure 9. Circle Maps.

1. The areas of the circles vary according to the magnitude of the data.
2. The range of sizes of the circles is dependent upon the space in which the circles must fit.
3. Black or hatched circles are more striking to the eye than hollow circles.
4. There are no figures in the circles.
5. A scale of sizes of circles in terms of areas or diameters is a help in interpretation.
6. Two-dimensional surfaces like circles are not accurately evaluated by the eye, but circles are less misleading than squares.
7. The circles may be segmented if large enough so that the segments will show in reproduction.

[Figure 1. Line chart: Cotton Production of United States, Egypt, and India, 1891-1926. Vertical scale: bales, millions.]

[Figure 2. Volume of Business and Cost of Handling Grain, 1911-12 to 1925-26, Saskatchewan Cooperative Elevator Company, Ltd. Scales: thousand bushels handled; cents per bushel.]

[Figure 3. Bar chart: Cultivated Area per Work Animal in Specified Countries (Belgium, United States, United Kingdom, Russia (European), Greece, Argentina, Bulgaria, India, Russia (Asiatic)). Scale: number of acres.]

[Figure 4. Ratio chart: Trend of Principal Agricultural Exports, 1865-1928, years ending June 30. Scale: index numbers, 1910-1914 = 100. Curves: all commodities; cotton.]
[Figure 5. Pie diagrams: wheat receipts and shipments, for all districts except those of the Pacific Coast Division and for the Pacific Coast Division.]

[Figure 6. Relation Between Cost per Pound of Gain and Profit per Head of Steers. Vertical scale: profit per head, dollars. Horizontal scale: cost per pound, cents.]

[Figure 7. Dot map: Value of Crops, 1924. Each dot represents one million dollars. Note on map: owing to changes in the crops included and in price levels, this map is not strictly comparable with the map of value of all crops in 1919.]

[Figure 8. Hatched map: Farmers Harvesting Wheat, Percentage of All Farmers, 1919.]

[Figure 9. Circle map: Sales and Purchases Through Farmers' Business Organizations, 1924.]

[Figure 10. Area distinctions and density distinctions.]
[Figure 11. Layouts showing sizes of lines and lettering.]

Class Use

The medium which takes first place in direct presentation of research material for class use is the technical bulletin. This use for technical bulletins is coming to be recognized as sufficiently important alone to justify their publication. In a great many cases, this use can be combined with use by college-trained farmers, farm journals and other equally educated people who have an interest in agriculture.

Next in order is some discussion of the contents and organization of bulletins for class use. Following are a few observations growing out of experience:

1. Such bulletins should not include elementary analysis of the sort available in textbooks in economics, marketing and the like.

2. They should include, however, a careful statement of whatever qualitative analysis is new to the project and is basic to the quantitative work following it; also reference to, and in some cases a summary of, necessary qualitative analysis that is not readily available to students; for example, if available only in a foreign language.

3. The foregoing will mean in some cases the preparation of technical bulletins in which the qualitative analysis is more significant than the data - perhaps the data merely illustrate some qualitative analysis. There is a disposition for bulletin editors, or publications committees, to disapprove of bulletins largely qualitative, or to ask that the qualitative portions of bulletins be eliminated or greatly reduced. This reflects their lack of familiarity with the social sciences. The economics department may need to educate them on this point. In the past, however, the economics department when thus treated has more often than not been trying to include more elementary discussion of principles as research material.

4. A great deal of descriptive material is needed for teaching purposes, especially in elementary courses.
Instructors should not have to spend their time in class presenting facts and data. These should be available so that the students can get them themselves by reading. Some of them are worth having students take notes upon in the reading, but for the most part, having them available for reading and later reference without copying is to be preferred. It is highly desirable that the limited class time available be spent interpreting descriptive material and data and in developing principles by inductive methods from these. The classroom discussion should be much more in terms of why than what; and the what questions should relate to application of principles and inferences.

The question will be raised at once as to whether assembling facts for class use is research. Some of it surely is not, and should be left to the teaching staff. This statement surely applies to the mere assembling of data from census volumes, or from surveys made at other stations. But that which is collected by surveys from direct sources is surely research material, but as already pointed out, perhaps of a low order. The redeeming feature is that the survey itself need not - in fact, ought not - stop with mere description; and any report of the survey needs to combine some presentation of descriptive material with the analysis of it - probably enough for most teaching purposes.

It is highly pertinent to remark here that a fact may be only a fact, and have no value outside of itself. A vast amount of almost valueless descriptive material is constantly being inflicted on students. Primitive curiosity about facts is constantly leading even our best scholars astray. One would expect economists to be especially careful to distinguish the relative values of different classes of facts and save students from those that lead to no action or are about as well forgotten as remembered.
Some facts have value only to a few who are specially concerned, or when some particular problem arises. Knowing how to locate many facts is far more important than knowing them.

5. There is a question as to how detailed an interpretation of data should be included in bulletins for teaching use. There is no doubt need for some data to be published with little or no interpretation so that the students can practice their own powers of interpretation upon them. This need can be met partly at least by publishing significant portions of the source data in appendices. Teachers and students will always be able to find other analysis to which such data can be put. Students can review the data to see if the author's conclusions seem warranted. For the most part, no doubt, the need above described will be met at each station by mimeographing such data.

6. Particularly until the time when a much larger number of adequate treatises and textbooks is available, and even after that, because treatises and textbooks are seldom in the vanguard of progress, we shall have need of bulletins of the sort that constitute essentially the text for courses. This means that they must be much more than collections of data; that they must recognize that they are pushing forward the body of qualitative and quantitative principles and statements that make up economic science, and contributing that which will later be worked into treatises and textbooks. In the newer fields of agricultural economic research - credit and taxation, for example - the foregoing need is best illustrated.

7. From the standpoint of needs of teaching, technical bulletins should include all explanation of methods used that is not available in textbooks, and even when the methods are entirely conventional, should explain the details of application to the particular problem. For example, the units and measures and the variables used should be defined.
If there are any assumptions implicit in the analysis, such as no change in the demand schedule or income varying with prices, these should be very carefully stated. There cannot be too much care given to vigorous thinking in material which is laid before students. The foregoing particularly applies to bulletins that are likely to be used in courses in statistics and research methods.

8. It is hardly possible to set before students too many bulletins which are in proper scientific form, which show on the face of them that the project has been approached and carried through in scientific order. Scientific thinking is a habit, and is learned by conscious repetition at the start.

(b) Scientific Articles

The very best of the new developments in agricultural economics should at once be published in economic journals so as to give them wide circulation among co-workers. Teachers can then bring them to the attention of their students in classroom discussion, and graduate students can be asked to read them. The best technical bulletins in agricultural economics fail to reach half the persons who have need for them. The scientific standing of agricultural bulletins is such as yet that busy people do not know what they sometimes miss when they pass over announcements of them hurriedly. Even the Journal of Farm Economics is not enough of a recommendation for the most important developments in the field.

(c) Exercises

The best practice in most cases is to duplicate data from surveys and the like. Each institution needs its own, and the material needs to be fresh.

(d) Slides

Excellent use can be made of slides for presenting charts and even tables of material which is likely to be useful over a period - such as curves of price movements, statistical demand curves, exports and imports, localization of production. But it is easy to stock up with slides that are seldom if ever used.
Bulletins and books soon publish the best of this material. The blackboard serves very well for a year or two, and by that time some better material may be available. The newer machines for throwing illustrations on a screen directly from the pages of books are proving very useful.

(e) Mimeographing

Material for courses not yet available in books can be assembled by getting a supply of bulletins together in the library; sometimes by having each class buy or send for the newer ones. Still other material can be mimeographed. The collection of mimeograph material may be increased by special sections written by the department, and eventually it may be wise to prepare a book.

(f) Books

Most persons who are working up a course have a strong disposition to feel that none of the books available meets the situation, and that they have in mind exactly the contents that are needed. It reflects, of course, their ideas as to emphasis, based on their own past experience and training. It may be a very wrong emphasis; but it does not so appear to them. Some reason more potent than a different idea as to content should ordinarily be required before a new book is launched. There must be considerable fresh material and analysis, and the prospective author should be one who has developed considerable of this new analysis himself. Ordinarily he should have published several bulletins or articles in the field before he undertakes a book. Agricultural economics needs more treatises and textbooks; but they should not be written till the material is ready for them.

Books are being published rapidly at present in the field of economics. Many new publishers have come into the field and they are grabbing manuscripts recklessly. Long-established houses are in some cases taking manuscripts to keep their new competitors from getting them. Young economists should realize that their first book largely sets their colleagues' measure of them for all time.
Two books well done mean vastly more in a professional life than a dozen ordinary ones. It is said of the writer of a few good books that he is a very able person; of the writer of the dozen ordinary ones that "he is a hard worker, but of course a second-rater".

Most of the writers on the utilization of research material in teaching stressed its great importance. They look upon the teaching as developing the future county agents, agricultural teachers, managers of cooperatives and farm leaders, and emphasize the importance of giving them sound training in economics in their college days.

One of the stations reporting on the subject of utilization of research results took the trouble to inquire into the use of their bulletin material by other departments in the college. No doubt the reports obtained were typical. They indicated that the natural science departments are glad to have data and analysis for conditions in their states. They like to have price data assembled also. Frequently they make larger use of them in the classroom. They have wanted economic material to use in their courses, but have not known how to get it. On the other hand, these other departments did not think that the conclusions were always exactly adequate, overlooking as they did important technical points. This is further evidence as to the need of joint work on research projects.

This raises the related teaching problem of how to give students the proper combination of economic and technical training, best illustrated by marketing. Assuming that studies have been published covering both the economic and technical aspects of marketing of potatoes, should the material be reviewed in courses offered in one or the other department, or in both? Or how? All possible arrangements exist somewhere. Those which make it needful for each to present its own phases of the subject, and those only, are working out best.
PART FIVE

THE ORGANIZATION OF AGRICULTURAL ECONOMICS RESEARCH

If a division of agricultural economics in an agricultural college or other institution is to function adequately and fulfil its obligations as part of the institution of which it is a part, those in charge of its destiny must give thought to a number of important problems of policy, relation to other subject-matter departments, relation to other functional departments, and cooperation with outside agencies. It is the purpose of this section of the handbook to discuss such problems from the standpoint of the agricultural economist in charge of research in his field.

Viewpoints in Economic Research

If the agricultural economist in charge of research is to think clearly on the foregoing subjects, he must first of all have in mind the different standpoints from which economic analysis may be approached. First is the distinction between the pure and the applied, a distinction that runs across the whole field of science. Pure science on the one hand is taken to mean a body of generalizations as to the relationships between things without primary concern as to whether the knowing of these relationships is of any value to anybody, as distinguished from that kind of science in which the problems attacked are those whose solutions will lead to human ends desired of themselves. From another angle, pure science is more abstract and more universalized than applied science, which, to quote Marshall, "deals with narrower questions more in detail".

These lines of distinction only partly coincide. The point of view of science for science's sake may be carried over into the details of special fields. It is being done so increasingly these days, and many fail to recognize that it may still be pure science. In general it must be admitted, however, that those working within the confines of special fields are mainly seeking for direct human ends.
There are differences to be noted in approaches to problems of serving human ends. One approach may be to find out what practices and procedures commonly seem to attain these ends and then prescribe them. Another is to discover the relationships between things from which advantageous or disadvantageous results ensue. The first is empirical; it provides the material for trade-school education. The second is thoroughly scientific in spite of its objectives. The first as a scheme of education produces a man who gets along well as long as nothing changes. The second produces the change-maker. Economics is one of the fields in which the first kind of an approach is most likely to lead to evil days. Maladjustments, which have been the theme of more of the agricultural economic discussions than anything else in the past seven years, are what result when people act without understanding. "The man who knows how will always have a job; the man who knows why will be boss."

The distinction between social and individual must also be kept clear. It cuts across both pure and applied science, and across both of the applied approaches. The pure science of individual economics relates to the economic behavior of individuals; the pure science of social economics to the economic behavior of groups. How the economic behavior of the individual is conditioned by the economic group of which he is a part lies in the border zone between these two. The social or individual point of view in applied science becomes a matter of objectives or ends. In individual or entrepreneur economics, the objectives are private - profit-seeking and the like; in the social aspects of the applied science of economics, the objectives are the larger ends of humanity - that is, human welfare economics. Very frequently indeed do these ends come into conflict with private ends.

Economics is usually described as a social science.
As a matter of fact, in strict logic it is a social science in the same way and to the same extent that psychology is. Both have an individual as well as a social aspect. Saying that economics is mostly social does not mean, however, that it does not concern itself with individual ends such as profit-seeking. It is social just as social psychology is social, or social history. Private profit-seeking is socially organized to a high degree and socially conditioned at every turn.

Some economists have in mind, when they call economics a social science, that it concerns itself with social ends as distinguished from private ends. They are thinking of social in the sense of social objectives. They are in the field of applied economics; yet these same persons are likely to shy at applied economics. They fail to realize that economics directed to social ends - welfare economics - is as much applied as entrepreneur economics. Political economy, strictly defined in the sense of state or public economy, is as much applied economics as is farm management.

The distinction between social and national ends is also important in times and places. It is very easy for the two to come into conflict with each other.

A familiar but important contrast in applied economics is that between short-time and long-time objectives. The latter are so difficult to introduce into analysis that they commonly receive too little consideration.

National and social objectives may either or both come into conflict with various regional or local objectives. Experiment stations easily drift into thinking in terms of purely local objectives. Studies may be regional, however, and still be highly inspirited with national or social objectives. The great social or national ends must finally be worked out in terms of local conditions, and research looking to such local realization of ends may have a higher order of social objectivity.
Just because a study is local in its habitat, it is not necessarily local in its aims. A study of land utilization in a limited area may develop principles or methods of analysis that have wide usefulness or application.

The Localization of Research Problems

Secondly, the person in charge of a program of agricultural economic research should do some thinking as to the geographic distribution of research problems and of agencies and facilities for working upon them. There is no particular reason that the two should coincide. As a matter of fact, they do not.

Many problems are common to a whole group of states, with perhaps, however, important local differences. For example, several adjoining states may grow hard winter wheat, but the other crops grown with it may be different. A state may think that some problem connected with wheat is sufficiently important to it to investigate it thoroughly. But if a neighboring state does the same, much of the work will be done twice. The local differences that appear when several states work together on a regional problem often are helpful to an understanding of the whole. In some cases, only a small area is involved, but it is split between two states.

Another situation is that in which several states dispose of a product in one consuming center - for example, milk in the Chicago market; or at the same time in several competing markets, as in New England. The marketing system for a commodity like cotton reaches into great national and international markets. While some important national markets may be localized in one state, they are also of interest to other states. In other cases, livestock, for example, a whole set of inter-related central markets is involved. A state like California may be interested in some foreign markets that no other state is; but this is not the usual situation.
Oftentimes it is extremely desirable that the production and the consumption be analyzed in relation to each other - and they may be at opposite sides of the continent, or even in different countries.

There are many problems common to all states, but of such a nature that they need to be studied locally - for example, rural living. One state may have a dozen sets of local rural living conditions each of which needs to be analyzed by itself. But it will be highly unfortunate if these local studies in different states are not made on such a basis that some conclusions and principles can be derived as to rural living in general. Taxation and credit illustrate this same thing. In many of these, questions of broad national policy are also involved, making it advisable that they be considered in their entirety. Some have to be considered from this point of view almost entirely - for example, the tariff and immigration.

Lastly come the problems in pure economics, both general and special, and in developing methods of analysis, which are mostly without habitat.

Localization of Research Facilities

The outstanding features of the localization of facilities for handling research problems are the following:

1. Local experiment stations find it expensive to collect data in distant markets, and are almost necessarily poorly informed as to the problems of such markets. If only one such market is involved, these difficulties are not so serious.

2. If local stations are to undertake many problems, they need data on a large scale or over a wide area which it is difficult or expensive for them to obtain.

3. Local stations are constantly under pressure to undertake local and immediate problems, and those with individual as distinguished from social objectives. Extension work is necessarily local in large measure, and more than present available resources are needed to conduct the research which is basic to the local extension and teaching programs.

4. Local stations are usually not adequately enough supported to undertake expensive research in problems requiring attack on a large scale or on a wide basis.

5. Relatively few research workers in local stations are broadly enough trained to handle many types of problems.

6. Local research staffs are sometimes not strong enough to handle even the local aspects of some regional problems.

7. In some cases stations located close to central markets or important consuming centers have staffs capable of undertaking this end of problems reaching into them; but never enough of a staff to handle all such problems. The state authorities do not ordinarily appreciate opportunities of this kind and develop their departments of agricultural economics as they might.

8. There is therefore great need for research agencies which can operate over a wide area and on a larger scale; which can also undertake broad national and international problems of agriculture.

9. Since many problems require joint effort of several stations, there is need for some outside coordinating agency and also for leadership.

It is clear that the United States Department of Agriculture is the agency best situated to supplement the work of the states in most of the respects above noted.

There are several reasons, however, why the U. S. Department of Agriculture alone does not suffice. These are as follows:

1. It is far from being sufficiently supported to do all this work.

2. In problems of broad economic interest, it is highly desirable that several different agencies work more or less independently in order that all approaches and points of view receive consideration.

3. There is danger that a department of the Government will get the reputation of representing the point of view of the interests it represents and have its findings discounted in consequence. Too often such departments actually deserve such a reputation.
There is need that some agency should be "attorney for labor," or "attorney for agriculture;" but it should not be a national research agency.

4. The U. S. Department of Agriculture is constantly under pressure for results that will be immediately helpful to some group. Under these circumstances, its work tends to lose much of its scientific quality and to become empirical. Problems of fundamental research value tend to be pushed into the background. This is more true in the field of economics than in perhaps any other field.

5. The policy of the U. S. Department of Agriculture thus far has been to insist upon men for its economic staff who know the field and the commodity in a practical way; also who can establish good contacts. One does not want to decry the usefulness of such qualifications for a large part of its present work; but if it is to assume as one of its paramount obligations also the developing of economic science, then it must select some of its staff primarily because of scientific training and ability. Until such a time as it changes its policy in this respect - perhaps it ought not to change it - there will be great need of other agencies doing research work on more fundamental lines than it can pursue.

The problem of division of the field between agencies and of cooperation between them is to be considered in a later section.

Research Policy and Program

The division of agricultural economics which has thought through its problem from the two foregoing standpoints is in position to formulate its research policy and program.

The statement of policy which a new agency formulates at the start of its operations is bound to be highly tentative; but still it should be made. Frequent inventories of policy and program should be made at later stages, so that any departures taken from former plans will be recognized and analyzed as to their desirability.
Changes in personnel, in financial support, or in economic conditions may often make changes necessary in the policy and objectives of research agencies in the field of economics, but all important changes should be consciously made and for sufficient reasons. A wavering, uncertain research program in any field is extremely wasteful of human energy.

The broad general alternatives in the matter of policy which are open to a research agency are the following:

1. Development of the science of economics vs. attacking specific problems of amelioration.

2. In the latter case, attacking local problems, as within a state, as against attacking sectional or national or international problems. The first of these means essentially, in the case of a state experiment station, the working out of a state research program looking directly to improving conditions especially within the state.

3. Research in private vs. public economics - in problems of administration of private enterprises vs. those of public agencies.

4. Current and temporary problems vs. those of long standing and likely to be persistent.

5. Controversial issues of the day, such as the McNary-Haugen plan, vs. problems of a more general nature.

6. A largely opportunist policy, choosing projects in response to public interest in them, or to suggestions from without, vs. a long-time program based upon a careful analysis of the needs of the field.

7. Cooperation with other research agencies in larger projects vs. independent work on projects not requiring such cooperation.

8. A policy looking to research results only vs. one which develops research workers in addition.

9. A policy which coordinates research with the needs of the teaching and extension program of the institution vs. one which considers research objectives only.

10. Whether little or much attention will be given to improving research methods.

Obviously these objectives are not all mutually exclusive.
A project designed primarily to meet a situation in one section of a state may at the same time illustrate and clarify the application of an important principle of economics - for example, it may show how the principle of comparative advantage works out under conditions of lagging readjustments; or it may develop a hitherto unused method of measuring lag. Many states often have the same problem, and a solution for one state is of great service to the rest.

Even though some of the foregoing objectives are mutually exclusive, this does not prevent more than one of them being included in one policy. For example, a program may include projects in the fields of both private and public economics, or both local and national problems. Neither does a policy need to be altogether systematic or altogether opportunist. It can be the former and yet sufficiently elastic to provide for important emergencies.

The policy which any division should adopt depends upon many things. A division in a new area with many different problems of developing a profitable system of agriculture for its state can be largely excused if for a time it gives little attention to the needs of the science of economics. A division in a small state should properly consider itself as having an obligation to devote some portion of its federal support to the larger regional problems or to developing the science of economics and its methodology. The more ample the support which the work in economics receives, other things being the same, the more attention should be given to the latter sort of problems.

Divisions of agricultural economics so located or developed as to have attracted strong groups of graduate students should plan their research programs fully as much with a view to training research workers as to obtaining direct research results.
No condemnation can be too severe of an institution which is so narrow or blind that it cannot see the opportunity for service that such a situation provides, and refuses to support its department of agricultural economics on this basis. Institutions with departments made up mostly of younger men without research experience should in general work upon narrower and more local problems. The same is equally true if the staff is largely made up of older men without special training in economics and statistics. A department with meagre support may find it best to concentrate on more local and immediate problems until such time as, following such a policy, it secures more ample support. On the other hand, if such a department has a staff of high-grade men well trained in economics and statistics, its forte may be to concentrate on projects of a more fundamental nature, and neglect somewhat the immediate service needs of the local situation.

With the foregoing by way of introduction, let us consider some of the above alternatives in more detail. The science of economics has reached a stage where careful detailed work with the behavior of its principles under the greatly varying conditions of time and place, field of production and sphere of consumption, and institutions is its urgent need. Except in the sphere of dynamics, it is not probable that many really new principles will be stated; but nearly all of them need to be stated more explicitly, and also more particularistically for different situations and conditions. Only when so stated are they especially useful for those who have tasks of administration either in the public or the private field. The present is an era of "applied" economics, using the term applied as Marshall used it in his footnote on page 37 of his 1910 edition:

"Some parts of economics are relatively abstract or pure, because they are concerned mainly with broad
general propositions: for, in order that a proposition may be of broad application it must necessarily contain few details: it cannot adapt itself to particular cases; and if it points to any prediction, that must be governed by a strong conditioning clause in which a very large meaning is given to the phrase "other things being equal."

"Other parts are relatively applied, because they deal with narrower questions more in detail; they take more account of local and temporary elements; and they consider economic conditions in fuller and closer relation to other conditions of life."

In a word, then, reducing principles of economics to a working basis is the major task of the moment so far as the science of economics is concerned; and agricultural economics is in the position where it must make a larger contribution along these lines than any other division of applied economics, because the conditions or combinations of conditions and the situations it presents are more likely than the rest to be somewhat unique, or to place an unusual emphasis upon certain aspects of the principles. At the same time, the whole organization of research in agricultural economics in this country is especially favorable for developments along the foregoing lines. The results desired are to be slowly formulated out of a large number of studies organized on the same area or commodity or period-of-years basis with which we are already so familiar. To make these contribute to the ends desired, it is only necessary to give many of them a more scientific cast and methodology, and more of the points of view and objectives of the higher orders of research. Almost any such study, if planned and supervised by a man with grasp of economic relationships and command of research technique, can be brought sufficiently within the sphere of economics as a science to add a little to its progress.
But the objective of developing the science of economics should not be entirely subordinated to the position of being incidental to the more immediate and popular ends. Projects should be developed whose aim is principally to discover the workings of specific economic principles, or to assist in a more explicit formulation of them, or a formulation more particularly in terms of agricultural conditions. The data examined will of course relate to a particular period or area or commodity, and other results will be obtained which will be of service to those persons or agencies specially concerned with this period or area or commodity; but this service is essentially a by-product.

A similar statement can be made with respect to methodology as a research objective. A study which slavishly accepts the units and measures and forms of analysis of earlier projects contributes little if anything to progress in methodology. No doubt there has been more of this in the past than has been desirable for the long-run usefulness of the science of economics. It has probably been necessary because of the newness of the field and the scarcity of men broadly enough trained to be able safely to prosecute independent modes of research. Pressure for immediate results of use to local interests has been another factor. An unwise over-emphasis on standardization in research method has been another factor. Standardization in research is dangerous at any stage. It is particularly dangerous in the early stages of growth, checking all progress and producing a stunted dwarf science. A handbook such as this one might easily be could do vast injury to agricultural economics by presenting definite rules of procedure. Those responsible for it have had this thought consciously before them at all times. Their wish is only to summarize the experience in methodology to the present, and present this as a foundation upon which later progress may be more effectively reared.
In contrast to the foregoing, almost any project in economic research can be so planned and executed that it adds something of value to research experience. It may be nothing more than further evidence as to the accuracy of survey answers obtained to a certain type of question. Even this is in contrast with the too frequent practice of assuming that because others have accepted answers to the type of question as valid, or even have proved them to be valid for their study, they must be valid for the present study. Those in charge of this handbook have been trying to get together the results of experience with all sorts of research procedures, but they have met with disappointing results. It is very clear to them that few indeed of those who have prosecuted studies have sufficiently tested the validity of their results, or have sufficiently analyzed their project from a methodology point of view to see what light it throws on methodology in later projects of the same general type. Or even if they have, they have not thought it worth while to reduce their conclusions on the subject to definite form and publish them. In consequence, much of the value of many studies has been lost. A majority of the projects of the future should be set up with a view to making contributions to methodology as well as those of the more usual sort, and the results in this sphere should be published whenever possible, even if as nothing more than notes of a page or two in journals. Needless to state, such material is of great value in the same institutions in courses in statistics and research method.

In addition to the foregoing, projects should be set up whose objective is primarily the development or testing of research methods. An example of this is U. S. D. A. No. 1277, "Input as Related to Output in Farm Organization and Cost of Production Studies." Some of the price analysis projects have been of the same nature.
So far as research devoted to serve local needs is concerned, the trend today is strongly toward relating it to a definite long-time program of continuing adjustment to changing economic conditions as set against the background of the physical environment. This has meant, and wisely in most cases, a more careful survey and analysis than had previously been made of the physical facts of climate, soil, topography, systems of farming and the like as a basis for later and more essentially economic analysis. It also means setting up machinery or making such other arrangements as necessary for securing the data of economic change: of acreage, production and yield of crops, of prices of farm products, wages, prices of machinery and supplies, amounts of labor, fertilizer and other supplies used, of market receipts and shipments, taxes, land prices, rents, etc. For all of these changes, suitable measures must be established and kept up-to-date. Out of an environment thus analyzed, it is easy to select that area or commodity where special studies in more detail can be most advantageously executed.

Carrying through such a program as the foregoing may be more of an undertaking than is possible with the financial support provided in some states. It surely is not more than should be possible if available state and federal funds are properly apportioned as between the economic and other work. If the funds are not available, the best procedure is probably to mark off a portion of the field - certain commodities or certain areas or certain divisions of the field of economics - for special analysis. Under such circumstances, many states will do well to omit projects in transportation, public finance, credit, land tenure, etc., from their research program.
Research in matters of national policy, in public administration as it relates to agriculture, in taxation, railway rates and the like, can ordinarily be handled best by those institutions in which the department of agricultural economics can cooperate with a general department of economics in the same institution. Deficiencies in this respect may be partly overcome by working closely with the U. S. Bureau of Agricultural Economics. There are largely local aspects of some of these problems - taxation, for example - which can be handled effectively on this latter basis. Also any division of agricultural economics may have some member of its staff who is specially trained in one of these fields.

If it is the duty of state agricultural institutions to assist with the handling of temporary problems or take a hand in the controversial issues of the day, then it is the function of the research staff in economics to collect and analyze whatever data are available and pertinent. Work of this kind should not be turned over to extension specialists, but should be done in close cooperation with them. If accepting such a responsibility as the foregoing takes an unduly large part of the time and energy of the research staff in economics, then the institution is either concerning itself too much with such problems, or it is not supporting its department of agricultural economics sufficiently. Most heads of departments of agricultural economics will testify that if they concerned themselves with a third of the current and controversial issues that arise, the whole energy of their staff would be absorbed by it, and there would be little left for teaching and more vital research.

The policy of training research workers in conjunction with actual research has serious limitations as well as advantages. Carried very far, it involves a tremendous burden upon those in charge of the project.
But where one trained and competent leader is in charge of one, or at the most possibly three projects, these not running in the same phases contemporaneously, so that he can give close supervision to the details of each, then the two functions can often be combined effectively. The personal equation is an important factor in such a question. Some otherwise competent research workers have single-track minds and can think of only one project at a time, and follow closely nobody else's work except their own. Others benefit from directing more or less parallel studies.

Research must be learned in large part by doing. In many fields of study, about all that is needed is to turn a graduate student loose in a library and let him work out his own salvation. The research in economics that promises results these days is mostly with data. Often these data must be collected at great expense. The necessary calculations with the data are often so laborious that graduate students can perform only a small part of them except at a great waste of their precious time in school. Training graduate students for their work with research in economics therefore requires either that they be set at work on some practice exercise, or that they be closely supervised on real problems with real data. The former is suited to formal courses in statistical method; the latter to research seminars and to theses. Any department of agricultural economics which expects to give men their final training for research must work out the technique of combining research training with actual research work. Experience in collecting data is the easiest to provide. It is with the analysis that the real difficulty comes.

So much in need of improvement are the curricula in agricultural economics these days that it may well become a major objective of the research work of some departments for a time to develop material to be used in certain courses and ultimately in text books.
Since the major emphasis in teaching economics should always be on principles and analysis, such an objective need not interfere in the least with research aims of a high order. If the teaching ideal of the department, however, is mainly to supply the students with data, a different statement might be needed.

No discussion should be needed as to the desirability of a research program that provides the basis for a broad fundamental extension program, both based on the same careful outlook analysis.

Classification of Research Projects

A division of agricultural economics also needs to have clearly in mind some classification of its field as a basis of organization of its research program. The classification which the Advisory Committee has adopted for its analysis of scope and method is the following, which is in terms of the fields of agricultural economics as these stand forth in the minds of workers in the field and reflect themselves in the names of courses and research projects:

Farm Management
Marketing (including cooperative marketing)
Prices
Land Economics
Consumption Economics (including rural living)
Transportation (including roads)
Credit
Insurance
Taxation
Rural Population
Commodity Studies (combining production, marketing, prices, etc.)
Geographical Studies (including area studies combining production, marketing, etc.)
Historical Studies
Agricultural Income
National Agricultural Economy (including especially national production programs, the tariff in relation to agriculture, relation of agriculture to industry, immigration)
Outlook Studies (foregoing focused on the problem of forecasting impending developments)

It is assumed that most of the projects relating to market prices will be included under Prices. Consumption Economics includes research in factors influencing consumption as well as the consumption of rural people themselves.
It is realized that some aspects of rural living will more properly belong under Rural Sociology. The same is true of Rural Population. The category Commodity Studies has been included to take care of projects organized at some stations in which all aspects of one commodity are studied. Geographical Studies include, among other things, area studies, in which the area instead of the commodity is the base of organization. National Agricultural Economy of course includes many aspects of other classifications, but as used here is meant to provide a place for projects of this nature not otherwise taken care of.

Research Outlines by Fields

As soon as possible after research work is undertaken in any one of the foregoing fields, a research outline should be prepared for it to serve as a guide in mapping out projects and keeping them in proper relation to each other. As a guide to making such an outline, there is here submitted a sort of generalized outline for one field, that of farm management:

I. Description of farming in a given area, or of the production of a given commodity. The description of farming indicates the products produced and in what amounts and proportions, and the methods used in producing them. Sometimes called "type of farming" studies.

II. Determination of what products to produce, and in what proportions, in order to secure the largest net income. Also sometimes called "type of farming" studies, the difference from the foregoing being that here the objective is not merely to describe the type of farming, but to determine which type of farming pays best so far as combinations of products is concerned. Projects under this head should include types and qualities of each product as well as different species of product - e.g., Grade "A" vs. Grade "B" milk.

III. Determination of how to produce each product at lowest cost consistent with highest net income to the farm as a whole. Projects under this head are variously named.
Many are called projects in "farm practices". It will be impossible to give an outline of them that is either complete or entirely logical. Let the following be interpreted rather as a list of types of projects coming under this head:

    A. Those relating to choice of the cost factors:
        1. Whether to use tractors or horses, combines or binders, etc.
        2. Whether to harvest with a machine or with man-labor.
        3. What type of machine, horse-labor, man-labor, etc., to use.
        4. What varieties of a crop, or breed of livestock, to use to turn out a given type of product.
    B. Those relating to choice of operations: disking vs. plowing for small grain; hogging off vs. husking corn; stacking vs. shock-threshing.
    C. Those relating to time of operations: when to plant, when to fatten, etc.
    D. Those relating to the quantities of various factors, such as labor, seed, feed or fertilizer, to use. Applied to labor, such projects take the form of determining how many times to milk per day, how many times to cultivate the corn, to spray the apple trees, etc. Such projects have recently been called "input-output" projects.

IV. Determining the most advantageous size of business, or scope of operations, for farmers of various capacities, and in general under a given set of economic conditions.

V. Determining all of the foregoing with respect to a particular product, usually called "enterprise studies" or "commodity studies", although the latter usually include more than the production aspects only.

VI. Regional or area studies, in which all the foregoing, or a considerable part of them, are determined for the major types of farms and products produced in a given area.

VII. Outlook studies, in which attention is focused upon the changes in all or some part of the foregoing which are necessitated by changing economic conditions.

VIII. Determining the most advantageous layout of the farm and farmstead and farm buildings.
IX. Determining the best leasing arrangements.

Relationships between research in economics and in other subjects: (a) the natural sciences

Another important organization problem which an economics division needs to consider is the relationship of its subject to other subjects handled in the same institution. The distinction between economics and natural science has already been made. It remains to show how research in economics and the natural sciences, more particularly the applied natural sciences, the cultural and husbandry courses, are related to each other in actual research problems. Probably the best way to discuss this is to take a number of concrete cases.

Let us start with the so-called intensity studies, in which costs and profits are related to the inputs of certain factors and the resulting outputs. Research in this important field seems to be held in abeyance at present because no understanding has been reached as to the relationships involved. Clearly it is the task of the natural science departments to conduct the experimental work in which output of milk is related to input of feed, or yield of cotton to input of fertilizer, or yield of crops to amount of irrigation water. Clearly it is the work of the economics departments to collect and analyze the necessary information as to prices of the input factors, prices of the products, farm organization and relationships to other crop and livestock enterprises, and to estimate the probable effect on farm receipts and expenditures, and on the farm business as a whole, of different combinations of the input factors. Both sorts of analysis are needed for a complete answer to the question.

The curves of output per unit of varying inputs which are needed for such analysis can be derived either experimentally or statistically. Experimental methods will give the results in purer form.
Statistical method is not likely to be able to measure enough of the varying factors, or measure them with sufficient precision, to give very high accuracy to the statement of effect on output. On the other hand, the statistical method, under reasonably favorable circumstances, can include types of variables that cannot be introduced into an experiment, which exist under actual conditions of production. The results obtained may therefore more nearly fit actual conditions than those obtained from controlled experiments. Moreover, we must not exaggerate the accuracy of many experimental results. It is seldom possible to control all the conditions. Measuring the remaining factors does not help greatly so far as one experiment is concerned, because not enough cases are included to make it possible to isolate the effect of differences in them. The results obtained can be very accurate for the conditions under which the experiment was conducted, but how they fit other conditions cannot be determined from the experiment. A statistical study of fertilizer effects on a large area may take account of and measure the effect of varying rainfall; an experimental study gives results with one given quantity of rainfall. If the experiment is continued from year to year so as to include different amounts of rainfall, then diseases and pests or growing season may not be the same.

From the foregoing it appears that the experimental and the statistical method are valuable complements of each other. The results will not always check; but this merely becomes an occasion for further work on the problems: by the experimentalists to see if the special circumstances under which their trial was conducted did not give these particular results; by the statisticians to see if variables not included have not modified their results, or whether their measures or data are not defective.
To illustrate the latter, if pasture feed is omitted from a statistical analysis of steer feeding, the effect of rate of feeding grain may be modified greatly by the fact that the slower the rate of feeding, the longer the steers run on grass.

The economist needs his results from such studies in the form of curves so that he can locate the points of lowest-cost and highest-profit combination and so that he can estimate the effect of different rates of input on gross and net farm income. The experimentalist can give him these results in the form of curves, provided he arranges his set-up accordingly. But this is something which the experimentalist has not been accustomed to do, and only with difficulty in most cases can he be made to change his procedure. It must be admitted also that it is not always easy for him to arrange his set-up in the necessary form. But vastly more can be done in this direction than has been done in the past. Also there is no reason why the natural science departments should not use the statistical method, which gives the desired curves directly, as a complement to the experimental method. Those that have a prejudice against the statistical method must lose it and learn the necessary statistical technique and its underlying philosophy. In many cases, a statistical attack on the problem may be the next step needed.

In many cases, no doubt, the natural science departments will prefer to confine themselves to the experimental method, and will gladly have whatever statistical work is done on the problem handled by the economics staff. But the economics staff will make a failure if it does not have the assistance of the natural science departments at every turn.*

One might analyze in a similar way the problem of combination of enterprises. The effects of different crop rotations on yields and soil maintenance are problems for the departments of soil, agronomy, horticulture, etc.
The economist must have the results of such studies in order to complete his analysis of the effect of different combinations on gross and net farm income.

All research in the economics of land utilization must lean heavily on information as to soil and other physical conditions that must be accumulated by natural science departments. Similar analysis is also needed as to the relation between temperature, growing season, rainfall, soil, etc., and the growth and yield of the different crops. But the final responsibility for statements as to the agricultural possibilities of a section should rest with the economists, for only they can have made sufficient analysis of markets and trends in competition and prices to hazard the forecast involved. It will not be safe to have a soil specialist whose forte is peat soils make predictions as to what is going to happen to the peat land of his state in the next fifty years. But he has the most important contribution of any to make to the final answer to such a question. The problem of land classification in the final stages is an economic problem.

* For further understanding of this type of research problem, see Wisconsin Research Bulletin No. 79, "Practices Responsible for Variations in Physical Requirements and Economic Costs of Milk Production on Wisconsin Dairy Farms", by M. J. B. Ezekiel, P. E. McNall and F. B. Morrison.

Marketing offers an abundance of problems in which economic and natural science analysis are closely related. The early research of the U. S. Department of Agriculture in the field of marketing was in problems requiring natural science technique mostly: grading, sorting, packing, shipping, storing, etc. Economic analysis of marketing problems was largely introduced when Henry C. Wallace became Secretary of Agriculture. The present staff of the Bureau of Agricultural Economics engaged in marketing research includes both economists and men who know their commodities.
The two need to work together on many projects: grading, grade price differentials, marketing channels and margins, problems of marketing business units, etc.

(b) Relationship to other social sciences

Agricultural economics has close relationships to political science and sociology. In fact, so close is the relationship that in many institutions whatever research is done in these two fields is done as part of economic projects.

The task of administering the affairs of a township or county is usually considered a concern of the political scientists. The political theory of the function of the state is basic to the whole program of regulation and service in the interests of agriculture. But there are aspects of it which are most certainly economics. For example, whether it is the best use of the social income of a township or county at a given period to devote part of it to building a town hall or bridge or improving a highway, or whether to use the funds in one of these ways or let the farmers have the money to pay off their mortgages, are essentially economic questions. Assessment of farm real estate involves economic analysis principally.

The relationship between agricultural economics and sociology is best discussed when the term sociology is defined rigorously so as to include only group analysis: group structure, group behavior and group functioning. It then becomes a basic science like psychology upon which applied economics draws for the solution of many of its problems. In turn, sociology is able to draw upon the principles of economics for an explanation of many things in the structure, behavior and functioning of groups. The migration of rural population to the city, for example, must be explained in large part in terms of the economic principle of comparative advantage. The differences between the rural and the urban family are partly based on economic factors.
Examples of economics making use of sociological data and principles are found in studies in farmers' attitudes toward economic issues, membership relations in cooperative organizations, farmers' movements, the economics of rural consumption.

Much of what is included under the term "rural life" is more largely economics than sociology. The content of rural living is essentially an economic phenomenon, the result of the efficiency with which human resources and the resources of nature are utilized in the productive process on the one hand and in consumption on the other. The essential idea of economics is utilization. The principles of economics are the principles which explain how to utilize human and natural resources so as to secure the maximum satisfaction of human wants from them, satisfaction being understood to be equivalent to human welfare as Pigou uses the term in his "Economics of Welfare".

There are many things about rural life, however, which cannot be understood without a proper understanding of group behavior and functioning. The standard of living, for example, properly understood as the content of living we strive for rather than that which we experience,* must be explained largely on sociological grounds. The sociologists as yet have made little study of standard of living as a proper sociological concept.

Much of the confusion as to the sphere of sociology comes from an erroneous concept of certain values in living as being social as distinguished from economic. Economics is a social science, and the values it contemplates as ends are social values, that is, human welfare values. Values in the purely sociological sense, if there are any such, are values which relate to the functioning of the group, as such.

Psychology stands in the same general relation to economics as sociology. It provides the basic explanation of human behavior which is needed for an analysis of many economic problems.
(c) Relation to home economics research

The economics of home economics as a field of teaching and research is about where the economics of agriculture was in 1910. Here and there departments of home economics in agricultural colleges are beginning to realize fully that the household side of the farm family economy has about the same range and types of economic problems as the income-producing side of it: its problems of economic organization, layout of work, financing, marketing (buying), etc. In some institutions, the department of agricultural economics is taking an active part in developing the work along these lines; in others not. That part of it which relates to the way in which the farm income is administered as between farm and family uses, as between present and future, as between major divisions of the budget, is in many institutions considered as economics of consumption as set over against the economics of production of farm products.

* Davenport's definition: "A level of consumption so fixed that any falling short is felt as a privation."

The line of distinction between the economics of home economics and the rest of the work in home economics is the same as was pointed out for agricultural economics and the applied natural sciences of agriculture: the economics of home economics concerns itself with the value and price aspects of things and the organization of productive resources of the home in such a way as to secure maximum utilization of them. It begins like agricultural economics with certain resources in capital goods, labor and current income, and aims to secure the best possible utilization of them, values based on alternative uses and prices being the special desiderata.

It will be apparent at once that many of the research problems of the economics of home economics require a combination of economics and natural science analysis.
The most conspicuous examples of this are in the buying side of the household economy. Questions as to proportions of the different kinds of foods to buy, what types of fresh fruits and vegetables, what sizes of oranges, what classes and grades of eggs, what kinds and cuts of meat, what types of dress goods, or what types of floor or wall coverings, require a high order of understanding of the facts as to the physical or chemical properties of these things and of the needs which they are to fill, as a basis for the final economic choice, in which these considerations are set over against price and the various alternative uses of income.

Once studies in rural living get beyond the point of finding out how farm people spend their incomes and attempt to develop a better understanding of the principles upon which such spending should be done, the analysis of the technically trained home economist will become as necessary as that of the technical agriculturist in input-output studies with feeds and fertilizers. Rural living studies should soon pass out of the descriptive phases in which most of them are at present.

It is highly significant that a very considerable proportion of the Purnell projects in home economics related to problems in technical home economics. Congress in passing the Purnell Act assumed that home economics was economics. What they undoubtedly had in mind were the problems of rural living in which economic considerations weigh heavily. That the home economists have not seen the matter in the same light as Congress is clear evidence that they have narrowed their thinking down to the technical or natural science aspects of things, just as the agriculturists had done in the long period from the establishment of the first agricultural college until the change came perhaps ten or fifteen years ago.

(d) Relation to forestry research

Forestry has its economic problems no less than agriculture and of the same general order.
But thus far they have received little attention. There are few trained "forest economists", men who have acquainted themselves with the technical aspects of forestry and are also trained in economics. The outstanding problems of forestry in the United States at present are economic, those of determining, in view of the prospective prices of land, labor, capital and the other cost factors on the one hand, and the prospective prices of timber products on the other, what kinds of woodlot and forestry enterprises are likely to be profitable in the next fifteen to one hundred years. There is only a limited amount of information available that helps to answer questions of this kind. Many of the leading thinkers in the field of forestry are awake to this situation, and funds for the expansion of economic research work in forestry are being vigorously sought.

Relation of Research to other functions

It is equally necessary to have a clear set of ideas as to the relation of the research to the other functions: teaching, extension, etc.

(a) To Teaching

It should go without explanation that the teaching in any subject can be on no higher plane than the research. In any one department it may be on a higher plane as a result of making full use of the research results of others; but not for all departments combined. Many teaching departments are to be criticized for not following closely enough the important researches in the field of economics and incorporating their results and analyses in their courses. One of the reasons for this is that the methods followed in the most significant studies are often beyond the reach of many of the teachers. This will be remedied as more and more of the teachers have advanced training. Inadequate training in economic theory prevents many teachers from giving a sufficient analytical content to their courses. This is no doubt the greatest weakness of the present teaching in agricultural economics.
That agricultural students must be trained in power to analyze new economic situations as they arise, rather than supplied with rule-of-thumb procedures and evanescent facts and data, has already been sufficiently emphasized. But facts are needed as well as analyses. Teaching should never, unless absolutely necessary, be in terms of hypothetical data. The more realistic the data the better, the more nearly they describe the environment from which the student has come and to which he will return. The material for the problems of analysis that are set up should be actual facts and data.

There are also other facts and data, especially those of a physical sort, which are sufficiently permanent in their nature so that the student should be taught them thoroughly. In this category are the facts as to marketing channels for different farm products, the curves of output per unit of input for the different input factors in the region where the students live, the usual conflicts between enterprises in the use of labor. The student when he leaves the campus should also take with him an understanding of the salient facts in the present environment, even though some of these are highly impermanent. Without this he will not be ready to put his hand to the plow in his first undertaking. But even more important is it that he know about the current sources of data of economic change and be taught how to work them into new analyses of each year's operations, of each new economic situation.

On the side of agricultural economics which relates to questions of public policy, progress is restricted by a lack of an adequate understanding of the facts as to the economic organization of agriculture, and the economic condition of farm people in general and in different sections, as to such public questions as those pertaining to taxation, the credit needs of agriculture and their proper provision, land utilization and land tenure, the structure of land values, the relation of tariffs and immigration to agriculture. It is highly important that the facts as well as the analyses relating to these shall be developed and taught to all agricultural college students. The required courses in economics should include this sort of content above all else.

The foregoing analysis of the teaching needs of agricultural economics furnishes the basis for a discussion of the research which is required for it. Strongest emphasis must be placed on research which will develop the principles that constitute the tools of analysis, not the broad abstract principles of pure economics, but those of the applied science, as defined by Marshall, of agricultural economics. It is this most of all which is needed to give the courses in agricultural economics solid content, to save many of them from the charge of being thin or largely descriptive. Next in importance at present is research which will provide an adequate understanding of the condition of agriculture and broad social questions relevant thereto. Both of these needs require looking at research on a wider basis than a single state.

The research program confined within a state, so far as teaching needs are concerned, should be aimed at collecting data of a cross-section of the different systems of farming in the state, of marketing organizations and marketing practices, of land utilization and tenure, of credit needs and facilities, etc., so that these can be used as material for analysis in courses, and to acquaint students with the important facts so that they will know their environment when they go out into it. It should also give them data of changing local conditions which they can be taught to use in analysis. However, if the local research organized along these lines produces merely facts and data, and does not organize these in relation to principles, the larger part of its value is missed. The courses using results of such research will be as thin and as descriptive as those being generally condemned at present.

As already suggested, it is desirable at times that much of the research in some lines be built around the needs of specific courses badly in need of being improved. Only in this way can the teaching in many fields, notable among these marketing, be developed into something substantial.

(b) Relation to extension

The extension service of a land grant institution needs to keep informed fully as to conditions within the state and to follow closely the changes in the economic environment. More important still, it needs to understand these changes. It is the job of the research staff in economics to collect and analyze and interpret all this information, and of the extension service to fit it into its current educational programs. The research program of the economics department should be sufficiently elastic so that whenever an emergency situation arises, the extension division can call upon the research staff quickly to assemble the necessary facts and analyze them. If the research staff is properly organized for collecting current information, it will usually be able to supply most of the information needed without sending men into the field.

The practice of working out a state program is becoming increasingly general. Mr. Marquis of the Office of Information of the U. S.
Bureau of Agricultural Economics reports that in 42 states the extension service made special arrangements for disseminating the national outlook report. Such programs should have a short-time or "outlook" aspect, and also a long-time aspect. Once such a program is adopted, it is the business of the extension service to present it to the people of the state, and of the research division to carry on the research fundamental to it. This means a continuing program of research, an important part of which is the analysis and interpretation of current developments. The research staff should take a major part in the formulating of both the long-time and short-time features of the program. It is their records and their interpretation of them which will furnish the basis for the outlook statements.

Extension work is becoming much more frankly educational than it was. More use is being made of schools. One of the important functions of such schools should be to present the results of the most recent research projects in the same or other states.

Extension work in the past has been too largely a matter of talking, of getting groups of farm people together in school houses and town halls and talking to them. Extension men have often been selected because of their ability as talkers. It is now taking on more and more the aspect of programs, "drives" and demonstrations. A program or "drive" is something that should be planned carefully in advance, frequently requiring a certain amount of research as a foundation for it.

(c) Relation to regulatory activities

Regulatory bodies at the start seldom realize the importance of a real understanding of their problem such as can be had only after careful research in the foundations of it. This has been illustrated many times in the experience of the United States Department of Agriculture, of late particularly by the Packers and Stockyards and the Grain Futures administrations. The men selected for these
The men selected for these 1.5“.» . ”426 administrative roles are selected principally, as should be, because of their administrative ability. But it is possible for one to be a good administrator and yet appreciate the function of research and have some understanding of its methods. Of course all such adminis- trate rs would deny absolutely the charge that they do not appreciate the importance of research to their work; one has to look to the re- sults of their administration to discover the true situation. Often a. research department is set-cup; but for some reason or other, either because of the men selected for it, or of the tasks assigned them, it too often fails to produce any effect on administrative policies. (6.) Relation to fgt- The job of fact-gathering is _ _ gathering activities. most expeditiously and economically done when it is standardized and re- duced to routine. Changes are expen- sive and bothersome. New departures are alWays experimental. The research man does not recognize all the difficulties in the way of collecting certain information. He thinks of what he would like to have, and is inclined to assume that. the rest is merely a matter of perfecting the technique for it. Not analyzing the data himself, the fact-gatherer is not spurred on by the same ob. Jectives as the research worker. His role is that of assembling material for others to exploit. W , It is one thing to recognize the W ' - relationships between fields of work; (a) W it is another to secure the coordinated 321-933 effort of those engaged in the related -Wo fields. It may arise between them in . a spirit of pure cooperation; it may be forced upon them, with doubtful results, by an autocratic administration; or it may be accepted by the parties concerned merely as a matter of ex- pediency. . securing such coordinated effort quietly and painlessly is one of the achievements of a good administrator. Proper organizational arrangements will assist greatly to this end. 
This section will discuss some of these arrangements.

Some institutions favor formally stated joint projects between departments; others favor having one department in charge of them, but the others helping out when their help is needed. Still another arrangement is to have separate project statements prepared, but so coordinated as to give results in the two related phases of the subject. Thus, for example, the animal husbandry department might outline a feeding experiment looking to providing an input-output series, and the economics department a project collecting and analyzing the economic data needed for combination with the input-output data. No doubt each of these arrangements has its advantages and special suitabilities. Where one main objective is involved, and lies clearly within the field of one research division, the second arrangement is probably the best. If there can be said to be two definite parts to the project, even though they be closely related, the first or third arrangement will work the better.

* Professor C. L. Holmes contributed portions of this discussion.

In some institutions the economics departments find getting together with the natural science departments on a joint project in which the contributions of each are properly recognized, a delicate problem. The work in economics is new. In many cases it has taken over certain more or less disputed territory, which previously has been considered the domain of various natural science departments. In not all cases are the administrative officers of the institution fully appreciative of the legitimate scope and nature of the economics work. Perhaps if an inventory were taken of the situations within the various experiment stations, there would be found what an unbiased outside party would consider an entirely inadequate view of agricultural economics work and the extent to which it needs to be handled by persons specially trained for it.
But gratifying progress indeed has been made in this respect in the last ten years.

Perhaps the most difficult problem of all is to get natural science departments to modify their methods so as to get the results in a form that economists can use. If the economics department fails in this effort, it becomes a responsibility of the director to take a hand in it.

In the case of newly developed lines of research, some important questions of organization arise. In the case of research in the economics of home economics, it is to be doubted if two entirely different departments of economics, one for the farm side of the combination and the other for the family or household side, make a satisfactory organization. A better arrangement is one department with certain members of it delegated to work on the household and family problems, these persons having received training in both economics and home economics. Another arrangement is to have some trained economists on the staff of the division of home economics who work in close cooperation with the division of agricultural economics.

It is equally to be doubted whether a separate department should be set up to provide research in the economics of forestry. Projects of this nature in land grant experiment stations should be worked out by conjunction of the two departments. The men selected for it at the start should probably be trained economists who will have the help of technical foresters at every turn, and who will gradually acquire a working knowledge of technical forestry. As soon as possible, however, a considerable number of graduate foresters of special ability should take up graduate work in economics in general departments of economics, supplemented by certain courses in agricultural economics in order to get the points of view of applied economics.
(b) Research and related functions. The first administrative problem here is that of a proper distribution of emphasis among the three major functions. In many institutions extension work has been pushed in advance of the research effort needed as a basis for it. Extension men have been appointed in many instances before any research has been done. The writer recalls a visit a few years ago from two rather eager young men who had been appointed as extension men in agricultural marketing in a state where there had been at that time not a single project in marketing research started. They were seeking suggestions and material which they might use to make their extension work effective. In another state at the present time there is approximately twice as much staff devoted to extension work in cooperative marketing as to teaching and research combined in the same field.

(1) Research and teaching. The important problem here is whether teaching and research should be in separate organizations entirely, with research carried on by men doing research only, or whether the same department head should have charge of both and both be done largely by the same staff. The latter arrangement is the more usual in the field of economics, but not necessarily the better. Some very capable teachers are poor research workers, and some good research men are poor teachers. Specialization that will take advantage of these special abilities and limitations is highly commendable. But this amount of specialization is possible without having separate administrations. It is always possible to have some of the staff working altogether at teaching or research.

However, Professor Holmes, writing on this subject, doubts whether any one should specialize entirely in research or teaching. "A teacher will have his subject-matter vitalized and his contacts with students more stimulating and effective if he is given time and funds for the prosecution of research.
A limited amount of teaching, on the other hand, tends to develop a saner point of view on research and a more vigorous prosecution of the project. It is impossible, of course, to give each research man a teaching assignment. But it would seem the part of good organization wherever possible to have each of the teaching staff directly responsible for some research project and to expect from him a reasonable showing both as to the quantity and quality from his research efforts."

It is a highly pertinent question whether one who is poor at research is really a high-class teacher. He is likely to be a popular rather than an effective teacher. Or his abilities may be wholly in elementary courses. Undoubtedly most teachers, especially of advanced courses, benefit greatly from research experience. Some of their best teaching grows out of it. More important still, if the teaching is to be vital, the teaching staff needs the close contact with problems and with conditions out in the field which research gives. Nothing is more fatal than to have a teaching staff getting all its ideas from books.

Likewise some research workers are improved even as research workers by the experience of presenting their material to students. Research is likely to be narrowing. The workers confine their activities more and more to special subjects, and presently find their usefulness in handling these various subjects greatly reduced by their loss of vision of the larger field. Teaching a few courses which keep them in touch with the whole field is an excellent corrective for this. Research workers freely testify that many of their best research ideas have come to them in the classroom.

The foregoing is especially pertinent in the field of economics. As will appear later, careful deductive analysis of research problems in economics is highly necessary at all stages, especially at the beginning and end.
Working closely with concrete data year after year has a tendency to numb one's analytical faculties. This accounts for the tendency of the research work of research specialists in economics to become increasingly fruitless in later life. That this tendency is a real one, such research workers themselves have abundantly testified. They are always looking for a chance to do a little teaching, or to get away from their work for a year of study.

The largest objection to having economics research and teaching in one department is the difficulty of getting an administrative head who is sufficiently inclined toward both. But this difficulty is easily avoided if the danger of it is sensed in the beginning. Historical reasons may have put a person in charge of a department who is poorly qualified to direct research. In such a case, the best procedure may be to have a separate head for the research.

Even the foregoing danger is slight in comparison with that of having a teaching and a research department which do not work together harmoniously. In the early stages of a department of agricultural economics, it is especially important that the research be planned so that it will provide teaching material. The chief administrative officer of an institution can bring the research and teaching departments together and keep them there, but it may take constant vigilance.

(2) Research and extension. It is probably more common to have the extension work in economics under a separate administration than it is the research work. Most extension directors feel that they must have a staff of men of their own at all times available that they can shift from point to point in the state at will. They also like to have a staff entirely their own for the sake of the better esprit de corps that is possible and so that they can infuse it with one set of ideas and points of view.
If the members of the extension staff are also members of subject-matter departments, they are likely to have strong ideas that they get from their associations in these departments, and the result may be a less harmonious program. Of course this latter would not seem a disadvantage to a broad-gauged extension director.

The strongest case for an entirely separate extension staff is usually made on the score of specialization: that in the first place a man with special types of ability and personality is needed, and in the second place, that years of experience are needed to make a really good extension worker. It is commonly said that no one should be employed as an extension specialist who has not been a county agent for several years.

There is also a strong disposition for some extension divisions to be sufficient unto themselves in economic matters, and for the director in particular to assume that he is fully competent to follow what is going on in the economic sphere and make the proper interpretations of current developments. An economics department which confines its research entirely to pure research or to social problems leaves the extension service in the position where it has little else to do but attempt to follow and analyze current developments itself. As already pointed out, inadequate support may make it advisable to conduct the economic research along these lines. But any land grant institution which forces its economics department to follow such a policy is extremely unwise.

No doubt in many cases the extension division is partly responsible for such a policy. If it recognized that its proper function was limited to pure extension, and that a properly sustained research staff in economics could do a much more thorough and scientific job of collecting the necessary information, and certainly make a sounder interpretation of it, it would insist that the administration give proper support to the research staff in economics.
That such is the ruling opinion among extension leaders is demonstrated by the language of their recent report.*

* See page 1.

One of the reasons advanced against such a policy by some extension directors is that the research staff is not closely enough in touch with conditions in the state to make the proper interpretations. The answers to this are three: First, a properly supported staff would be able to keep in touch with conditions in the state.

Second, to obtain this information the research staff would not need to be travelling constantly as are the extension men. They would collect it by more scientific methods than sketchy interviews with people encountered in the field. Economic information cannot be observed growing in the field. It cannot be obtained from the opinions and attitudes of a few individual farmers or bankers. Even the county agent is often a very unsatisfactory observer of economic conditions. He meets farmers every day with reactions as far apart as the poles. The choice which he makes between these reactions may easily be highly personal in its nature. County agents in adjoining counties often give greatly varying reports of economic conditions. What is wanted instead is records of conditions, and these need to be collected under scientific supervision. The place to get many of the records is not at the farm, but in the market. Others of them can be obtained by an adequate reporting service. This is not to be interpreted as saying that the research staff never needs to get out into the rural districts; but most of its travelling will be in connection with the surveys which should be constantly in progress. One of the things that needs to be surveyed is attitudes. What farmers think, and what they know about economic matters, is basic to any effective extension program.
The first maxim of pedagogy is to proceed from the existing or known to the related new or unknown.

Third, what strictly current information the research staff needs as to farmers' attitudes and the like, it should be able to obtain from the extension workers if there is proper contact between them and the research staff.

Another common arrangement for extension work is to have it done by people who are also members of the staff of departments of economics, but who give all or part of their time to extension activities. For the subject-matter which they extend, they are responsible to the head of the department of economics; for their arrangements in the field, to the extension director. This means that the extension program for the state is worked up by the regular staff of the land-grant institution, primarily by the heads of the departments, in council with each other and the special administrative staff of the extension department. The advantages of this plan are as follows:

1. The extension program is likely to be better founded and conceived because the strongest men in the institution, who are at the same time in close contact with the research work, participate in working it out. Under the first system described, the extension director seldom has enough help from the economics department. Too often he does not feel that he needs it.

2. Incidentally thereto, the research and extension programs are brought into line with each other. It is possible to accomplish this under the first plan of organization, but it is seldom achieved.

3. The whole institution is behind the program because all are fully informed concerning it, and all have helped formulate it. Under the first arrangement, the teaching and research staff often know very little about what the extension staff is doing.

4. The teaching and research men, in particular the heads of the departments, are kept in constant touch with conditions and people in the state.
Even if they do not get out in the state on extension work very much themselves, they have their offices next to those who do and are in frequent discussion with them. This makes the teaching and research more vital and more responsive to the needs of agriculture. At the same time, the extension men are benefited greatly by frequent contact with the teaching and research staff.

To make this second plan more effective in this latter respect, it is only necessary to mix extension, teaching and research a little and send members of the teaching or research staff out on extension assignments occasionally. It is highly desirable that the men sent out on important assignments should be the strongest and best informed men in the institution. Economics is so easy a subject to go astray with, and economic questions are so constantly in the foreground, that this becomes a highly important consideration.

5. This leads to perhaps the most important advantage of all, that under the second plan the men who come to stand in the eye of the people of the state as representing the different fields of agricultural endeavor are the strongest men in the institution, those who either engage in research themselves as part of their work or are in close contact with it.

In contravention of the foregoing, it may be stated that the extension department may have specialists in economics and other fields who are also strong men. Conceivably, yes; rarely so in practice, and for rather obvious reasons. First, there is not room in most land-grant institutions for three strong men, one to head teaching, another research, and another extension. If one strong man is available for the three combined, the institution is fortunate, unless the institution is one of the larger ones. There are reasons why the teaching and research men are likely to be the stronger. Extension work in the past has not attracted its proportion of really able men.
Most of the men of ability in educational work today have been attracted to it because of the fascination which scholarship has for them. Extension work has thus far discovered no incentive which is equal to that of scholarship for the usual person of ability. Those special fascinations which it does possess are probably better rewarded in business, politics and journalism.

Second, the experience of being an extension specialist does not develop strong men, in the sense in which this term is used in the scientific world. A considerable amount of such experience is recognized as very valuable. But in the end it leads away from that careful study and close analysis of problems which alone makes for intellectual growth. There is a tendency for many extension specialists, as the years go by, to pride themselves more and more on being "practical", and to think more and more frowsily, until they are scarcely distinguishable intellectually from the common herd. This is especially easy in the field of economics, where loose thinking is so easy and so common. That which will save them from this fate is constant close contact with research workers, supplemented if possible by periods of study and actual teaching and research work.

As already pointed out, extension work these days is more a matter of working out programs and organizing them and sitting in council. The abilities and training required for this are much more nearly like those of teachers and research workers than those required for the old-fashioned farmers' institute type of extension specialist.

It will be freely granted that men who are to do extension work regularly should have had county agent experience if possible. This is just as possible under the second plan as under the former. There are hundreds of experienced county agents each year looking for a chance to get into graduate work.

The importance of the foregoing considerations is being increasingly recognized.
One way of trying to meet them at institutions where the experiment station and extension staffs are separate is to arrange regular weekly or monthly conferences between the two staffs. More and more it is coming to be the practice at such institutions to assign station men to an occasional appointment in order not only that they may keep in contact with the farmers' interest in the work they are doing, but that they may, in some cases at least, lead in a demonstration of the use of new material. At one institution, plans are being worked out by which the extension workers will occasionally be assigned for a certain period of uninterrupted time to actual investigational work on the projects of the experiment station staff.

(3) Research and regulatory activities. Should research that relates to regulatory functions be delegated to separate research agencies or administration? No general answer can be made. Analyses of the fundamental sort should usually be turned over to separate scientific agencies, such as the research staff of the department of economics of a land-grant station. That which relates closely to current administration needs in most cases to be done by research men employed by the regulatory staff. A very desirable arrangement is to have a capable research man as one of the regulatory body. He will know what it is best for his staff to undertake and what is best to handle in cooperation with separate research agencies.

It is extremely easy to be unfair to regulatory bodies when they refuse to open their records to outside agencies, even to departments in agricultural experiment stations. If they knew the limitations of our departments of economics as well as we ourselves know them, they would be even more unwilling to offer the use of their records.
There should first be a high order of assurance that the proposal is well conceived, that the research men are competent to execute it, and that the results will be carefully reviewed before publication.

If regulatory bodies would only keep adequate records of their own activities, research in their problems would be greatly aided. Frequently it is only when some competent outside agency begins work on their problems that they discover the weakness in their records. This is often true even when there is a "research department" in the organization. By its records, you shall know how fully a regulatory body appreciates the research aspects of its problems.

(4) Research and fact-gathering. Overcoming the difficulties in this field is a matter of organization. Arrangements can be worked out by which the research workers and the fact-gatherers are brought together and made to understand each other's objectives and problems. The fact-gatherer should be made to understand that he is an important agency of economic research, and should be given much more credit and recognition in publications than is ordinarily the case. The difficulty that the data are not gathered with broad enough objectives can be met by holding conferences of interested groups. This might also win larger support for appropriations for such work.

Developments in the field of gathering current data are only at their beginning. Each new step forward in research reveals new data that should have been collected since the beginning. That the data have not been collected is often more the fault of the research workers than of the fact-gatherers. The research workers have not known what they want. They do not yet know what they want or have not yet worked out satisfactory measures. But departments of agricultural economics know of a great deal of data which in most states are not now being collected, or are not collected in sufficient detail.
These include data of local prices, of land sales, of rents, of taxes, seasonal marketings, movements of population, of unemployment, disease infestations, amounts of labor hired and feed and fertilizer bought, sales of machinery and building material, bank loans, interest rates, rate of payments on mortgages, land improvements, yields, acreages of minor crops in trucking areas, etc. No doubt the biggest problem is making arrangements for collecting much of this information. The first line of attack seems to be to expand the cooperative crop reporting services of the states by securing more liberal support for them.

(c) Research within one department. There is also the question of the interrelations of the various problems within the field of economics. Where, for example, does a marketing project leave off and a production project begin? What shall be the division of effort between the man studying rural finance and the one studying farm management with reference to such problems as farm investment, mortgage policy, and the like? These and other problems of a similar nature confront the man who finds himself responsible for developing and carrying out a well-rounded research program in agricultural economics designed to yield a mass of product for teaching, for extension, for the guidance of business executives and farmers who are looking to us for aid, and for the general enrichment of the body of information in this field. It is no doubt easier for two or more persons in one department to cooperate than for separate departments to cooperate. It is easy to publish the results of each joint effort in such a way that credit is received where credit is due. It is also easier for administrative authority to exert itself unobtrusively and lay out the work in such a way as to secure coordination. Nevertheless, friction is not altogether absent even within departments.

4. Accrediting results. The first question to be considered here is what is authorship of a research study? It surely includes both the analysis of data and the preparation of the report of it. It is rather hard to separate these two. A man may do a considerable amount of analyzing and yet write not a paragraph of the document in which the analysis is presented. On the other hand, it is hard to see how the man who actually writes up the material can avoid participating in the analysis itself. Composition is an essential step, the final one, in analysis.

The mere process of fact-gathering, freely participated in by junior members of research staffs, by field men and sometimes by mere routine workers, can hardly be construed as authorship. That their work should be acknowledged at least by footnote reference is obvious. But often there must be a close relationship between the gathering of the original data and the analysis and presentation of it. Much of the significance of the raw material of certain studies must be conveyed through the impressions received by the field man or other original investigator. Where such is the case, he should be credited for exactly this sort of contribution.

A more difficult question is whether the chief or assistant chief of a research section, for example, shall put his name upon a publication merely because he has conceived the problem, has edited the manuscript with more or less minuteness, and given general direction to the prosecution of the study and the preparation of the manuscript. It seems to the writer that this should ordinarily be answered in the negative. On the whole, it is probably the part of wise administration to be as liberal as possible in the granting of credit to those who are taking care of the details of the work in responsible fashion.

In this connection also arises the problem of joint authorship between members of the same staff.
It seems the part of wisdom to keep authorship as separate as possible, particularly for the sake of the young and inexperienced research worker whose reputation is yet to be made. The appearance of his name along with that of a man who has already published may mean but little to him in the way of recognition. The credit in the mind of the reader will almost invariably go to the senior author, the man who has already demonstrated his ability to study and to write. It is frequently possible to divide the results of a comprehensive project on which several men have worked into separate publications, each bearing the name of the one worker who has contributed most to it.

Where separate departments work on the same project on a joint basis, only some form of joint authorship will meet the situation, unless separate publication seems possible.

Relationships to research in other institutions. Besides the relationships to other departments in the same institution are those to outside institutions, to other land grant stations, to state departments of agriculture in the same state, to the U. S. Bureau of Agricultural Economics, to research institutes, etc. First of all is the possibility of a vast amount of duplication. Second is the possibility that one agency will attempt to do something that another is better placed or organized to do. Third, that important gaps will be left in the research on certain large problems, or the converse of this, that agencies well situated to handle certain phases of problems will not be called upon to do it. Fourth, that there will not be team work on problems requiring it or benefiting greatly from it.

Let us first illustrate this in the case of the land grant stations. If a large group of experiment stations in the Cotton Belt all undertook to analyze the broad national problems of orderly marketing of cotton at the same time independently, there would be a great deal of duplication.
They would all be seeking about the same sources of data and going through about the same analytical steps. If, however, one of them best situated for it worked on the national phases of it, and the separate cotton states could be interested in collecting and analyzing data relating to important local aspects, then there would be no duplication, and in addition a proper division of labor with probably no gaps in the program. If the several stations were in addition to plan their local attack so that the data could be combined, then there would be team work in addition.

Stations with central markets close at hand can carry on much of the research in these markets more economically and understandingly than those at a distance. However, for them to be responsible for all such research is altogether too much. Moreover, in some cases it is highly advisable for those working on marketing problems in the producing territory to follow the product into the central markets to see how it is handled there.

Duplication is less likely in economic research than in most other fields. Local differences are often one of the important aspects of economic research. A study in one state often needs to be repeated in several others in order to bring out all the aspects of it. Thus, for example, the organization and management of country grain elevators needs to be studied separately for regions handling mostly wheat, mostly corn and oats, or different combinations of coarse and bread grains; also for regions where handling supplies is an important adjunct; and for regions where buying and selling, hedging and storing practices are different. Studies of intensity in the use of factors of production need to be repeated in all important areas and farming systems in which a given crop or kind of livestock is raised.
Out of these numerous studies will finally emerge larger generalizations that will be of use to all.

On certain new and difficult problems, such as some of those in price analysis, it may be excellent general research policy to have several agencies working at the same time. Competition has its value in research as in other fields, if it is not overdone. One group of workers is likely to have a particular approach so clearly in mind that it overlooks the right one.

The relationships in research with the U. S. Bureau of Agricultural Economics are too generally recognized to need much discussion. The most important aspect of it relates to division of labor, to having the Bureau handle the phases of it that it is best qualified for because of access to important data, or facilities for collecting it. Any project requiring data over a large part of the United States needs to be carried on by or in close alliance with the Bureau. Likewise projects requiring foreign data. The other important aspect of this relationship is that the Bureau is in the best position to sense many of the larger research needs and hence to assume leadership in getting projects relating to them under way. The Bureau is also able to help many of the newer and smaller departments with research methods.

Other aspects of this problem have been sufficiently discussed under an earlier head.

Relations with [illegible]. Cooperation with such agencies as institutes of research is largely a matter of division of labor in the field of research, already discussed under another head. Some of the broad problems upon which research institutes work may lend themselves to cooperation with state institutions which are in close contact with certain aspects of them.

Some of the most difficult problems of joint effort in many states center around state departments of agriculture or markets.
The statutes creating such departments sometimes limit their sphere of activities so as to exclude research, but usually they do not. Even if they do, there is nowhere any definition of research which is so clean-cut that it can be used for settling questions of boundaries. State departments need research for their regulatory duties even if they do no research for its own ends. All that has been said earlier about the relation between research and regulation applies in this connection. In view of this, it is highly desirable that cooperation be arranged whenever an urgent problem presents itself in which both agencies are interested. In other cases, the simplest solution of the problem is always to take up with the other body any new venture in research to make sure that there are no overlapping plans.

[Side-heading illegible.]* The preceding sections have discussed the relationships between the research activities of various agencies and the needs and circumstances of joint effort in research. This section will undertake to show how this joint effort can be best realized. The conclusions are based on the experience of a considerable number of workers who furnished reports and suggestions.

In general, there are four bases upon which joint effort may be arranged: (a) coordinate departments or institutions cooperating on a more or less equal basis, (b) one department or institution in charge of a given project, but calling upon others for related phases of it, (c) an outside agency, such as the U. S. Bureau of Agricultural Economics, acting as leader in the project, (d) a superior administrative authority exercising a sufficiently detailed and rigorous control over the whole project so that the various phases of it, although handled by different agencies, are held in line with each other. Which of these four works most satisfactorily depends upon the personnel, the project, and the agencies.
Cooperation is the loosest of these arrangements. In general, pure cooperation will work when the common end to be obtained by joint action weighs larger in the interests of the workers than the individual ends to be gained by competition. One may recognize that certain important projects require joint action; but there is always the possibility of finding some project less important socially or scientifically which may yield larger returns to the individual performers, personally or professionally. In its pure form, cooperation assumes agencies on the same plane working together on a basis of equality and sharing in the results in proportion to their contribution. In its pure form, however, it seldom exists. What passes for it instead is having an agency take the lead in the project and the others helping out whenever asked to do so, credit being given in proportion to the contribution. The next step beyond this is when an organized attack is set up in which one agency or person is definitely in charge of the project, and the job is definitely divided into tasks which are assigned to the "cooperating" agencies. It is obvious that as soon as one person or agency steps into the role of director, the basis of pure cooperation is lost; the cooperators begin to look upon the project as somebody else's in which they are helping.

* By Professor C. L. Holmes, Iowa State College.

The specific problem before us is the exact procedure which may best be followed in formulating a project in which two or more cooperating institutions are expected to participate. How go about it to have each of the cooperating parties make its best contribution to a common enterprise? How avoid having the project made one-sided or inadequate by one or another of the parties failing to assert its due amount of weight in the councils?

One of the first objectives in the getting of a good project is leadership.
The first question asked should therefore be: "Where is the best leadership for this enterprise, either within our own staffs or elsewhere? How can such leadership be brought into service, not only in the actual carrying out of the project but in its formulation?" Several of the correspondents stressed this point in their replies.

One of the first things implied in the development of a project of research on a cooperative basis is that of conferences. Conferences, if rightly conceived and carried out, with the background of proper preparation, are absolutely essential to the development of successful cooperative programs. On the other hand, much valuable time may be wasted and poorly conceived plans may result from conferences not properly prepared for and carried through. To be successful, conferences must be limited in scope and have their objectives clearly defined in advance. It is also important that each participant know definitely in advance the specific things to be considered and devote some real thinking to the problem before he enters the conference. An exchange of ideas almost invariably results in a clarification and a more comprehensive conception of the given problem than could result from the work of a single individual. On the other hand, the expression of vague notions and ideas is a most effective means of wasting time.

Many of the correspondents expressed themselves in favor of conferences. To quote from a Middle Western department head:

"Policies and programs should be worked out by conferences attended by representatives of each state and of the Federal Bureau. If necessary for such conferences, the Federal Bureau should provide travel funds and bring representatives of states together at convenient points for these conferences. This appears to be necessary since many states either restrict or prohibit out-of-state travel by their station workers."
In states in which little work in this field has been done, it has been necessary that the Department of Agriculture assume leadership and furnish the initiative. Many times this was necessary because the Department of Agriculture had a program of its own, the adequate carrying out of which involved work in states in which but little or no work had been done and in which, for the time being, adequate leadership was not available. There is no question but that great good has come from such a procedure, but occasionally in the correspondence shots of criticism are detected. Probably the point in the following is well taken:

"The only criticism that I would make of such cooperation is that when someone in Washington gets an idea of a certain kind of work, this work is started all through the country because it is subsidized and receives a prestige based on finances rather than on merit. It then commits a lot of states to a lot of work which may not be the most important work in that state at that time. Twice I have seen the cost accounting fever run through the country. If I recall correctly, practically all of the many cost routes in the country were in part at one time supported by the Department of Agriculture.----- Only one or two states were doing this kind of work without a subsidy.*----- Another objection to cooperation is the fact that plans in Washington change so frequently that work is no more started before some other plan develops."

It should be added that with the expanding needs of the outlook work of the Bureau of Agricultural Economics, and the increase of state funds through the passage of the Purnell Act, the federal Bureau has wisely withdrawn most of its financial support to state projects, particularly to those localized wholly within a state. But there is still possible the same participation in planning of research.
* Dr. E. W. Allen's comment on this statement was that "It is only fair to point out that the activity of the Department was to a considerable extent in response to pressure from the states and Congress", that "the states had the cost-accounting fever too."

During the last ten years, which period has witnessed the most of the growth in cooperative agricultural economics research between the Federal Department and the several states, the Department leaders sought to systematize their cooperative relations in several parts of the country by the establishment of the so-called "research councils." The most outstanding example of this type of cooperative procedure is the New England Research Council. This Council is composed of representatives of the agricultural economics work in the various New England states and maintains a central office in Boston, which is in charge of a representative of the Federal Bureau of Agricultural Economics, who is designated as its secretary. According to a letter received from the present secretary of this Council, the procedure in outlining work is as follows:

"The policies and program of the Research Council are formulated by the following steps: First, they are discussed with the research specialists in the various colleges and by representatives of the Bureau of Agricultural Economics. Second, a program is drawn up in a tentative form by the secretary of the Council. Third, this tentative program is again discussed with the various interested parties and changes are made from time to time. Fourth, this program is presented at the annual meeting of the Council which is attended by most of the research men in the Economics Departments of the colleges in New England."

Both commendatory and somewhat critical comments have been received as to the success with which this Council has realized its objectives. One criticism was to the
effect that the work suggested had been of a superficial rather than fundamental order, that the conferences had been too brief and had gone too little into the details of the problems which it was proposed to study, and that the publications which have resulted from the series of projects which have been carried out as a result of the work of the Council lack many features of good research publications, and include certain recommendations and conclusions not warranted by the limited facts obtained in the investigations.

In contrast to the foregoing, another leader from the New England states speaks in highest terms of the effectiveness of the New England Research Council. He says:

"Plans and possibilities of cooperative action and the possibilities of coordination of activities of different states are discussed in the meetings of the Council. It does not foster projects in the sense that it attempts to superimpose regionwide projects on the members of the Council. It is not analogous to a highly centralized cooperative but rather is comparable to a cooperative organization of the federated type with control largely if not entirely in the hands of the constituent members and with the Research Council working along those lines which the members agree in conference are lines to be followed. By general agreement, however, if the executive secretary of the Council knows that three or four states are particularly interested in a definite problem - we will say milk marketing organization or taxation - and if he finds that it would be agreeable to the workers on these projects or the administrators in charge that a conference be held, he is able to bring such a conference about, usually at the suggestion of one or more of the interested parties.
"The executive secretary, as a competent representative of the Department of Agriculture, is also able to act as a liaison officer between the Department and each of the states or all of the states interested in a particular project.

"It is not the policy of the Council to foster research projects or to try to enlist the states in particular projects. Where the representatives of the states in Council meetings are agreed that a particular project is likely to be developed in each of the states and that the project would profit by an inter-state division of labor - as was the case in the apple studies and in the dairy studies - then the executive secretary is instructed to give all possible assistance to help coordinate the work of the different states along the general lines of the policy laid down by the Council.

"The reasons for success of the Council, I would say, are as follows:

"1. A definite spirit of cooperation exists among the members who recognize the advantages of cooperation and coordination of effort.

"2. We have been very fortunate in having executive secretaries who recognize the necessity of each state's maintaining its administrative autonomy but at the same time who recognize the value of such an organization as the Council.

"3. The Council places at the disposal of all its members the facilities of the Department of Agriculture which it has been difficult for us to command particularly because of lack of close, first-hand contact with the Department."

For a few years, an organization similar to the New England Research Council was set up and maintained in the Middle West with an office at Chicago. Later another was set up on the Pacific coast. Both have been abandoned. No doubt an important reason for their failure is that the states in the West are large enough to be able to operate by themselves more effectively, so that they feel less urge to cooperate.
For several years, the Department has maintained the Food Marketing Research Council in New York City. The objectives of this organization are in general similar to those already outlined, but it restricts its interests to a narrower range of problems and to a narrower area.

There is little doubt that the federal Department may be the most promising agency for leadership in getting the various state interests together on a research program covering fields of work which obviously are regional rather than merely state-wide in their scope and ramifications. One state man writes of this leadership as particularly evident in a type of study in which grape growers, bean growers, apple growers, vegetable producers and poultrymen are in competition with each other in several adjoining states. Production, marketing and other studies need to be made simultaneously in the different competing areas so that farmers in the different areas may adjust themselves more quickly to competitive conditions. In some cases, this means that a certain area will discontinue producing a certain product. If it is to be driven out, it is to the distinct advantage of farmers to know that at as early a date as possible. Another point is that certain problems cannot be studied in their entirety within the borders of a given state. For this reason, studies limited to a single state are often misleading and futile. Another state man has this to say on the point:

"While I was in the Department of Agriculture and had an opportunity of observing the farm management work in different states, I was deeply impressed with the possibilities of attacking problems on a regional basis on the assumption that a great many economic problems have no relationship to state lines, but are confined to productive areas, and that much greater progress could be made as far as both research results and extension activities are concerned if the problems could be attacked on a regional basis."
The Department is to be warmly commended for its earnest and persistent efforts to promote cooperative research. On the other hand, we should frankly recognize the difficulties involved. State lines seem to be realities of a most stubborn nature when it comes to getting together in affairs involving research and educational activities. One man says:

"The individuality of workers in institutions is too great. There is a frequently sub-conscious attitude of hostility in the way of direction or coordination in a department."

This section of the handbook can consider no more important subject than that of working out effective means of cooperating with one's neighbors in the development of long-time programs of research which have to deal with problems common to several or many states in all of which commendable efforts are being made to study the economic problems of the state's agriculture. The situation is not as hopeless as might be inferred from the very limited success with it thus far, but it is nevertheless difficult. Cooperation in this as in all things first has to be learned, and then it becomes a habit. The conservatism which seems to be manifested in many states is not entirely without justification. Inevitably in a field so new as our own, many false starts have been made. Ideas and policies in research have changed rapidly. Too much of the research of the last ten years in this [...] the field.

One type of cooperative effort relates to regional surveys preliminary to regional programs. The Department of Agriculture has promoted or shared in nearly all of these. Some of them have already yielded results in the way of publications. Others have not yet reached that stage. The correspondence contains a number of references to such work.
The following is quoted from a state leader whose section is participating in the Northern Great Plains regional study:

"The particular study which you refer to is the one which has given me personally most concern, namely, 'A Study of the Northern Great Plains Agriculture.' I am not at the present time satisfied with the organization of this project and I think that this is pretty much the judgment of the other states concerned. I have stated that it seems to me this is partly a project in studying how to organize and administer regional projects, and we have a number of details to work out in that connection......... The difficulties, however, are not from undue dictation by the Federal Department - if anything it might be considered a little bit the other way. I think we might put our finger on the trouble by saying that no one person is sufficiently responsible in this project and the duties of the various individuals are not sufficiently clearly defined. Also, some of the mechanical details of administration need still to be worked out.

"This project was first taken up as a reconnaissance survey and in the conference following this survey the plans for the long-time project were developed. We have no criticism of the opportunity given the various states to express their views of the various problems. It is just possible that there are too many people concerned in this particular project in that it involves both the farm management and the animal husbandry sections of the six cooperating agencies, which means twelve people besides the station directors to be considered."

This particular comment stresses the point already emphasized, namely, that in any such cooperative program the first essential is able leadership, with such leadership well defined and understood by all the parties.
Another of the state leaders has this to say with reference to the same project:

"It is my opinion that the Bureau at Washington has greatly improved its methods of procedure and its relationships of late; in fact the work as now conducted makes possible a greater volume of output than formerly and has the advantage of pooling the brains of the Department with those of the States, which is desirable in my opinion."

Following is a report from another state on a regional study of cooperative elevators:

"This project has worked out, I believe, very well. Each state has attacked the general problem and taken up one or two specific problems for intensive study within its state. I believe there have been annual meetings of the research workers on the project, and I know we feel that as far as our state is concerned it has been very successful and the interstate and department relationships have given us far more returns for our research dollar than would have been the case had we been going it alone. It has been my observation that the principal interest of some states in the work of the Bureau is to get money out of them to supplement their own state funds, and that after agreements are drawn up supplying this money, these states somewhat resent further supervisional activities on the part of the Bureau. It is my feeling that both the work of the Bureau and of the states has passed the promotion stage and that cooperative projects should be based primarily on a frank recognition of the contribution which each party is in the best position to make."

Another state man discussing cooperative relations with the federal Department of Agriculture says:

"In the formulation of projects there is free discussion and cooperation between the leaders of the project representing the various interests concerned.
Frequently the man in charge of the project from the United States Department office spends two or three days or a week in the state in consultation with the state project leader. Together they work out the details of the projects and prepare the final project statement. In some instances the leader of the project in the state visits the Washington office for the purpose of formulating a project. Occasionally the project is formulated without a personal conference. In such cases there is free correspondence, criticism of project proposals, and reconstruction."

Also the following:

"We have had both cases, those of projects worked out in conference with state workers and projects formulated by the Department and presented as finished plans to the state. I believe both of these plans have their place, though, of course, in a majority of cases the plan of conference with the state workers is to be preferred, I think. The Department can be of great assistance by serving as the means of bringing together workers in a given region for the working out of both projects. It is somewhat awkward and difficult for one state to take the initiative in this connection unless it can do so through the Department.

"I would say that it would seem to me that the states having in mind regional projects should communicate with the Department and possibly at the same time with other prospective cooperating states. The Department might well then feel out the sentiment of the states which should logically be concerned and, if this sentiment proved favorable, arrange for a meeting to plan the regional project. The direct supervision of the project should at that time be specifically placed either with the Department or with some one of the representatives of the cooperating states."

An important phase of project formulation is the embodiment of the essentials of the agreement in a formal project statement which stands as a sort of contract between the cooperating parties.
The purpose of this statement is, of course, to serve as a memorandum of general objectives of the project and of the procedure which is to be followed in the carrying out of it. It has been the custom in the past to pay a considerable amount of attention to the drafting of this statement. Doubtless this is wise, inasmuch as it leads to more careful consideration of the fundamental considerations lying back of the statement.

A very significant movement in the direction of cooperation between state agricultural experiment stations and private and semi-private organizations has been developing in the last few years. The more conspicuous work along this line has centered around the relations of experiment stations to farmers' organizations such as the Farm Bureau, the Grange, and the cooperatives or federations of such cooperatives. There has also been a limited amount of research work developed between experiment stations and private commercial business. In many cases these cooperative relations have been extremely helpful, but in some other cases danger has appeared that these relations may lead to embarrassment and to injury to the long-time interests of research. Obviously very much of the success or failure of these relations depends upon the cooperative procedure which is employed in working out the joint program with such organizations. One or two basic considerations should continually be kept in mind. The first of these is that none of these organizations exists primarily for research. This is not their special field of endeavor. It is only in exceptional cases that anyone in such organizations has an adequate point of view on research. It is therefore important that the leadership and guidance of such problems be retained by the officials of the experiment station. The other point is that the experiment station must guard itself against any charge of lending its effort to propaganda for special interests.
That this is not easy is proved by recent experience with the power interests of the country.

The following comments from correspondents bear on the foregoing observations:

"We have done a large amount of informal cooperation here. The Farm Bureau Committee on Marketing goes over the work of the department once a year and makes recommendations concerning the research work in the fields in which they are interested. Such recommendations are welcomed so heartily that farmers and farmers' organizations have no hesitancy in consulting and recommending. Much of our research work is planned after such consultations."

Also the following from a leader in one of the northwestern states:

"We conducted two marketing surveys in this state on which we relied on cooperation from the Commercial Clubs. We asked them to contribute both clerical work and enough money to show they were vitally interested. The chief asset in this experience was that it gave us a much better working basis with wholesalers and retailers than if we had tackled the job alone. We had the Commercial Club appoint a committee which was made up principally of broadminded wholesalers and retailers, who were intelligent enough to know what we were after and that we were not spying and snooping around dark corners with the idea of bringing out some radical revolutionary program. In each step in this procedure we held a conference with the committee and we would wait a week or ten days before following up the line of work which was discussed by this committee. By so doing, we had complete access to the accounts and records of a number of firms that would never have given us anything but a pleasant smile had it not been for this action."

Another leader apparently has had a somewhat different experience.
He writes as follows:

"It is not so satisfactory in my judgment to tie up with commercial organizations or with state political departments.----- There always is a danger that the leader of the commercial organization will try to gain a reputation for himself or that the state department will try to capitalize the undertaking in a publicity way long before the time for definite conclusions has been reached."

Those who answered inquiries on this subject had some rather definite suggestions as to the procedure in developing research methods on a single project or in a program. Dr. Warren took the individual rather than the cooperative side of the question:

"Most good research work is personal rather than committee work. Occasionally two men naturally work together and accomplish more than either one could alone, but ordinarily the best way to get research work done is to choose a good man and let him alone, giving him only such advice as he asks for. I do not favor trying to have work in different areas too much alike..... I think each man should be encouraged to use his own ingenuity and carry on as complete and extensive a study as he is capable of doing rather than take any stereotype form and try to use it everywhere. I cannot conceive of a good research man being willing to allow anyone else to completely dominate his work, altho any good research man nearly always welcomes suggestions from any one."

Here again the necessarily close relation between the federal Bureau of Agricultural Economics and the various states is stressed by a number of men. Professor Grimes of Kansas says:

"The federal Bureau can perform real service in helping to develop new and better methods of research. This is particularly true in the farm management field where we are very much in need of a less expensive method to replace our statistical routes.
When a problem of developing method is involved, the federal Bureau is warranted in cooperating on problems that are distinctly local in character and pertain to regions entirely within the state."

The secretary of one of the research councils says:

"Methods of research are discussed with the secretary of the council from time to time and are also presented by specialists of the Bureau at annual meetings."

Another federal worker states as one of the purposes of the research council movement:

"To insure desired comparability in the collection and subsequent interpretation of research data, the Bureau is by nature of its national aspect in position to aid in bringing about the desirable end."

Most research workers would agree with Dr. Warren that there is grave danger to the progress of research in too much comparability along these lines.

Further comment on the function of the Bureau in this connection by a state man is as follows:

"In developing methods of research there has been joint agreement on most lines. On two or three occasions the men in charge of the research work in the state have been called to Washington at the expense of the Bureau to sit in a two or three day conference on methods of securing data, in the interpretation of data, and in developing plans covering methods of field work and of handling data."

Another state man says:

"The work which is being done in developing methods of research will, I think, be the most helpful, at least to us here. I have a feeling that in our experiment stations there is considerable wastage through use of inadequate methods."

To sum up on the matter of cooperative procedure in developing research methods, it would seem that the first and most important consideration here, as in other phases of our general problem, is that of bringing into effective functioning the best leadership that is available in the field.
There is every reason for keeping all workers fully informed as to the methods that have proved most effective in the hands of our best thinkers; but there is grave danger that ill-considered imitation may appear in the form of applying new and promising methods by inexperienced men to problems for which such methods are not adapted. Cooperative effort in the form of frequent conferences is no doubt to be encouraged; but the task of developing method is more largely an individual rather than a cooperative one. Method results very largely from experience, and experience develops primarily in the form of individual effort. It is probably true that practically all of the important advances in research technique in our field during the last ten years have come as the result of a long period of experimentation and study on the part of a few individual workers who perhaps from time to time have compared notes formally or informally. Impetus to the development of a given technique, such as the application of correlation methods to economic problems, has been given through the publication from time to time of pioneer pieces of work.

One specific proposal along this line is for the exchange of results among research workers prior to publication. This proposal has some ticklish aspects, inasmuch as research workers are sometimes over-sensitive and over-anxious in the matter of securing adequate credit for new ideas. However, this should not stand in the way of the more rapid progress of research which should come from a free exchange of ideas in the various working stages of the carrying out of a project. This is pointed out admirably in the reply to one of the writer's letters of inquiry: "I believe that there is great need for making available to research workers unpublished manuscripts which may report careful work and certain unpublished materials which might be of interest to other workers."

3.
In carrying out joint or parallel projects. As a usual thing the cooperative procedure in carrying out projects is worked out at least in its general outlines and specifically noted in the project statement. At best, however, these statements are somewhat general in nature and there is always the further problem of the detailed working out of cooperative projects in terms of just how much time this or that worker shall devote to it and who shall have the work of analysis and of composition looking toward publication of results. On cooperative projects with the Bureau of Agricultural Economics where field work is involved, these arrangements have been well worked out and need little discussion. Where a number of field men are employed, some sort of supervision is needed. This is sometimes supplied by the Federal Bureau and sometimes by the states, depending very largely upon the extent to which the state has already developed research work and has adequate leadership. Cooperative procedure in the analysis of data and its preparation for publishing is not, however, so satisfactorily developed, as evidenced by the following from a letter from a state leader:

"Where projects are conducted cooperatively by the federal Bureau and experiment stations, much of the data has been analyzed in Washington. As a consequence the state workers have lost all contact with the analysis of the data. In my opinion this is wrong, as the state workers lose contact with the project and do not get the benefits accruing through having to face the various problems that are bound to arise in the analysis of the data. Some arrangement should be perfected whereby representatives of the states and of the federal Bureau could work on the analysis. It would be equally as undesirable to have the states do all the analyzing to the exclusion of the workers from the federal Bureau. This point presents problems that
are not easy of solution, particularly where a state has limited personnel and their time is divided between teaching, extension, and research. However, it is believed that a much more satisfactory arrangement could be arrived at than has been followed in the past where the data for many projects have been collected jointly and then analyzed in Washington to the exclusion of the state workers."

No doubt the cooperative arrangement worked out for the analysis of the data has been responsible for unfortunate delays in publication. Both the federal Department and various of the states have doubtless been at fault in this condition. Many of the states have inadequate provision for clerical help, and their professional staff is held for too heavy a load. The writer has in mind a project carried on cooperatively by a state and two Bureaus of the federal Department, the suggestion for the cooperative work coming from the Department of Agriculture. The field work involved 3000 farmers, which meant the work of a crew of six men through a period of about eight weeks. The field schedules were taken to the Department of Agriculture for analysis. There developed some difference of opinion between the two Bureaus of the Department as to the feasibility of publication. The result is that this rather heavy investment which was made in 1923 has thus far yielded no tangible results either for the state or for the Department. Undoubtedly conditions will arise under the best of cooperative organization which will cause occasional miscarriages of this sort. Cooperative procedure in the future, however, should look toward the elimination of them as far as possible.

Perhaps the best way that has yet been hit upon for the analysis of material obtained from cooperative research projects is that which has been developed in recent years by the federal Bureau. It is that of sending into the state, upon agreement with the state authorities, a
federal worker who will stay in residence for some weeks or months cooperating closely with the state men in the analysis and preparation for publication of the results of the joint project. There is little doubt but that the resulting publication has been stronger and that the benefits accruing to both of the cooperating parties are vastly greater than if one or the other had undertaken and carried through the preparation alone. The alternative arrangement, however, of having the state leader work on the project for a period in Washington, has as much in its favor.

4. The utilization of the results of research.

The general aspects of the problem of utilization of the results of research will be handled in a later section. Let suffice here a few remarks on the cooperative aspects of it. The most significant example in this sphere of activity in late years has been the joint effort in promulgating the federal outlook report. The details of this are too familiar to need discussion. Each year the states are taking hold of this extension project in more earnest. The federal Bureau needs more men whom it can send out to participate in the state and regional conferences in which the general program is adapted to local conditions and expanded into more detail as is possible for a smaller area. The regional extension programs that have been worked out at conferences between state and federal extension workers, with research and subject-matter men assisting in council, represent the cooperative mode of attack on the more long-time aspects of the problem. The federal Bureau's sending out representatives to sit with those of local departments of economics in important local conferences is another type of cooperation in this field.

Dr. E. W. Allen, Chief of the Office of Experiment Stations, was asked to review this discussion of cooperative procedure.
Most of his comments have been incorporated in the text; but the following need quoting exactly:

"Cooperation is only promising and profitable among those who believe in it and are willing to join hands in a cooperative spirit. A cooperative enterprise seeks to bring these together; some will never conform, either in letter or spirit. They are better off outside."

"Thus far our cooperation is not very effective. Unless there is a plan with definite purposes to be served by a conference, it will become mainly a talk-fest. Talking things over will not suffice. Something should be crystallized out of a conference which will make the cooperation more effective."

"Cooperation in research is not designed to stifle individual initiative or subordinate workers to mere routine. Participants ought to have an individual part in the research - not to be working for some one who is to do the research and interpretation."

Overhead organization as affecting research in agricultural economics:*

1. Scheme of organization.

The subject of economic organization of the land-grant institutions in the broad sense of departmentalization and lines of authority and responsibility, and interrelationships between divisions, is one meriting careful analysis. But it is out of place here except as it relates to the research in agricultural economics. All these institutions of course have three lines of work - teaching, research and extension; and it is proper to say that these are the three main functional departments in each institution. If for no other reason, the sources of funds are sufficiently distinct so that separation of the work along these lines becomes necessary. But here the uniformity stops. Table I shows 5 different schemes of overhead direction of these three fields of work. First are 8 states in which one person is head of all three lines of work.
This really means that the head officer of the institution, usually called "dean and director" in such cases, handles the major administrative details of each of these lines of work. He probably has an assistant dean of the college, and an assistant or vice director of extension and of research to look after smaller administrative details. There are twenty in which a separate person is called "director of extension" and handles all the details of administration of the institution, subject, of course, to general supervision of the superior officer of the institution, who is usually either a president or the dean and director of the other two fields. In 3 institutions it is the dean of the college who is given this more distinct status; in 3, the director of the station. Finally, in 14 institutions, all three divisions have distinct heads, who of course are subject to the superior officer of the institution, a president or dean, who is consulted before any major decision is made.

* By G. W. Forster, North Carolina Agricultural College.

Even this classification does not provide for all the differences in detail. Minnesota, for example, has a dean of the college and a director of short courses. Many of the above differences are also merely paper differences. A person called director or dean in one institution may have no more responsible position than an assistant director or dean in another.

Table I. - Relation of Overhead Organization to Combination of Functions in Economics Departments.

One officer (dean of college, director of station, and director of extension)
    States:                            Ark., Del., Fla., Ida., Ill., Ky., Penna., Wis.
    Research, teaching and extension:  Fla.*, Ill., Ky., Wis.
    Research and teaching:             Ark., Del., Fla.*, Penna.
    Not classed:                       Ida.

Two officers: (1) dean of college and director of station; (2) director of extension
    States:                            Ala., Ariz., Calif., Iowa, Kans., La., Md., Mass., Mich., Minn.,
                                       Mo., Mont., Nebr., N. J., Okla., S. C., Vt., Wash., W. Va., Wyo.
    Research, teaching and extension:  Ala., Iowa, Md., Mass.*, Mo., Mont., Nebr., N. J., S. C., Wash.
    Research and teaching:             Calif., Minn.*, Okla., W. Va., Wyo.
    Research and extension:            Vt.
    Research:                          Kans., Mich.
    Not classed:                       Ariz.**, La.**

Two officers: (1) director of station and director of extension; (2) dean of college
    States:                            Ind., N. H., Utah
    Research, teaching and extension:  Ind., N. H.
    Research and teaching:             Utah

Two officers: (1) dean of college and director of extension; (2) director of station
    States:                            Me., N. C., R. I.
    Research and teaching:             Me., N. C.
    Not classed:                       R. I.***

Three officers: (1) dean of college; (2) director of extension; (3) director of research
    States:                            Colo., Conn., Ga., Miss., Nev., N. Mex., N. Y., N. Dak., Ohio,
                                       Ore., S. Dak., Tenn., Texas, Va.
    Research, teaching and extension:  Colo., Conn., N. Y., Ohio, Va.
    Research and teaching:             Ore., S. Dak., Tenn., Texas*
    Research and extension:            N. Dak.
    Research:                          Miss.
    Not classed:                       Ga.***, Nev., N. Mex.

*Separate departments of Agricultural Economics and Farm Management.
**No research.
***No department.

The second part of Table I attempts to show the functions assigned to the economics departments and whether there is any connection between this and the overhead set-up. It appears that there are 21 economics departments which handle the work in the three lines, teaching, research and extension, subject to the general direction of the deans and directors. This direction is necessarily much more detailed in the case of the extension work, in many cases covering nearly everything except the supplying of the subject-matter to be extended. It will be noted that there are only two other states in which economics departments handle extension work, Vermont and North Dakota. This means that in the remaining 16 states in the second column the extension division has its own staff for the economics work. There are 16 other departments which have research and teaching combined.
All these numbers count farm management, marketing and the like as separate departments where they are so organized. In 36 states the economic research is all in one department. Only three stations have economic research in a separate department from the teaching.

From the foregoing, it would appear that the peculiar organization of institutions has had little or no effect on the way in which the department of agricultural economics is organized. In those states which have but one executive officer in charge of all lines of activities, the agricultural economics department in four is administering teaching, research and extension. In those institutions which have two officers, there were 12 in which the department of agricultural economics administered research, teaching and extension.

In institutions in which one dean or director is directing all three lines of work, and the economics department handles all three functions, the problem of lines of authority and responsibility is very simple. The only question is when to call upon the assistant dean or director in each field and when to call upon the chief officer. Probably most matters go first to the assistants. In four cases, Arkansas, Delaware, one of the economics departments in Florida, and Pennsylvania, the extension work is outside of the economics department. Then the administrative problem is to keep the economics extension in line with the teaching and research; but since the chief officer has all three under his direction, this can no doubt be accomplished, although it may prove difficult.

In the 14 institutions where there are three separate functional heads, the 5 departments of economics having all three lines of work have to keep clearly in mind the difference between the three functions and the relations between them, and know when to call upon each head. The analysis which this entails is extremely valuable from an organization standpoint.
The three coordinate heads probably watch closely how well their functions are being performed within each department. In 4 of these 14 institutions, the extension department has its own staff in economics, and in another case extension is combined with research. With a separate extension director with an extension staff in economics of his own, the difficulty of keeping the two in line and the extension work on a high plane so far as subject-matter is concerned has proved well-nigh insurmountable in the usual practice.

Any kind of a two-officer arrangement is some kind of a combination of the two foregoing. In only 9 of the 26 states with such an arrangement does the extension division have a separate economics staff, and in 3 of these states the director of extension is also director of research or dean of the college.

In the few cases in which research and extension in economics are in separate departments, the ranking member of the staff is likely to be only a sort of nominal head, most details being handled directly between the staff members and the director. A condition resembling this often obtains even when research and extension are in the same department with teaching. The military type of organization in which the staff members deal only with the head of the department, and the department head only with one superior officer, or one for each line of work, is not carried out rigorously in most institutions. The various members of the staff may in fact not negotiate many sorts of details with the head of the department, but with one or more of the three functional executives, depending upon the nature of the work. That is to say, in case a staff member is conducting research he may go directly to the director of the experiment station for consultation; or on the other hand, if he is engaged in teaching he may go directly to the dean of instruction, and if engaged in extension work he may be responsible to the director of extension for many aspects of his work.
It is always assumed that the department head is informed as to what is going on.

No doubt the type of organization affects the research work in many ways which cannot very well be reduced to chart or printed form. There are certain evident advantages of an organization which has one executive officer who has achieved a standing in the educational world. Such an officer can exert a greater influence on the nature of research and also coordinate it to a better advantage than two or more executives all interested in promoting their own work, perhaps without due regard to other but related types of work.

As to the matter of functional relationships and departmental organization, one of the deans who has had wide experience has this to say:

"I am strongly of the opinion that the closely knit type of organization is the most desirable. I have worked under both forms in different institutions and certainly feel that no subject-matter department can be a functioning department unless it has some stronger tie knitting it together than the tie of subject-matter. It is my considered judgment that most subject-matter departments headed by a chairman who gets all of his salary from one subdivision simply cannot function in the best possible way, for where the treasure is, there is the heart also."

The type of organization of the research agency probably affects also the departmental program of research. The recent survey of the economic research in agriculture conducted by the Social Science Research Council showed that it was uncommon to find a unified program of research covering the entire field of agricultural economics and indicating the relationship of the various projects and functions to each other. While this might not be traced entirely to the type of organization, yet there is reason to believe that a closely knit organization would realize the need for a comprehensive program.
The type of organization no doubt affects the relationship between research in economics and other fields.

The foregoing discussion has been entirely in terms of land-grant institutions. The same principles will apply to any other research organization that is departmentalized.

2. Allocation of funds to economics research.

It has been recognized for some time that agricultural economics research has not been adequately financed. Unfortunately the land-grant institutions, while in many instances realizing the importance of this type of work, were not in a position to support it financially. One difficulty arose with the limitation placed by the federal administration upon the federal funds made available by the original acts establishing the land-grant colleges. The Hatch and Adams Acts were for the most part interpreted as referring to natural science research. The stations have therefore had to depend upon state funds or other funds for the promotion and development of agricultural economics research. And as these funds have been rather meagre, the growth of the work in this field has been seriously retarded. Lack of properly trained personnel has of course been another limiting factor; but personnel tends to develop as funds become available. Fortunately, however, the Purnell Act passed by Congress in 1925 has made available funds which can be used wholly for economics and sociology if the states so desire. One barrier, then, which has so long interfered with the development of this work has been removed.

As the Purnell fund has been available for the past two years, it is interesting to note its effect upon the amount of funds being devoted to agricultural economics research. In 1924-5, the year prior to the passage of this Act, the state institutions spent $355,839 in this field. Most of this came out of state budgets or out of the budget of the Bureau of Agricultural Economics.
In 1925-6 the states spent $608,000 for agricultural economics research, an increase of $252,161 or of 70.8 percent. The total increase in federal funds received was $960,000. Of this, $707,839, or 74 percent, except the small amount that went to home economics and rural sociology, went to old-line research or was otherwise absorbed in institution budgets. In the year following, the funds for this field were further augmented, the total being $768,782, or an increase of 116.0 percent over 1924-5, and 26.4 percent over 1925-6. The increase in federal funds to the 48 institutions was $480,000, of which $311,219, or 65 percent, went elsewhere than to agricultural economics research.

One of the ways that the increase in federal funds is absorbed appears in Table II, which shows that during the period from 1924-5 to 1926-7 there was a disappearance of state or other federal and miscellaneous funds of $54,745; that is, when the Purnell funds actually spent in 1926-7 are added to the amount spent in 1924-5, the total is $823,527 as against the actual outlay of $768,782. From this it would appear that state support to agricultural economics research has been withdrawn as the Purnell funds have been increased. The actual displacement has been much more than these figures indicate. In Ohio, for example, a special state appropriation for research in agricultural economics swelled the total of state support. In institutions where teaching and research are done in the same department, the staff has had more time for research with the reduced undergraduate enrollments since 1925; and this has resulted in an apparent increased allocation of expenditure of state funds to research, although state support has been actually decreased.

One of the most obvious ways in which state support is replaced by Purnell support is to pay the salary increases of those doing both research and teaching entirely out of Purnell funds.
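The displacement reckoning described above can be checked with a short calculation. The sketch below is only an illustration and not part of the original study; the function and constant names are invented for it. The dollar figures are the totals of Table II as best they can be read from the source, and the Purnell total is an assumption, inferred as the figure that makes the totals row add up.

```python
# Totals in dollars, read from the totals row of Table II and from the text.
# ASSUMPTION: the Purnell total is inferred so that the totals row adds up;
# all names here are hypothetical and exist only for this illustration.
FUNDS_1924_25 = 355_839    # all funds spent in this field, 1924-5
FUNDS_1925_26 = 608_000    # all funds spent in this field, 1925-6 (round figure from the text)
FUNDS_1926_27 = 768_782    # all funds spent in this field, 1926-7
PURNELL_1926_27 = 467_688  # Purnell funds spent in this field, 1926-7


def percent_increase(old: float, new: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return 100.0 * (new - old) / old


def displaced_support(base: int, purnell: int, actual: int) -> int:
    """Support from other sources that disappeared.

    If earlier support had been kept up and the Purnell money added on top,
    the outlay would have been base + purnell; the shortfall of the actual
    outlay from that sum is the displacement.
    """
    return (base + purnell) - actual


if __name__ == "__main__":
    print(displaced_support(FUNDS_1924_25, PURNELL_1926_27, FUNDS_1926_27))
    print(round(percent_increase(FUNDS_1924_25, FUNDS_1926_27), 1))
    print(round(percent_increase(FUNDS_1925_26, FUNDS_1926_27), 1))
```

On these figures the displacement comes to $54,745, and the increases to 116.0 and 26.4 percent, in agreement with the text. The same function applied to 1924-5 and 1925-6 gives about 70.9 percent, where the text, which appears to drop rather than round the last figure, gives 70.8.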
But state support of agricultural economics budgets has been actually reduced absolutely in many states. Up to 1928, nearly two thousand dollars of state support had been taken away at Minnesota. Other states can show a worse record.

Table II. - Displacement of State Support by Purnell Funds in Agricultural Economics Research from 1924-5 to 1926-7.

(1)  (2)        (3)        (4)        (5)        (6)        (7)
No.  State      Funds in   Purnell    Total,     Funds      Decrease or increase
                1924-5     funds      (3) and    1926-7     in support from
                           1926-7     (4)                   other sources
 1   Ala.           -        7,400      7,400      7,500    +    100
 2   Ariz.          -          -          -          -           -
 3   Ark.         1,100     18,450     19,550     19,550         -
 4   Calif.      20,000     15,892     35,892     27,000    -  8,892
 5   Colo.       10,000     12,200     22,200     13,700    -  8,500
 6   Conn.        6,500     16,000     22,500     16,000    -  6,500
 7   Del.           -        5,800      5,800      5,800         -
 8   Fla.           -       16,050     16,050     16,050         -
 9   Ga.          4,000        -        4,000      4,200    +    200
10   Ida.           -        5,594      5,594      5,807    +    213
11   Ill.        23,863     23,955     47,818     53,985    +  6,167
12   Ind.         9,887      3,800     13,687     15,600    +  1,913
13   Iowa        32,800      5,590     38,390     40,240    +  1,850
14   Kans.       16,500     12,891     29,391     16,450    - 12,941
15   Ky.         10,190     17,740     27,930     18,565    -  9,365
16   La.            -          -          -          -           -
17   Me.            -       11,500     11,500     11,500         -
18   Md.          1,200      6,000      7,200      7,800    +    600
19   Mass.        3,320     12,840     16,160     12,990    -  3,170
20   Mich.        5,600     15,000     20,600     20,200    -    400
21   Minn.       24,189     12,523     36,712     40,869    +  4,157
22   Miss.        5,000      6,250     11,250      6,250    -  5,000
23   Mo.          9,100      4,480     13,580     12,880    -    700
24   Mont.        7,480      8,100     15,580     19,000    +  3,420
25   Nebr.        7,700      8,555     16,255     14,330    -  1,925
26   Nev.           -       11,200     11,200     11,200         -
27   N. H.          900      8,761      9,661      8,761    -    900
28   N. J.        5,500      8,040     13,540     14,020    +    480
29   N. Mex.        -        8,700      8,700      8,700         -
30   N. Y.       47,000     13,400     60,400     60,400         -
31   N. C.        4,000      5,383      9,383      7,183    -  2,200
32   N. Dak.     19,100      8,850     27,950     22,775    -  5,175
33   Ohio         2,500     11,350     13,850     27,400    + 13,550
34   Okla.        4,100     12,400     16,500     13,300    -  3,200
35   Ore.         2,000     11,500     13,500     14,125    +    625
36   Penna.         -       10,000     10,000     10,437    +    437
37   R. I.          -        6,100      6,100      6,100         -
38   S. C.        4,500     13,000     17,500     15,500    -  2,000
39   S. Dak.     11,000     15,630     26,630     15,630    - 11,000
40   Tenn.          -        2,000      2,000      2,450    +    450
41   Texas        8,000     10,900     18,900     21,950    +  3,050
42   Utah         4,100      7,400     11,500     13,600    +  2,100
43   Vt.          1,500      9,000     10,500      9,000    -  1,500
44   Va.          8,000     15,000     23,000     15,000    -  8,000
45   Wash.        2,365      9,214     11,579      9,814    -  1,765
46   W. Va.       4,345      6,500     10,845      7,250    -  3,595
47   Wis.        27,000      5,550     32,550     36,720    +  4,170
48   Wyo.         1,500     11,200     12,700     11,200    -  1,500
     Total      355,839    467,688    823,527    768,782    - 54,745

Table III classifies the states according to the amount of funds spent on agricultural economics research in 1926-7. The picture is somewhat confused by the fact that there are two departments of agricultural economics in some states. In spite of the passage of the Purnell Act giving federal funds to all states regardless of size, there is still a decidedly unequal distribution of financial support for agricultural economics research. Comparison of the last two columns in Tables IV and V shows that changes have occurred in this respect. First of all, there was an increase from 36 to 46 states doing research in this field. In 1924-5, 33.3 percent of the states spent 8.0 percent of all the funds spent in the 48 institutions on agricultural economics research; in 1926-7, this same percentage of the states spent about 14 percent of the total. For half of the states, the comparison at the two periods would be 16 and 27 percent; for two-thirds of the states, 29 and 41 percent; for 80 percent of the states, 45 and 57 percent. There is still a pronounced concentration in a few states, and a pronounced opposite of this in many others.

Table III. - Classification of states according to the amount of funds used in agricultural economics research in 1926-7.

Amount              States                                                       No. of states
Less than $5,000    Arizona*, Ga., La.*, Tenn.                                        4
$5,000-9,999        Ala., Del., Ida., Md., Miss., N. H., N. Mexico, N. C.,
                    R. I., Vt., Wash., W. Va.                                        12
$10,000-14,999      Colo., Me., Mass., Mo., Nebr., Nev., N. J., Okla., Ore.,
                    Pa., Utah, Wyo.                                                  12
$15,000-19,999      Ark., Conn., Fla., Ind., Kans., Ky., Mont., S. C.,
                    S. Dak., Va.                                                     10
$20,000-24,999      Mich., N. Dak., Texas                                             3
$25,000-29,999      Calif., Ohio                                                      2
$30,000-34,999                                                                        0
$35,000-39,999      Wis.                                                              1
$40,000-44,999      Iowa, Minn.                                                       2
$45,000-49,999                                                                        0
$50,000-54,999      Ill.                                                              1
$55,000-59,999                                                                        0
$60,000-64,999      New York                                                          1

* No funds.

Table IV. - The Distribution of Support for Agricultural Economics Research, 1924-25. (36 states)

State       Funds       Funds,        States,
            1924-5      cumulative    cumulative
                        percent       percent
N. H.       $  900         .25          2.78
Ark.          1100         .56          5.56
Md.           1200         .90          8.33
Vt.           1500        1.32         11.11
Wyo.          1500        1.74         13.89
Ore.          2000        2.30         16.67
Wash.         2365        2.97         19.44
Ohio          2500        3.67         22.22
Mass.         3320        4.60         25.00
N. C.         4000        5.73         27.78
Ga.           4000        6.85         30.55
Okla.         4100        8.00         33.33
Utah          4100        9.16         36.11
W. Va.        4345       10.38         38.89
S. C.         4500       11.64         41.66
Miss.         5000       13.05         44.43
N. J.         5500       14.59         47.22
Mich.         5600       16.17         50.00
Conn.         6500       17.99         52.78
Mont.         7480       20.09         55.55
Nebr.         7700       22.26         58.33
Texas         8000       24.51         61.10
Va.           8000       26.75         63.88
Mo.           9100       29.31         66.66
Ind.          9887       32.09         69.44
Colo.        10000       34.90         72.22
Ky.          10190       37.76         75.00
S. Dak.      11000       40.85         77.78
Kans.        16500       45.49         80.55
N. Dak.      19100       50.85         83.33
Calif.       20000       56.48         86.10
Ill.         23863       63.18         88.89
Minn.        24189       69.97         91.66
Wis.         27000       77.57         94.41
Iowa         32800       86.78         97.23
N. Y.        47000      100.00        100.00

Table V. - The Distribution of Support for Agricultural Economics Research, 1926-7. (46 states)

State       Funds       Funds,        States,
            1926-7      cumulative    cumulative
                        percent       percent
Tenn.       $ 2450         .32          2.17
Ga.           4200         .86          4.35
Del.          5800        1.61          6.52
Idaho         5807        2.37          8.69
R. I.         6100        3.17         10.86
Miss.         6250        3.98         13.04
N. C.         7183        4.91         15.21
W. Va.        7250        5.85         17.38
Ala.          7500        6.83         19.56
Md.           7800        7.84         21.73
N. Mex.       8700        8.98         23.90
N. H.         8761       10.14         26.07
Vt.           9000       11.28         28.25
Wash.         9814       12.55         30.42
Pa.          10437       13.92         32.59
Nev.         11200       15.37         34.77
Wyo.         11200       16.83         36.94
Maine        11500       18.33         39.11
Mo.          12880       19.99         41.29
Mass.        12990       21.69         43.46
Okla.        13300       23.42         45.63
Utah         13600       25.18         47.80
Colo.        13700       26.95         49.98
N. J.        14020       28.79         52.15
Ore.         14125       30.62         54.32
Nebr.        14330       32.49         56.50
Va.          15000       34.44         58.67
S. C.        15500       36.45         60.84
Ind.         15600       38.48         63.01
S. Dak.      15630       40.51         65.19
Conn.        16000       42.59         67.36
Fla.         16050       44.68         69.54
Kans.        16450       46.83         71.71
Ky.          18565       49.23         73.88
Mont.        19000       51.70         76.05
Ark.         19550       54.24         78.22
Mich.        20200       56.87         80.40
Texas        21950       59.72         82.57
N. Dak.      22775       62.68         84.74
Calif.       27000       66.19         86.92
Ohio         27400       69.75         89.09
Wis.         36720       74.53         91.26
Iowa         40240       79.75         93.44
Minn.        40869    [illegible]      95.61
[The rows for Ill. and N. Y. are illegible in the source.]

Tables VI and VII introduce some other aspects of the problem. First, in 1926-7, in thirty states the economics research was receiving 70 percent or over of support out of the Purnell funds. Second, in this year the economics research in thirty-one states was receiving 40 percent or less of the total Purnell funds available. If further expansion is to be made, it must apparently be out of the Purnell funds. But further expansion from the Purnell fund is becoming difficult because a large part of the funds have already gone into other lines of work. There is still time, however, to save some of the remaining increases.

Administration.*

1. Selection of research staff.

One difficulty in the selection of the research personnel arises out of the fact
that the persons finally responsible for selecting it are not acquainted with the field of agricultural economics. With new departments and new lines of work being established, the dean or director becomes largely responsible for the selection of the personnel. Unless he has had special training in economics, or has broadened his outlook by special study, he is not always in a position to pass judgment on the fitness of a candidate for the head of a department or as a member of the department's staff. In the past, most of the workers chosen were men who had been trained primarily for some other field than that of economics. And even now, many men are chosen for the research staff who are not, accurately speaking, economists.

Aside from this difficulty of obtaining qualified persons to fill available positions, the selection of the research workers is affected by the research program. Without a definite program the selection of workers is likely to be haphazard. This indiscriminate adding of workers will discredit, and no doubt has discredited, the work in this field. A well developed program of research would indicate which problems were more important in the longer run and hence indicate the type of man that should be added to the staff.

*By G. W. Forster.

Table VI. - Classification of states according to the percent the Purnell support is of the total support provided for agricultural economics research, 1926-7.

Percent Purnell     States                                                       No. of states
is of total funds
Less than 10        Arizona*, Ga., La.*                                               3
10-19.9             Iowa, Wisconsin                                                   2
20-29.9             Ind., N. Y.                                                       2
30-39.9             Minn., Mo., N. Dak.                                               3
40-49.9             Ill., Mont., Ohio, Texas                                          4
50-59.9             Calif., Nebr., N. J., Utah                                        4
60-69.9                                                                               0
70-79.9             Kans., Md., Mich., N. C.                                          4
80-89.9             Colo., Ore., S. C., Tenn., W. Va.                                 5
90-100              Ala., Ark., Conn., Del., Fla., Ida., Ky., Me., Mass.,
                    Miss., Nev., N. H., N. Mex., Okla., Pa., R. I., S. Dak.,
                    Vt., Va., Wash., Wyo.                                            21
Total                                                                                48

* No funds.

Table VII. - Classification of the states according to the percentage of the Purnell fund available for 1926-7 spent on agricultural economics research.(1)

Percent of total    States                                                       No. of states
Purnell available
Less than 10        Ariz.*, Ga., La.*, Tenn.                                          4
10-19.9             Del., Ida., Ind., Iowa, Mo., N. C., Wis.                          7
20-29.9             Ala., Ill., Md., Miss., Mont., N. H., N. J., N. Dak.,
                    R. I., Utah, W. Va.                                              11
30-39.9             Me., Nebr., Nev., N. Mex., Ohio, Ore., Pa., Texas, Vt.,
                    Wash., Wyo.                                                      11
40-49.9             Colo., Kans., Mass., Minn., N. Y., Okla., S. C.                   7
50-59.9             Calif., Conn., Fla., Ky., Mich., S. Dak., Va.                     7
60-69.9             Ark.                                                              1
70-79.9                                                                               0
80-89.9                                                                               0
90-100                                                                                0
Total                                                                                48

(1) Calculated on the basis of $30,000 available for each state.
* No funds at all.

In some of our land-grant colleges, there is a decided tendency to select the research personnel from graduates of the same department. In some institutions, it has been the practice for the department of agricultural economics to look over the records of undergraduates and offer part-time work to several of the best of these each year. This gives the head of the department an opportunity to see how these men work at definite tasks and also gives the student an opportunity to learn more of the work of the department. After the student graduates from the university, he may be offered an assistantship or instructorship. When a position of assistant professor is to be filled, it may be filled by further selection among these assistants or instructors. This policy has certain obvious advantages, but it should not be pushed too far. The dangers from it are relieved somewhat in the case of large institutions where a great many of the graduate students are drawn from different parts of the country and from other institutions. But even here, it produces narrowness and lack of vision if carried too far.
In the case of smaller institutions, where the individual does not get an opportunity to work under a number of able men, the practice is at its worst.

2. Provision for training graduate students.

A survey of the states reveals that seventeen out of thirty-eight states reporting do not provide any financial assistance whatsoever. Fellowships and scholarships are scarce; only eleven states report scholarships or fellowships for graduate students. Assistantships were available in ten of the institutions reporting. Many of these assistantships were for part-time graduate students. There are very few stipends which permit the individuals to devote all of their time to graduate work.

In a large number of the institutions, there is little opportunity for graduate study, because the institutions are small, the departments of economics and agricultural economics are relatively weak, and the number of graduates is so few that it is not worth while to develop anything in the nature of a graduate course. Several of the states reported such things as weekly meetings to discuss problems in the field; others stated that they urged their assistants and research workers to take graduate work at other institutions, usually without pay or any kind of compensation. Others urged more study, etc. While the last thing that should be urged is that each department of economics should attempt to develop strong graduate instruction, nevertheless as much as possible should be done with the resources at hand to improve the training of the young research workers who are on the staff; and beyond this, to give them a chance to go elsewhere for further training. On the whole, the situation with respect to opportunities for graduate study is not altogether good, and the training of research workers is proceeding too slowly.

3. Sabbatical leave:

A survey of the state institutions as to the matter of sabbatical leave indicates that a large majority of the institutions do not grant any such leave whatsoever. Thirty-eight states replied to a question on this subject. There are twenty-two which do not have any provision for sabbatical leave. Five occasionally give leaves of absence, four of these without any definite plan. Twelve grant sabbatical leave under certain conditions. There is apparently no assurance, however, that a worker will be granted a leave of absence. Usually the conditions are quite rigid, and the leave is granted only in special cases with the consent of the board of trustees or the president or both. In one state (Mass.), leave of absence is forbidden. In another institution it is apparently possible to get a leave of absence by working during the summer and in this way accumulating sufficient credit so that a leave of absence may be granted on full pay (Wis.). This, however, cannot be considered a leave of absence inasmuch as the person has already put in the time necessary for it. On the whole, the situation with regard to sabbatical leave is not encouraging, nor is it likely to become so, unless some effort is put forth to get the institutions to realize more fully the necessity for granting leave for further study.

4. Term of service:

The term of service imposed upon research workers by most institutions is excessive and on the whole is bad for both the workers and the research agency. For example, in 1926-27, out of 288 workers engaged in economic research in agriculture, 204, or 71 percent, were employed on the eleven or twelve-month basis. Employment on this basis restricts the worker very much, especially with respect to obtaining additional training and experience. If the worker wishes to pursue graduate studies or acquire additional training, he must obtain leave of absence from the institution or resign his position.
This is especially serious in view of the fact that a large majority of the institutions in which research is being conducted cannot provide resident graduate work of an outstanding character. Furthermore, it is often impossible for the workers to take graduate work even if the institution is in a position to give such training; for in order to do so he is obliged to work on a part-time basis and thus reduce his salary, which is in most cases already too low.

Frequently there is discrimination against the research worker in favor of those who are engaged in teaching exclusively. In many institutions the teaching staff is on a nine-month basis. This gives teachers ample opportunity to attend summer school, and thus over a period of years to obtain sufficient graduate credit to obtain their advanced degree. Apparently, it has been recognized that in the case of teachers, further study and training is necessary. Similar recognition should be given to research workers.

There is need also for a more definite plan of promotion of research workers. Complaint is often heard among research workers in this field that there is no opportunity for advancement. Provision for this would not only increase the morale of the workers but also aid in retaining the services of the young men with ability.